Maria Ryskina

261 Followers
192 Following
31 Posts
Postdoc @ MIT: computational linguistics, #NLProc (now with cognition and 🧠). Organizer @ Queer in AI. She/they
Website: https://ryskina.github.io
Twitter: https://twitter.com/maria_ryskina

As a linguist in #NLP-land, I sometimes hear things like, “you must hate #LLMs!” or “don’t #LLMs prove linguistics wrong?” This reasoning is based on a couple of false premises:

1. Linguistics is a set of propositions.
2. These propositions are basically those articulated by Noam Chomsky in the 1950s.

Linguistics is a field of science—science as applied to the language domain—and cannot be falsified any more than biology could be falsified. And most linguists are not Chomskyans. 1/

Exciting personal update!! I am thrilled to share that I defended my PhD 🎓 🥳

Next steps: I recently started a new position at the Allen Institute for AI as a Young Investigator.

Excited to work with Yejin Choi and the Mosaic team at AI2, as well as to join the vibrant #NLP community in Seattle. Please DM me if you are here and would like to chat about things research-related or otherwise! 🗻✨☕️

Got my library card today! The move to Cambridge is officially complete.

Ted Chiang argues that #ChatGPT and other AI text generators essentially create blurry JPEGs of all the text on the web … and covers the implications for writers.

What I like about this essay: no “Art is over” or “AI is our savior” sentiments. Just clear, lucid thinking.

https://www.newyorker.com/tech/annals-of-technology/chatgpt-is-a-blurry-jpeg-of-the-web

First human subject experiment! 🧠
Ambiguous construction of the day, East Coast edition

I'm very excited for #NormConf, a conference "about all the stuff that matters in data and machine learning, but doesn't get the spotlight."

Happening virtually this Thursday!

The talks look sooo good! 🤩
https://normconf.com/#schedule

✨ Julia Silge: How many folds is too many? Efficient simulation for everyday ML decisions

✨ Luca Belli: Geriatric data science: life after senior

✨ Lynn Cherny: NLP tips and tricks

✨ Sean Taylor: When not to use SQL

✨ Chris Albon: Don't do invisible work


🔔#hiring alert: I will be recruiting PhD students to my new lab at #UWMadison this cycle! (tentatively: the Social Interaction Lab). We are an inclusive and interdisciplinary #CogSci group at the rich intersection of #language, #socialcognition, #neuroscience, and #computation. Please boost and pass along to anyone who may be interested!

🗓️deadline: December 1

🔗 https://psych.wisc.edu/graduate-program/application-process/

👋 what we do: https://rxdhawkins.com/publications/

The time we take to read a word depends on its predictability, quantified as its surprisal. However, we only know how surprising a word is after we see it. Our new paper investigates whether we anticipate words' surprisals to allocate reading times in advance :)

Joint work with Clara Meister, Ethan Wilcox, @roger_p_levy, @rdc
Paper: https://arxiv.org/abs/2211.14301
Code: https://github.com/rycolab/anticipation-on-reading-times

On the Effect of Anticipation on Reading Times

Over the past two decades, numerous studies have demonstrated how less predictable (i.e. higher surprisal) words take more time to read. In general, these previous studies implicitly assumed the reading process to be purely responsive: readers observe a new word and allocate time to read it as required. These results, however, are also compatible with a reading time that is anticipatory: readers could, e.g., allocate time to a future word based on their expectation about it. In this work, we examine the anticipatory nature of reading by looking at how people's predictions about upcoming material influence reading times. Specifically, we test anticipation by looking at the effects of surprisal and contextual entropy on four reading-time datasets: two self-paced and two eye-tracking. In three of four datasets tested, we find that the entropy predicts reading times as well as (or better than) the surprisal. We then hypothesise four cognitive mechanisms through which the contextual entropy could impact RTs -- three of which we design experiments to analyse. Overall, our results support a view of reading that is both anticipatory and responsive.
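
For intuition, here is a minimal sketch of the two predictors, assuming HuggingFace transformers with GPT-2 as a stand-in language model (the paper's actual models, datasets, and regression analyses differ; the example sentence is made up):

```python
# Minimal sketch: per-token surprisal and contextual entropy from a causal LM.
# GPT-2 is an illustrative stand-in, not the model used in the paper.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

text = "The old man the boats."  # made-up garden-path example
ids = tokenizer(text, return_tensors="pt").input_ids

with torch.no_grad():
    log_probs = torch.log_softmax(model(ids).logits, dim=-1)

for t in range(1, ids.shape[1]):
    dist = log_probs[0, t - 1]                   # p(. | w_<t) over the vocabulary
    surprisal = -dist[ids[0, t]].item()          # -log p(w_t | w_<t): needs w_t
    entropy = -(dist.exp() * dist).sum().item()  # H(W_t | w_<t): available before w_t
    print(f"{tokenizer.decode(ids[0, t].item())!r}  "
          f"surprisal={surprisal:.2f}  entropy={entropy:.2f} (nats)")
```

The asymmetry the abstract turns on falls out directly: the contextual entropy is computable before a word is observed, while its surprisal requires seeing it.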


Draft #course #outline: #Subword Units for #Speech and #Language #Technologies

Module 1: #Morphology

Formal and functional operations. Typology. Item-and-Arrangement (IA) morphology. Morphological segmentation. Lexemes and Word-and-Paradigm (WP) analysis. Lemmatization, reinflection, paradigm completion.
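
As a concrete hook for the last topic, a toy sketch of paradigm completion by string analogy (the example paradigm and cell labels are made up; real systems must handle stem alternations and allomorphy):

```python
# Toy paradigm completion by string analogy: extract suffixes from one
# example paradigm, then fill in the paradigm of an unseen lemma.
from os.path import commonprefix

example = {"lemma": "walk", "3SG": "walks", "PST": "walked", "PROG": "walking"}

def complete_paradigm(lemma: str, example: dict) -> dict:
    stem = commonprefix(list(example.values()))  # shared stem of all forms
    suffixes = {cell: form[len(stem):] for cell, form in example.items()}
    return {cell: lemma + sfx for cell, sfx in suffixes.items() if cell != "lemma"}

print(complete_paradigm("jump", example))
# {'3SG': 'jumps', 'PST': 'jumped', 'PROG': 'jumping'}
```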

Module 2: #Orthography

Descriptive phonetics, IPA, and phonemes. Unicode. Typology of orthographies. Tokenization. Grapheme-to-phoneme (G2P) and phoneme-to-grapheme (P2G) conversion.
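
For the tokenization unit, a minimal sketch of byte-pair-encoding merge learning on a made-up toy vocabulary (real tokenizers add byte-level fallback, frequency-weighted counts, and special tokens):

```python
# Minimal BPE sketch: greedily merge the most frequent adjacent symbol pair.
from collections import Counter

corpus = ["low", "lower", "lowest", "newer", "newest"]  # toy vocabulary
words = [list(w) + ["</w>"] for w in corpus]            # word-final marker

def most_frequent_pair(words):
    pairs = Counter()
    for w in words:
        pairs.update(zip(w, w[1:]))
    return pairs.most_common(1)[0][0]

merges = []
for _ in range(6):  # learn 6 merges (made-up budget)
    a, b = most_frequent_pair(words)
    merges.append((a, b))
    for w in words:
        i = 0
        while i < len(w) - 1:
            if w[i] == a and w[i + 1] == b:
                w[i:i + 2] = [a + b]  # merge the pair in place
            else:
                i += 1

print(merges)  # learned merge rules
print(words)   # subword segmentation of the toy vocabulary
```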

Module 3: #Acoustic #Phonetics

General acoustics. Digital signal processing (DSP). Acoustic analysis. Applications of phonetics.
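
For the acoustic-analysis unit, a minimal sketch of spectrogram computation with scipy on a synthetic two-formant signal (all frequencies and window parameters here are made up for illustration):

```python
# Minimal acoustic-analysis sketch: spectrogram of a synthetic vowel-like
# signal built from two sine "formants".
import numpy as np
from scipy.signal import spectrogram

fs = 16_000                    # sample rate (Hz)
t = np.arange(0, 0.5, 1 / fs)  # 0.5 s of audio
# crude stand-in for an [a]-like vowel: F1 ~ 700 Hz, F2 ~ 1200 Hz
x = np.sin(2 * np.pi * 700 * t) + 0.5 * np.sin(2 * np.pi * 1200 * t)

f, frames, Sxx = spectrogram(x, fs=fs, nperseg=512, noverlap=256)

# the peak frequency per frame should sit near the stronger formant (700 Hz)
print(f[Sxx.argmax(axis=0)][:5])
```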