Adina Williams

351 Followers
188 Following
8 Posts
Linguistics, NLP, Cognitive Science. Research Scientist at FAIR (Meta, NYC), formerly at NYU Linguistics.
Employer: FAIR, Meta
Google Scholar: https://scholar.google.com/citations?hl=en&user=MUtbKt0AAAAJ

LLMs' Syntax: Grammaticality uncanny valley?

Something is off about this Bingbot "utterance".
It feels maybe ambiguous between 'destroying whatever Bing wants to destroy' and 'destroying whatever Bing wants to have'?

Ellipserati pls help! Is the 'whatever' quantificational / Is this ACD?

So, who else is at #AAAI2023? Looking forward to sharing our work on modelling information change in science communication in my keynote on Saturday (16:45, ABC) @dustin
Also, Karolina will present our paper on intrinsic probing tomorrow (21:45, 149AB): https://underline.io/events/380/sessions/14503/lecture/68383-7881-a-latent-variable-model-for-intrinsic-probing ; https://arxiv.org/abs/2201.08214

Happy to share our new paper “Language model acceptability judgements are not always robust to context” https://arxiv.org/abs/2212.08979! We prepend several kinds of context to minimal linguistic #acceptability test pairs and find #LMs (#OPT, #GPT2) can still achieve strong performance on #BLiMP & #SyntaxGym, except in some interesting cases. 🧵 [1/7]

Joint work with @jon , @kanishka, @amuuueller, @keren fuentes, @roger_p_levy, @Adinawilliams

Language model acceptability judgements are not always robust to context

Targeted syntactic evaluations of language models ask whether models show stable preferences for syntactically acceptable content over minimal-pair unacceptable inputs. Most targeted syntactic evaluation datasets ask models to make these judgements with just a single context-free sentence as input. This does not match language models' training regime, in which input sentences are always highly contextualized by the surrounding corpus. This mismatch raises an important question: how robust are models' syntactic judgements in different contexts? In this paper, we investigate the stability of language models' performance on targeted syntactic evaluations as we vary properties of the input context: the length of the context, the types of syntactic phenomena it contains, and whether or not there are violations of grammaticality. We find that model judgements are generally robust when placed in randomly sampled linguistic contexts. However, they are substantially unstable for contexts containing syntactic structures matching those in the critical test content. Among all tested models (GPT-2 and five variants of OPT), we significantly improve models' judgements by providing contexts with matching syntactic structures, and conversely significantly worsen them using unacceptable contexts with matching but violated syntactic structures. This effect is amplified by the length of the context, except for unrelated inputs. We show that these changes in model performance are not explainable by simple features matching the context and the test inputs, such as lexical overlap and dependency overlap. This sensitivity to highly specific syntactic features of the context can only be explained by the models' implicit in-context learning abilities.
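The evaluation protocol the abstract describes can be sketched in a few lines: a model "passes" a minimal pair if it assigns a higher score to the acceptable member, and the paper's manipulation prepends the same context to both members before scoring. A minimal sketch, assuming a toy unigram scorer as a stand-in for the real models (GPT-2, OPT) used in the paper:

```python
# Minimal sketch of a targeted syntactic evaluation with prepended context.
# Assumption: a toy add-one-smoothed unigram model stands in for the LMs
# (GPT-2, OPT variants) actually evaluated in the paper.
import math
from collections import Counter

def train_unigram(corpus):
    """Return a toy sentence scorer: sum of smoothed unigram log-probabilities."""
    tokens = corpus.split()
    counts = Counter(tokens)
    total = len(tokens)
    vocab = len(counts) + 1  # +1 for unseen tokens under add-one smoothing
    def logprob(sentence):
        return sum(math.log((counts[t] + 1) / (total + vocab))
                   for t in sentence.split())
    return logprob

def judge_pair(score, acceptable, unacceptable, context=""):
    """A model 'passes' the minimal pair if the acceptable member scores higher.
    Prepending `context` to both members mirrors the paper's manipulation."""
    good = score((context + " " + acceptable).strip())
    bad = score((context + " " + unacceptable).strip())
    return good > bad

score = train_unigram("the cat sleeps the dog sleeps the cats sleep")
print(judge_pair(score, "the cat sleeps", "the cat sleep"))  # True
```

Note that a unigram scorer is context-insensitive, so prepending context cannot change its judgement; the instability the paper reports is only possible for models (like the Transformer LMs tested) whose scores depend on the preceding context.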


Genuinely infuriating story out of Northeastern University, involving spying on grad student "attendance" en masse without informing or obtaining consent: https://www.vice.com/en/article/m7gwy3/no-grad-students-analyze-hack-and-remove-under-desk-surveillance-devices-designed-to-track-them

This, I'm sorry to say, reflects an attitude I encountered **often** during my years as university faculty. The last paragraph of the piece is especially worth noting, since OF COURSE the same individuals who harbor this pernicious attitude are also the ones who regularly spread brazen anti-union propaganda. Matches my experience 100%.

(h/t @researchfairy)

‘NO’: Grad Students Analyze, Hack, and Remove Under-Desk Surveillance Devices Designed to Track Them

In October, the university quietly installed heat sensors under desks without notifying students or obtaining their consent. Students removed the devices, hacked them, and were able to force the university to stop its surveillance.

Dear colleague psychologists, Please read the preprint, "Dealing with Diversity in Psychology: Science and Ideology", by Dr. Steven Roberts, and then sign the open letter (in next toot) to ask for accountability for this racist treatment by the journal Perspectives on Psychological Science and for the EiC to resign. https://psyarxiv.com/xk4yu 1/n