jackson petty

@jowenpetty
33 Followers
117 Following
30 Posts
the passionate shepherd, to his love • ἀρετῇ • מנא הני מילי
website: https://jacksonpetty.org
twitter: https://x.com/jowenpetty
bluesky: https://bsky.app/profile/jacksonpetty.org
Are you coming to EMNLP 2024? Come say hi! I’ll be presenting this work at BlackboxNLP on Friday, and will be around all week to chat! DM to meet up, let’s grab lunch!
@KiwiHellenist regarding your video on Odyssey 1.44--95, what is the (potential?) connection between Phoebus and foxes?
We now turn to live coverage of the @nytimes editors’ room
@caseyliss do you know how to file a bug report against Apple’s fonts (like San Francisco)? I’ve no idea what “problem area” should be selected on the Feedback…”Core Text”?
Hello NLP researchers around the globe! All ACL major conferences (@aclmeeting, @eaclmeeting, @aaclmeeting, and @emnlpmeeting) now have an account here. Please spread the word! #NLProc
@atpfm @siracusa isn’t Apple’s lack of Blink the only thing standing between us and a Chrome-monopoly dystopia where every iOS app is Electron bloatware? At least the status quo requires web developers to not completely ignore mobile Safari.

How do large language models like #ChatGPT respond to ‘questionable’ questions like “When did Mark Zuckerberg invent Google?” How can we tell?

A new paper by @najoung, Phu Mon Htut, Sam Bowman, and me explores this!

📜 https://arxiv.org/abs/2212.10003

🧵 https://sigmoid.social/@najoung/109549817894210280

(QA)$^2$: Question Answering with Questionable Assumptions

Naturally occurring information-seeking questions often contain questionable assumptions -- assumptions that are false or unverifiable. Questions containing questionable assumptions are challenging because they require a distinct answer strategy that deviates from typical answers for information-seeking questions. For instance, the question "When did Marie Curie discover Uranium?" cannot be answered as a typical "when" question without addressing the false assumption "Marie Curie discovered Uranium". In this work, we propose (QA)$^2$ (Question Answering with Questionable Assumptions), an open-domain evaluation dataset consisting of naturally occurring search engine queries that may or may not contain questionable assumptions. To be successful on (QA)$^2$, systems must be able to detect questionable assumptions and also be able to produce adequate responses for both typical information-seeking questions and ones with questionable assumptions. Through human rater acceptability on end-to-end QA with (QA)$^2$, we find that current models do struggle with handling questionable assumptions, leaving substantial headroom for progress.

arXiv.org

🦷 Another preprint 🦷
Information-seeking Qs often contain questionable assumptions that models should be robust to. "When did Marie Curie discover Uranium?" is an example. We propose (QA)^2, a test set evaluating the capacity to handle such Qs. (1/n)

https://arxiv.org/abs/2212.10003

@atpfm when can we expect @marcoarment to announce “Overcast Talk”, a ground-breaking new feature that uses the power of the neural engine to strip out the vocal tracks from podcasts, allowing users to “talk along” to their favorite shows?
Is the Hebrew זקף ‘to straighten’ related to the Arabic root ṯ-q-f, as in ثقافة ‘culture’? I recall learning once that the Arabic root also originally had a sense of straightening (e.g. an arrow shaft), but I would expect שׁ initially, not ז, from PS *ṯqf.