I am begging AI researchers studying human impact to get much better at methodology, fast, so I stop reading halfway through these papers only to hit some ridiculous experiment design that throws the conclusions into question.
I've been burned so many times; I've learned my lesson. You really need to read each of these things carefully if you want to understand what the researchers are concluding. Reading a news article—even worse, just the headline—is at best no information, at worst disinformation.

The paper in question today is one from an Ars article that I won't link, to avoid adding to the hype.

But reading this thing is a journey. From inventing a new classification of cognition to an entirely abstract experiment design for the "Brain only" and "AI Use" control/experimental groups, the conclusions can't be taken seriously. They feel "truthy," but that's all they can be.

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6097646

@mttaggart ugh thanks for saving me some time

@mttaggart I mean... the base assumption in the tri-system seems... unlikely to hold imo. Just from the abstract: "System 3 [AI use] can supplement or supplant internal processes, introducing novel cognitive pathways."

That seems like a very generous interpretation of LLM use, that places it in a special category outside of other tool use, which is already covered by systems 1 and 2...

@nielsa Yeah I didn't buy offloading as a novel pathway on spec.

@mttaggart I didn't read the paper, but the moment the Ars article mentioned "fluid IQ" is when I thought "oh no, this is gonna be hogwash huh?".

Thanks for reading it and confirming 😮‍💨

@mttaggart the study should have included System 4 "phone a friend" and System 5 "ask the audience"

@mttaggart it is very difficult to rise above the hype and blatant disinformation on the subject. Have you listened to this podcast?

The authors are great.

https://www.dair-institute.org/maiht3k/

The Mystery AI Hype Theater 3000 Podcast

Our biweekly podcast deflates AI hype and draws attention to the real harms of the automation technologies we call "artificial intelligence".

DAIR (Distributed AI Research Institute)
@wtrmt Yep, I'm a DAIR superfan.

@mttaggart I do so love how AI "research" is comfortable with AI interpreting the results of any data collection.

such a circle jerk.

@mttaggart "we studied the effect of AI on Y. then because there's not hundreds of years of history in how to analyze results, we had an LLM do it...SCIENCE!"
@neurovagrant That's kind of a separate issue from what I'm describing. Good faith behavioral researchers just aren't thinking through what they're asking of participants, or what the design means. But I agree that practice sucks.

@mttaggart that's what you get when you let CS people do human-related stuff. Have you read usable security papers? I live this nightmare.

(Granted, lots of social scientists also have crappy methodological education, but at least they have _some_)

@odr_k4tana This toot brought to you by MBAs, but same thing

@mttaggart
It's so annoying. The study you linked has soooo many problems:

1. small sample size
2. online replication with an even smaller sample size, not included in the results for reasons unspecified
3. study questions were not published, so they cannot be reviewed
4. declaring that "cognitive surrender" is different from cognitive offloading with no formal explanation of how and why
5. does not cite the 2011 and 2018 research on the "Google effect" (reduced working memory when you know a search engine is available), which you would think is obvious prior work
6. does not do a study group of JUST chatbots taking the test to compare against the human + chatbot group
7. subjective self assessment
8. "fast," "medium" and "slow" used without definitions

and that's just from skimming it...

@mttaggart Right? It’s almost as if the whole field was bullshit.