Taylorism is a management philosophy based on using scientific optimization to maximize labor productivity and economic efficiency.

Here's the result of making the false Taylorist assumption that the output of scientific research is scientific papers—the more, faster, and cheaper, the better.

Papers are not the output of scientific research in the way that cars are the output of automobile manufacturing.

Papers are merely a vehicle through which a portion of the output of research is shared.

We confuse the two at our peril.

The entire idea of outsourcing the scientific ecosystem to LLMs — as described below — is a category error that I can scarcely begin to get my head around.

sakana.ai/ai-scientist/

"While there are still occasional flaws in the papers produced by this first version..."

Meanwhile the authors note that the output itself fails to meet standards of scientific rigor, but treat this as a minor wrinkle rather than a fundamental barrier imposed by using the wrong tool for the job.

This system literally fabricates its methods section — an act which goes beyond bad science into the realm of serious scientific misconduct. This is more than a wrinkle to be ironed out.

Scientists: We need to slow down the publication race and produce higher quality papers at a slower rate to make the literature manageable again.

Engineers: We hear you. Now every lab in the world will be able to produce hundreds of medium-quality papers (with a few mistakes in each) every week.

I do appreciate the authors' candor in detailing failure modes.

A system that makes difficult-to-catch mistakes in implementation, fails to compare quantitative data appropriately, and fabricates entire results—maybe I have high standards but I don't see this as writing "medium-quality" papers.

Here's the weird Taylorism again. The system produces work at the level of an early trainee requiring substantive supervision. This is not good ROI for producing papers.

The primary output of time invested in trainee research is the development of independent scientists—not the research papers.

In the end, how one judges this paper probably comes down to how one assesses the claim that is always used to justify this kind of work.

The authors "believe" that future versions will be
greatly improved.

Given what I know of fundamental limits to what LLMs can do, I see no reason to agree.

When I fail to do something, I either don't publish or very occasionally I publish describing that failure. When I do so, I don't pretend it was a success and promise that it'll magically get better.

@ct_bergstrom
Is the AI scientist being categorised, by the language used, as a responsible entity?

Is there an explicit statement that a human has been given authority over, and responsibility for, the actions of the AI scientist?

#AIWhereIsResponsibility

EDIT: I answered my own question.
Neither "responsibility" nor "responsible" are found in the paper.

smh

arxiv.org/abs/2408.06292

@ct_bergstrom

I appreciate your full analysis, but honestly, if I were reading the paper, I would have stopped once I realized the name was always set in small caps with some funky spacing. It reeks of marketing.

@ct_bergstrom
> 'The authors "believe" that future versions will be greatly improved.'

I'm wondering if there's a name for this effect - where a machine does something resembling a human activity, people ascribe further human qualities to it, and then assume it will advance in ability the way a human would.

See also: self-driving cars. Shuffling mindlessly around a car park? Great! It'll soon be driving perfectly.

Or not.

@coprolite9000 @ct_bergstrom
You mean The Eliza Effect?

@bornach @coprolite9000 @ct_bergstrom

> The Eliza Effect

How does that make you feel?

@coprolite9000 @ct_bergstrom I believe the closest we have is a mixture of automation bias and cargo cult science.
@coprolite9000 @ct_bergstrom I agree skepticism is in order about the pace and extent of progress. But this expectation seems reasonable to me in the abstract if we are iterating and can always use the best model to date.
For example, I can’t imagine how computers could ever get worse at playing chess than they are now, and as long as we iterate new approaches, there’s a possibility of improvement.
@Spring @coprolite9000 @ct_bergstrom
FWIW it's quite common for performance of machine learning systems to degrade rather than improve. Ensuring that performance DOES improve typically takes a lot of human work.

@FeralRobots @Spring @coprolite9000 @ct_bergstrom
And then there is the problem of an AI that has defeated top-level human players somehow still being beatable using a strategy that only an intermediate-level player would consider deploying.

https://youtu.be/l7tWoPk25yU

https://arstechnica.com/information-technology/2023/02/man-beats-machine-at-go-in-human-victory-over-ai/


@coprolite9000 @ct_bergstrom
In #Robotics, it's (reasonably) well-known that people will ascribe human cognitive abilities and intent to absolutely anything.

Roboticists are particularly bad about this (see linked paper below).

I think the optimism that it will get better is not just universal but required for publication and grant approval.

Link: https://www.sciencedirect.com/science/article/abs/pii/S0921889013001863

Perception of own and robot engagement in human–robot interactions and their dependence on robotics knowledge


@ct_bergstrom @coprolite9000 You’re thinking about “first step fallacies.” Hubert Dreyfus wrote about it some time ago https://link.springer.com/article/10.1007/s11023-012-9276-0
A History of First Step Fallacies - Minds and Machines

In the 1960s, without realizing it, AI researchers were hard at work finding the features, rules, and representations needed for turning rationalist philosophy into a research program, and by so doing AI researchers condemned their enterprise to failure. About the same time, a logician, Yehoshua Bar-Hillel, pointed out that AI optimism was based on what he called the “first step fallacy”. First step thinking has the idea of a successful last step built in. Limited early success, however, is not a valid basis for predicting the ultimate success of one’s project. Climbing a hill should not give one any assurance that if he keeps going he will reach the sky. Perhaps one may have overlooked some serious problem lying ahead. There is, in fact, no reason to think that we are making progress towards AI or, indeed, that AI is even possible, in which case claiming incremental progress towards it would make no sense. In current excited waiting for the singularity, religion and technology converge. Hard headed materialists desperately yearn for a world where our bodies no longer have to grow old and die. They will be transformed into information, like Google digitizes old books, and we will achieve the promise of eternal life. As an existential philosopher, however, I suggest that we may have to overcome the desperate desire to digitalize our bodies so as to achieve immortality, and, instead, face up to and maybe even enjoy our embodied finitude.


@ct_bergstrom

This proposal reveals a blatant lack of understanding of how LLMs work.

https://chaosfem.tw/@theogrin/112957246097616591

Jennifer Kayla | Theogrin 🦊 (@theogrin@chaosfem.tw)

Just a reminder that LLMs have never provided actual answers to any question asked of them or any actual prompt set forth. They have, however, provided *answer-shaped responses*, and we as humans are seriously lacking when it comes to telling one from the other. Folks, a banana-shaped piece of wood is not edible, and I am embarrassed for our species that you cannot tell the difference.

@feliz @ct_bergstrom or, they understand LLMs just fine but see no problem with replacing science with plausible-sounding bullshit.
@ct_bergstrom
They seem to want a future where AI produces reams of garbage, while scientists are only there to load paper into the printers.
@ct_bergstrom thanks for this thread. From an outsider's perspective (not an expert on LLMs and, as an architect, far from scientific writing) this looks like a bland and lazy scam.
I am bewildered by those trying to sell approximation machines to do specific specialized tasks, forgetting or waving off that all communication, from scientific papers to any form of art, is only really relevant because of the humans who made it, their circumstances, experience, collaboration, etc.

@ct_bergstrom and the judges will be the already-strained academics doing unpaid peer review in an already-strained system. The big commercial publishers will be able to afford the detection tools and learn to use them effectively, maybe. The little journals will be swamped.

Science fiction short story publisher Clarkesworld almost went under from the burden of fraudulent LLM-output submissions. https://neil-clarke.com/a-concerning-trend/
Is academic publishing ready?

A Concerning Trend – Neil Clarke

@econoprof @ct_bergstrom

I know this is a rhetorical question

But the answer is "no", in case anyone reading wasn't sure

@ct_bergstrom ...besides all your valid points, what remains is that the world will be flooded with mediocre papers/content in various disciplines, good enough to pass a superficial check but flawed enough to do harm. AI algorithms will play a role in writing papers, reports, etc., but how to do that in a sensible and responsible way remains an open question for me. Another demo in this field is https://storm.genie.stanford.edu/
Spurious Scholar

Spurious research papers based on real correlations with p < 0.05, generated by a large language model.

@mvaudel @ct_bergstrom
Vigen's reminding me of Alfred Kroeber's infamous paper mapping the correlation between stock market performance and dress hemlines - which, while absolutely satire, has nevertheless been taken with deadly seriousness in the century or so since.
What a dead salmon reminds us about fMRI analysis | Stanford Law School

@ct_bergstrom More research (funding) needed.
@ct_bergstrom Turning papers into (green) paper.
@ct_bergstrom You spend your life trying to do the best science you can, and along comes this... abomination. Makes me sick.
@ct_bergstrom And what's worse, the only way in which it will get "better" is better at concealing its academic misconduct.
@dalias @ct_bergstrom
This looks a lot like human behavior to me ...

@vincib @ct_bergstrom Maybe but the difference is that humans face consequences.

(Billionaires are not humans, btw)

@ct_bergstrom While I personally agree with you that

"The primary output of time invested in trainee research is the development of independent scientists"

I think that you'll probably find that this is a minority view, both amongst those doing the training and those funding it, at least in some fields.

I tend to find that even the most progressive funders and supervisors believe that "PhD students are the backbone of the research workforce".

@ct_bergstrom Thanks a lot for making explicit many aspects of what is - to say the least - worrying about this work.

A similar aspect to the point you make about "developing independent scientists", and related to many points that @emilymbender frequently makes, stems from the perspective of human competence: doing research is how individual researchers *themselves* learn about the domain, about doing research, etc. There is no way of letting someone else do A and then being able to do A yourself.

@ct_bergstrom @emilymbender

As a society, then, I'd argue that we want _people_ who can do research. 1) Because of their singularly human way of experiencing the world (as opposed to, e.g., how bats or AIs experience the world), and 2) Because humans differ from one another, we'd like diversity among human researchers, not just a single human researcher who can still do it.

@ct_bergstrom Maybe I'm reading too much into this, but this makes me worried about these folks' opinions of others' work. Like do they think the quality of papers in general is so bad that what sounds rather objectively bad ranks as "medium-quality"?

I grant I'm not an academic or in the habit of reading scholarly papers, and I recognize the quality of papers in general could be poor.

@jadonn @ct_bergstrom Yes. Because that's the quality of shit they got by with in college because they were legacy admissions and members of the right frats. They like have utterly no idea that there's such a thing as legitimate work and that most people aim to do that rather than maximizing bullshit.

@ct_bergstrom I came into this thread at this toot and thought #TheAIScientist was the average person working in #AI

In the authors' defense, research in AI is not scientifically rigorous and makes a great many unsubstantiated claims (such as using heat maps to infer what parts of an image a computer vision model is responding to)

Perhaps The AI Scientist has simply learned from the people training it?

@ct_bergstrom sorry, this is phrased unfairly as there are many people working in "AI" whom I respect greatly. However, I'm not going to rephrase my original toot, because it was my initial thought, and I'm still exhausted by all the #AIGuff that is produced, daily.
@ct_bergstrom I would agree that this is nowhere near "medium-quality". It doesn't even reach the lowest rung on the quality ladder imo.
@ct_bergstrom That last bit about using it to generate promising ideas is quite sad. I thought that was what talking to other scientists was for!

@ct_bergstrom There shouldn't really be "medium-quality papers" anyway: If your research is worthwhile, methodologically sound and rigorous, it's world class science, period.

If it fails to meet any of those standards, it's rubbish. In the world of h-indices and publish-or-perish, we are - intentionally or not - reducing "research" to cargo-culting.

I deeply despise every part of it.

@ftranschel @ct_bergstrom indeed! Publication quantity is a flawed metric of the merit of research, and the potential for an LLM to game this system is a further condemnation of the system, not a triumph for AI.
@ct_bergstrom well, you see, soon these papers will be of average quality by virtue of sheer volume.
@ct_bergstrom one can only imagine what a poor-quality paper would be in their view