Axios used AI to fake an opinion poll
A recent Axios story on maternal health policy referenced “findings” that a majority of people trusted their doctors and nurses. On the surface, there’s nothing unusual about that. What wasn’t originally mentioned, however, was that these findings were made up.
Clicking through the links revealed (as did a subsequent editor’s note and clarification by Axios) that the public opinion poll was a computer simulation run by the artificial intelligence start-up Aaru. No people were involved in the creation of these opinions.
The practice Aaru used is called silicon sampling, and it’s suddenly everywhere. The idea behind silicon sampling is simple and tantalizing. Because large language models can generate responses that emulate human answers, polling companies see an opportunity to use A.I. agents to simulate survey responses at a small fraction of the cost and time required for traditional polling.
Wasn’t it Axios that had that controversy recently where some GitHub admin ended up in a flame war with an AI, and Axios made up quotes?
Or was that someone else?
“the idea is tantalizing”
No the fuck it isn’t, and that’s not even a “Fuck AI” type opinion, just basic fucking scientific principles.
Lying, cheating, stealing, exploitation and propaganda all sound “tantalizing” when you’re a criminally corrupt sociopath.
We’re just lucky capitalism doesn’t reward sociopaths with wealth and power /s
I was interested in this idea, because although LLMs are not good at many things, what they absolutely are good at is taking large data sets of writing and finding a kind of “average” of that data. I can understand why this would make sense. I think it’s a situation where the further you go from the training set, the less reliable your “silicon sample” will be, because it has less and less relevant information to draw from; but I can also kind of see it working in some circumstances.
So, anyway, I have done a little research into this and the concept does show some definite promise. I think this is the study that kicked off the concept, and their results are quite impressive. GPT-3 manages to be close to human respondents on a variety of topics and in a variety of contexts (guessing preferences, tone, word choices, etc.).
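For anyone curious what this looks like mechanically, here’s a rough sketch of the basic idea as I understand it, not how Aaru or the paper actually implements it. The `ask_model` function is a made-up placeholder for whatever LLM API would really be called, and the personas are invented examples:

```python
import random

# Hypothetical persona profiles; in a real study these would be drawn to
# match census or survey marginals for the population of interest.
PERSONAS = [
    {"age": 34, "gender": "woman", "region": "Midwest", "education": "college degree"},
    {"age": 61, "gender": "man", "region": "South", "education": "high school diploma"},
]

QUESTION = "How much do you trust your own doctors and nurses? (a lot / some / not much)"


def ask_model(prompt: str) -> str:
    """Placeholder for a real LLM call; here it just returns a random option."""
    return random.choice(["a lot", "some", "not much"])


def simulate_response(persona: dict) -> str:
    # Persona-conditioned prompt: the model is asked to answer *as* this person.
    prompt = (
        f"You are a {persona['age']}-year-old {persona['gender']} from the "
        f"{persona['region']} with a {persona['education']}.\n"
        f"Survey question: {QUESTION}\n"
        "Answer with one option only."
    )
    return ask_model(prompt)


if __name__ == "__main__":
    # Repeat each persona many times and tally the simulated answers.
    answers = [simulate_response(p) for p in PERSONAS for _ in range(100)]
    for option in ("a lot", "some", "not much"):
        print(option, round(answers.count(option) / len(answers), 2))
```

The whole trick is really just that persona-conditioned prompt; everything downstream is ordinary tallying, which is why the quality of the result depends entirely on how faithfully the model reflects the people being imitated.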
There are some issues I don’t see addressed:
One important part from the article:
These studies suggest that after establishing algorithmic fidelity in a given model for a given topic/domain, researchers can leverage the insights gained from simulated, silicon samples to pilot different question wording, triage different types of measures, identify key relationships to evaluate more closely, and come up with analysis plans prior to collecting any data with human participants.
“Algorithmic fidelity” is a term they seem to have coined in this paper; it refers to how accurately the model reflects the population you are sampling. Roughly what they suggest is: take a known dataset of the population you want to assess, in the general area you are researching, and compare the real results with the LLM results. If this is successful, you have an indication that the model can predict the population/area of interest, and you can then adapt your questions to your specific topic.
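To make that fidelity check concrete, here’s a toy version of the comparison. The distance measure is just one I picked for illustration (I’m not claiming it’s what the paper uses), and every number and the cutoff are invented placeholders, not real survey data:

```python
# Toy "algorithmic fidelity" check: compare the answer distribution from a
# silicon sample against a survey whose human results are already known,
# before trusting the model on new questions in the same domain.

human_benchmark = {"a lot": 0.55, "some": 0.30, "not much": 0.15}  # known human survey (made up)
silicon_sample = {"a lot": 0.62, "some": 0.27, "not much": 0.11}   # LLM-simulated answers (made up)


def total_variation(p: dict, q: dict) -> float:
    """Half the L1 distance between two answer distributions (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


THRESHOLD = 0.05  # arbitrary cutoff, purely for illustration

tvd = total_variation(human_benchmark, silicon_sample)
print(f"total variation distance: {tvd:.3f}")
print("fidelity check passed" if tvd <= THRESHOLD else "fidelity check failed")
```

Only if that gap stays small across several topics where human answers are already known would you have any grounds to extrapolate to questions that haven’t been asked of real people.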
I do think this is quite an interesting and potentially promising use of the technology. Despite the fact it might on the surface seem to be just “inventing” data, in a way the LLM has already surveyed many more heads than any “real” survey could ever hope to. I would like to see more research before being sure of any of this, though; I’m certainly going to continue reading about it to see what limitations there are beyond my first assumptions. GPT-3 is not the latest model, and I wonder how much AI-generated content is out there now… Are the later generations of models starting to eat their own tails? There’s obvious manipulation of online conversations through bots; could someone poison the well in this way and cause these “surveys” to produce skewed results?
No, even in the absolute best case scenario, the LLM analysis is a trailing indicator. There’s no way that it indicates current views, just possibly an indication of past views.
Personally I think this entire line of thinking ("silicon sampling") is dangerous af.
nice astroturfing there, schmuck.
because although LLMs are not good at many things, what they absolutely are good at is taking large data sets of writing and finding a kind of “average” of that data.
who knew that Large LANGUAGE Models do math (they don’t)
gtfo of here with your bullshit.
Yes, but how much of the training data is synthetic data? Because I expect this startup has no idea. Microsoft uses ML to crawl files on OneDrive to build aggregate models of document types, then uses that for LLM training.
It’s just all slop all the way down, huh? Just a fuzzy picture of a fuzzy picture hit with the “sharpen” filter 20 times?
Axios updated the story:
Editor’s note: This story has been updated to note that Aaru is an AI simulation research firm.
But still stands by their claim:
New findings by Aaru, an AI simulation research firm, for Heartland Forward show that a majority of people trust their own doctors and nurses
What kind of bullshit “fact checking” is this?
“New findings by Smegma, an Xbox chatroom research firm, show that your mother is a woman of loose morals who has had sexual intercourse with dozens of Xbox gamers.”
Pretty much this
Also, expect much more of this, if not the vast majority of opinion polls, to be like this.
it’s worse than you probably think… this is the claim made by the garbage company Axios hired for this:
Our simulations go beyond predicting outcomes — they shape them.
So it’s basically “tell me what you want the survey results to be.”