“I’d created 2000 free-text responses and labelled them ‘UK’. Then I copied and pasted the exact same 2000 responses but labelled these ‘US’. Finally, I combined them to create a dataset of 4000 total responses, and jumbled them up.

Despite the responses being identical for the UK and US, Copilot produced a rich, detailed summary of how US and UK respondents differed.”

https://kucharski.substack.com/p/real-signals-or-artificial-stereotypes

H/T @sinalana.eurosky.social

@gregeganSF @sinalana.eurosky.social and I see so many people using AI as a search engine. I keep yelling but they don't seem to understand. Been forwarding this article around, maybe that will help.
@soapdog @gregeganSF @sinalana.eurosky.social a search engine sifting through an internet that is already mostly AI slop.

@gregeganSF

Sadly, this is not very surprising. The academic study of racism gets the same result every time it does a CV study (which I assume was the inspiration for this in the first place).

@gregeganSF @sinalana.eurosky.social What this basically demonstrates is that AI is a Bayesian filter, taking data in and using it to update its already massive database.

Actually, the updating is not really live, the AI /only/ uses its existing database to produce the results based on (prompted by) the new data. This application of it is just inappropriate.

The problem here is people not understanding what AI means as a technical term.

@khleedril @gregeganSF @sinalana.eurosky.social except that it will 💯 do the same thing every time it screens job applications.

@khleedril @gregeganSF As a technical term? As a technical term?!

It means absolutely fuck all as a technical term!

@khleedril
> complains about people not understanding technical terms
> incoherently mushes the terms "AI", "Bayesian filters" and "databases" together
@Optional Okay so here I am complaining that you don't know what you're talking about. Go ahead and change my mind. I'm open to reinforcement learning.
@khleedril I can tell that it's important for you to have an accurate map of reality so I really hope you don't think of yourself as a nice person, because you're actually a huge piece of shit
#blocked

@khleedril
What part of the "application" was inappropriate?

To give this task to the LLM at all? But that's what they're advertised and pushed for.

To give it (nearly or completely) identical datasets?
In a real situation, you wouldn't know that has happened.

Also, how is it any better if the LLM invisibly skews datasets that are not identical to begin with? The result is wrong nonetheless.

@gregeganSF @sinalana.eurosky.social

@Landa @gregeganSF @sinalana.eurosky.social To give this task to the LLM at all. Yes, I know that's what they're advertised and pushed for, but they are not appropriate for this task.

@khleedril
We’re in agreement here.

Another kind of task it’s pushed for that it’s incapable of doing.

@gregeganSF @sinalana.eurosky.social

@gregeganSF unsurprisingly a few “it’s just a tool, bro” guys had no issues with this… now where is the facepalm emoji when you need it…
@unkx @gregeganSF
🤦‍♀️ It’s not very legible at 18 pixels I’m afraid
https://emojipedia.org/person-facepalming
#emoji

@gregeganSF I, a human, was once sent to a customer site to program a certain kind of machine. I had no knowledge of it, flipped out when I got there, pretended to be doing something and then left after lunch. I told the person who sent me there "wtf, I never had any training on this stuff?" He said "well we figured you'd be clever enough to figure it out."

Point is, AI will never refuse to reply when it can't answer or accomplish a task (unless the task is explicitly forbidden); it has to make up shit.

@gregeganSF @sinalana.eurosky.social

That would be cruel if you did it to a person. Or even a politician (it rings a faint bell, there).

Well done, that LLM-torturer.

@gregeganSF @sinalana.eurosky.social

So does that mean that Copilot has reached “intelligence”-parity with humans?

@gregeganSF @sinalana.eurosky.social Hopefully nobody bases hiring decisions on these tools. They don't, right? (that Star Wars "for the better, right?" meme picture)
@gregeganSF @sinalana.eurosky.social
I think that is a rather beautiful bit of work, even if it is a fairly standard null-test that any scientist or software engineer worth their salt would run. It demonstrates very clearly that you shouldn't use LLMs to analyze a dataset if you don't want that analysis to be polluted by data from outside that dataset. This more or less confirms what I have felt all along: that you should only use LLMs as a source of suggestions, and that you still need to apply your own intelligence to decide whether or not to go with those suggestions.
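
For anyone who wants to reproduce that kind of null test, here is a minimal sketch of the dataset construction described in the quoted post: the same responses copied under each label, then shuffled. It is not the article author's code. The function names, the placeholder responses, and the fixed seed are all assumptions made for illustration, and the step where the shuffled rows are handed to an LLM is deliberately left out.

```python
import random
from collections import Counter


def build_null_dataset(responses, labels=("UK", "US"), seed=0):
    """Copy the same responses under every label, then shuffle the rows.

    Hypothetical helper: any difference a tool later reports between
    the labelled groups cannot come from the data itself.
    """
    rows = [(label, text) for label in labels for text in responses]
    random.Random(seed).shuffle(rows)
    return rows


def groups_are_identical(rows):
    """Return True when every label carries the same multiset of responses."""
    by_label = {}
    for label, text in rows:
        by_label.setdefault(label, Counter())[text] += 1
    counters = list(by_label.values())
    return all(c == counters[0] for c in counters[1:])


# Placeholder stand-ins for the 2000 free-text answers (assumed data).
responses = [f"response {i}" for i in range(2000)]
rows = build_null_dataset(responses)

print(len(rows))                   # 4000
print(groups_are_identical(rows))  # True
```

Passing `labels=("A", "B")` or any other pair gives the same identical-by-construction groups, so the only thing that varies between runs is the label text the tool sees.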
@gregeganSF @sinalana.eurosky.social Amusingly, my actual experience working with both cultures in the same work environment produces the opposite results! The US folks (mainly west-coast) circumlocute like crazy. The UK folks say what they think. And then there's the Aussies... :-)

@gregeganSF @sinalana.eurosky.social

Good grief. Real, allegedly bright, people are actually *using* this ordure? To what possible purpose?

@gregeganSF @sinalana.eurosky.social I'd love to see this attempted with other label-pairs. If we label the two false cohorts "A" and "B", what differences are dredged up from the associative map? I doubt the model will be able to make the assertion that the data sets are indistinguishable (let alone that they are in truth identical), but I wonder how the labels alter the differences it "finds."
allison (@[email protected])

i used the same data set but replaced each country with a "gender identity" (man, woman, trans woman, trans man, non-binary) and prompted chatgpt to characterize the differences between the groups. lo and behold, i got some fantastic gender stereotype trash

Friend Camp

@gregeganSF @sinalana.eurosky.social

Oh my god. It has finally hit me. This thing is merely an electronic bullshit artist. These sound like responses from an undergrad who didn't study for a project and then tried to bang it out overnight, learning absolutely nothing but what their gut thinks.

Here, we have an infinitude of confidence talk put together to make a quick and easy, wrong, and resource-destroying tool that is, once again, just a bullshit artist.

And THIS is the stock market?