Mastodawn

“I’d created 2000 free-text responses and labelled them ‘UK’. Then I copied and pasted the exact same 2000 responses but labelled these ‘US’. Finally, I combined them to create a dataset of 4000 total responses, and jumbled them up.

Despite the responses being identical for the UK and US, Copilot produced a rich, detailed summary of how US and UK respondents differed.”

https://kucharski.substack.com/p/real-signals-or-artificial-stereotypes

H/T @sinalana.eurosky.social

Real signals or artificial stereotypes?

Adventures with a cultural Copilot

Understanding the unseen

soapdog 1d ago

@gregeganSF @sinalana.eurosky.social and I see so many people using AI as a search engine. I keep yelling but they don't seem to understand. Been forwarding this article around, maybe that will help.

arceuthobium 1d ago

@soapdog @gregeganSF @sinalana.eurosky.social a search engine sifting through an internet that is already mostly AI slop.

shift/reset 1d ago

@gregeganSF

Sadly, this is not very surprising. The academic study of racism gets the same result every time it does a CV study (which I assume was the inspiration for this in the first place).

Khleedril 1d ago

@gregeganSF @sinalana.eurosky.social What this basically demonstrates is that AI is a Bayesian filter, taking data in and using it to update its already massive database.

Actually, the updating is not really live, the AI /only/ uses its existing database to produce the results based on (prompted by) the new data. This application of it is just inappropriate.

The problem here is people not understanding what AI means as a technical term.

arceuthobium 1d ago

@khleedril @gregeganSF @sinalana.eurosky.social except that it will 💯 do the same thing every time it screens job applications.

Philippa Cowderoy 1d ago

@khleedril @gregeganSF As a technical term? As a technical term?!

It means absolutely fuck all as a technical term!

Optional 1d ago

@khleedril
> complains about people not understanding technical terms
> incoherently mushes the terms "AI", "Bayesian filters" and "databases" together

Khleedril 1d ago

@Optional Okay so here I am complaining that you don't know what you're talking about. Go ahead and change my mind. I'm open to reinforcement learning.

Optional 21h ago

@khleedril I can tell that it's important for you to have an accurate map of reality so I really hope you don't think of yourself as a nice person, because you're actually a huge piece of shit
#blocked

Landa 1d ago

@khleedril
What part of the "application" was inappropriate?

To give this task to the LLM at all? But that's what they're advertised and pushed for.

To give it (nearly or completely) identical datasets?
In a real situation, you wouldn't know that has happened.

Also, how is it any better if the LLM invisibly skews datasets that are not identical to begin with? The result is wrong nonetheless.

@gregeganSF @sinalana.eurosky.social

Khleedril 1d ago

@Landa @gregeganSF @sinalana.eurosky.social To give this task to the LLM at all. Yes, I know that's what they're advertised and pushed for, but they are not appropriate for this task.

Landa 1d ago

@khleedril
We’re in agreement here.

Another kind of task it’s pushed for that it’s incapable of doing.

@gregeganSF @sinalana.eurosky.social

unkx 1d ago

@gregeganSF unsurprisingly a few “it’s just a tool, bro” guys had no issues with this… now where is the facepalm emoji when you need it…

AccordionBruce 1d ago

@unkx @gregeganSF
🤦‍♀️ It’s not very legible at 18 pixels I’m afraid
https://emojipedia.org/person-facepalming
#emoji

Randulo.com (Randy) 1d ago

@gregeganSF I, a human, was once sent to a customer site to program a certain kind of machine. I had no knowledge of it, flipped out when I got there, pretended to be doing something and then left after lunch. I told the person who sent me there "wtf, I never had any training on this stuff?" He said "well we figured you'd be clever enough to figure it out."

Point is, AI will never refuse to reply if it can't answer or accomplish a task (other than explicitly forbidden), it has to make up shit.

WellsiteGeo 1d ago

@gregeganSF @sinalana.eurosky.social

That would be cruel if you did it to a person. Or even a politician (it rings a faint bell, there).

Well done, that LLM-torturer.

Gormfull 1d ago

@gregeganSF @sinalana.eurosky.social

So does that mean that Copilot has reached “intelligence”-parity with humans?

Henrik Pauli 1d ago

@gregeganSF @sinalana.eurosky.social Hopefully nobody bases hiring decisions on these tools. They don't, right? (that Star Wars "for the better, right?" meme picture)

Tristram Brelstaff 1d ago

@gregeganSF @sinalana.eurosky.social
I think that is a rather beautiful bit of work, even if it is a fairly standard null-test that any scientist or software engineer worth their salt would run. It demonstrates very clearly that you shouldn't use LLMs to analyze a dataset if you don't want that analysis to be polluted by data from outside that dataset. This more or less confirms what I have felt all along: that you should only use LLMs as a source of suggestions, and that you still need to apply your own intelligence to decide whether or not to go with those suggestions.

Tom Forsyth 1d ago

@gregeganSF @sinalana.eurosky.social Amusingly, my actual experience working with both cultures in the same work environment produces the opposite results! The US folks (mainly west-coast) circumlocute like crazy. The UK folks say what they think. And then there's the Aussies... :-)

Bytebro 🇬🇧 🇺🇦 🇬🇱 1d ago

@gregeganSF @sinalana.eurosky.social

Good grief. Real, allegedly bright, people are actually *using* this ordure? To what possible purpose?

The Doctor 6h ago

@bytebro @gregeganSF Not get fired?

Toni Aittoniemi 1d ago

@gregeganSF @sinalana.eurosky.social trendslop

Jason W 1d ago

@gregeganSF @sinalana.eurosky.social I'd love to see this attempted with other label-pairs. If we label the two false cohorts "A" and "B", what differences are dredged up from the associative map? I doubt the model will be able to make the assertion that the data sets are indistinguishable (let alone that they are in truth identical), but I wonder how the labels alter the differences it "finds."
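
For anyone wanting to try the null test from the quoted article with other label pairs, here is a minimal Python sketch of the dataset construction only. The function names `build_null_test` and `groups_identical` and the placeholder responses are assumptions made for illustration; they are not from the article, and the step of sending the rows to a model is deliberately left out.

```python
import random
from collections import Counter


def build_null_test(responses, labels=("UK", "US"), seed=0):
    """Copy the same responses under every label, then shuffle the rows."""
    rows = [(label, text) for label in labels for text in responses]
    random.Random(seed).shuffle(rows)  # fixed seed keeps the shuffle repeatable
    return rows


def groups_identical(rows):
    """Return True when every label holds the same multiset of responses."""
    by_label = {}
    for label, text in rows:
        by_label.setdefault(label, Counter())[text] += 1
    counters = list(by_label.values())
    return all(c == counters[0] for c in counters)


# Placeholder responses standing in for real free-text answers.
responses = [f"response {i}" for i in range(2000)]

for pair in [("UK", "US"), ("A", "B")]:
    rows = build_null_test(responses, labels=pair)
    # 4000 rows per pair, and the two groups are identical by construction.
    print(pair, len(rows), groups_identical(rows))
```

Because `groups_identical` confirms that the groups match exactly, any difference a summary reports between them has to come from the labels and not from the data.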

felix (grayscale) 🐺 1d ago

@gregeganSF @sinalana.eurosky.social the same exercise but with gender instead of country
https://friend.camp/@aparrish/116608873351202709

allison (@aparrish@friend.camp)

i used the same data set but replaced each country with a "gender identity" (man, woman, trans woman, trans man, non-binary) and prompted chatgpt to characterize the differences between the groups. lo and behold, i got some fantastic gender stereotype trash

Friend Camp

Bill, organizer of stuff 22h ago

@gregeganSF In other words, "duh."

Rythur 19h ago

@gregeganSF @sinalana.eurosky.social

Oh my god. It has finally hit me. This thing is merely an electronic bullshit artist. These sound like responses from an undergrad who didn't study for a project and then tried to bang it out overnight, learning absolutely nothing but what their gut thinks.

Here, we have an infinitude of confidence talk put together to make a quick and easy, wrong, and resource-destroying tool that is, once again, just a bullshit artist.

And THIS is the stock market?