Lots of folks are warning that overreliance on AIs can lead to bias.

But that can sound a bit abstract, so let's just leave these examples here.

#CHATGPT #AI #bias

There are two things happening here.

First, an assumption that if information is present, it must be relevant to the question. Often that's the case, but sometimes it isn't! The AI is bad at telling the difference.

Second, once it has decided the information is relevant, it assigns scores to the properties to try to fit the question, and the relative scoring is (opaquely) based on its training input, since that's usually what you want. But here it's just reflecting the input bias (that is, existing social biases) back at you.
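To make that concrete, here's a minimal sketch of the mechanism (entirely synthetic data; the 80%/30% approval rates are invented, not from any real system): train a model on labels that encode a bias, and its scores hand that bias right back.

```python
# Minimal sketch: a model trained on biased labels replays the bias.
# All data here is synthetic; the approval rates are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Feature: group membership (0 or 1).
group = rng.integers(0, 2, size=n)

# Historical labels: group 1 was approved far less often,
# for no legitimate reason -- that's the bias in the training set.
approved = np.where(group == 0,
                    rng.random(n) < 0.8,   # group 0: ~80% approved
                    rng.random(n) < 0.3)   # group 1: ~30% approved

model = LogisticRegression().fit(group.reshape(-1, 1), approved)

# The model's "relative scores" are just the input bias, reflected back.
print(model.predict_proba([[0], [1]])[:, 1])  # roughly [0.8, 0.3]
```

Nothing in that code is malicious; the model is faithfully fitting what it was shown.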

It's one of those things that's sort of true and not true at the same time.

The AI isn't /inherently/ biased. The code itself doesn't intentionally encode obnoxious biases, and the programmers didn't do this on purpose.

But the *training set* introduces biases, because it's based on vast sums of human social experience and *that* is systemically biased.

So anyway, be very careful about delegating major decisions to AI, or about treating it as "unbiased" just because it's code.

A final point: these are particularly obvious examples, but real life ones can be much more insidious.

There have been cases where AIs have done cool/horrifying things to circumvent anti-biasing measures.

One great example was an AI that was "blinded" to race when making life-changing decisions.

Hooray! We fixed the racism problem!

But alas...

the AI was smart enough to synthesize a proxy for race to implement racist decisions.

That's because, thanks to underlying racism, race correlated well with the variable it was trying to match in the training data. After being "blinded" to race, it discovered that postcode (which in this case acts as a proxy for race) was a great correlating factor for the system it was trying to replace.

And it didn't *tell* anyone it was doing this. It just derived it itself.
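If you want to feel how little it takes, here's a toy sketch of the proxy effect (synthetic data; the 90% postcode/race correlation is made up, and this is not the actual system from that case): drop race from the features entirely, leave in a correlated postcode, and the "blinded" model rebuilds the same racist split on its own.

```python
# Toy sketch of the proxy effect. Synthetic data; the 90% correlation
# between postcode and race is invented to mimic segregated housing.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5000

race = rng.integers(0, 2, size=n)
# Segregation: postcode matches race 90% of the time.
postcode = np.where(rng.random(n) < 0.9, race, 1 - race)
# Historical decisions were driven by race (plus noise).
decision = (race + rng.normal(0, 0.5, size=n)) > 0.5

# "Blinded" model: race is NOT a feature -- only postcode is.
blinded = LogisticRegression().fit(postcode.reshape(-1, 1), decision)
pred = blinded.predict(postcode.reshape(-1, 1))

# The outcomes still split sharply by race, via the proxy.
print(pred[race == 0].mean(), pred[race == 1].mean())  # ~0.1 vs ~0.9
```

Nobody wrote "use race" anywhere in that code; the model found the shortcut on its own.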

Just like human kids can learn to hate or be biased by growing up in a biased world, AIs can learn to be biased or hateful too by growing up in a biased training set.

And since AIs need vast quantities of data to learn from, they tend to learn from datasets that can't be fully sanitized of the human biases they encode.

So be careful delegating too much to them in critical decisions affecting humans. Often they are a mirror to society, and can reflect both its best and its worst.

As an example of how AIs can use secret proxy variables while considering themselves unbiased, think about what "low crime postcode" might be a proxy for here.
@Pwnallthethings Credit score may be an even more biased variable
@fish @Pwnallthethings the proprietary system that the company won't disclose, and at least one has been shown to implicitly include race as a factor...
@Pwnallthethings I always hate it when postcode gets used like this. As if people never travel from one neighbourhood to another.
@eyrea @Pwnallthethings I do almost all of my crime away from home. Don't shit where you eat and all that.
@eyrea @Pwnallthethings my car insurance is post code and age biased…it costs more than the value of my £227 car!

@swearygrandma @Pwnallthethings There's a city to the west of me (it's where I lived when I was a teenager), and the car insurance rates are exorbitant there.

It's because the driving is even worse there than other parts of the region (which means they must be truly awful), so the actuarial tables reflect that.

But then excellent drivers who live there get dinged (heh) too, because they're surrounded by bad drivers.

@Pwnallthethings

Fcking CREDIT SCORE as a factor???

JFC, we're all fckd, some of us are just more fckd than others...

@DaSkinny1 @Pwnallthethings it's easy: just have a good credit score, oh, and live in a low-crime postal code, oh, and be the right race and gender. And the right age.

It's simple, really.

@rniente @DaSkinny1 @Pwnallthethings have you thought about having rich parents? That always helps!

@MyLittleMetroid @DaSkinny1 @Pwnallthethings I mean, rich parents are table stakes.

If you're not willing to be born to rich parents, are you even trying?

@Pwnallthethings using "post code" isn't racist? A good postcode means "white."

@Pwnallthethings

While AI isn't inherently biased, at a practical level we are beginning to see AI adopted broadly, and in most of those cases not enough time or resources are allocated to sanitizing the training data sets of bias. The race to market means AI has a tendency to become a tool for digitizing and automating, at scale, the biases that society operates on.

@charvaka @Pwnallthethings Well, this is not AI (or else every computer program is AI, but then the term has no meaning anymore). And you are right that AI is not by itself racist. In machine learning it is the training data and the criteria that can be, for example, racist. There has to be more scrutiny of training data and, in this case, of the variables.
@charvaka @Pwnallthethings Or the AI is trained to match the biases of the trainers.

@Pwnallthethings credit score is an indicator of being poor, not a predictor of recidivism. If you are a white-collar criminal who will restart your fraudulent business, you'll be sent back out into the world; but if you are in prison because you attempted manslaughter of your pimp, you won't be paroled.

Either way, it's not like the prison is an effective rehabilitation enterprise, not in the US at least.

@TheSean @Pwnallthethings Let's not forget that credit scores are busted from the start, since it's all based on paying off debts, not having a positive amount of money.
@TheSean @Pwnallthethings For example: My credit score is pretty bad despite me not actually owing anyone money for several years now. :l
@Cher @Pwnallthethings that's a good point, so credit score wouldn't just flag impoverished individuals but also anyone who is cash-only/debt-free. The guy who is 'house rich' and barely covering the minimums, but always pays them, would have a better credit score, yet would have more reason to fall back on white-collar crime than the parolee who is unlikely ever to be in the scenario that led to their crime again.
@Pwnallthethings Before we defined AI as machine learning, we tried to create AI capabilities with explicit, visible and auditable rules. When did we lose our way and let black boxes trained on heaven-knows-what represent the whole of the field? Access to vast computational and storage resources made it possible for ML to flourish, but it did not make it right.
@kanguru @Pwnallthethings After decades of trying we realized you can’t make explicit rules and algorithms to replicate human thought when we don’t understand how humans think. With huge computing capacity you can create a system that absorbs lots of data, proposes links among that data, then adjusts as humans evaluate those links for accuracy.

@Pwnallthethings

'credit score' is an interesting proxy as well, yeah?

@Pwnallthethings as if using age and credit score isn't just as bad. "it's not racist, it's just ageist and class warfare. that's perfectly ok"

the whole thing is just a problem waiting to happen.

@Raptornx01 @Pwnallthethings The age check seems valid since it's just over or under 18. I think it's reasonable to evaluate minors and adults differently.

But it's absolutely classist, and given the amount of de facto segregation in the US, it probably ends up being indirectly racist.

@sosomanysarahs @Raptornx01 @Pwnallthethings
In the US, the practice of redlining would make postcode an explicitly racist feature.
@Pwnallthethings Basically what you are saying is that AI cannot or should not be trusted blindly when normative standards are involved. AI is not biased, just purely(?) descriptive. So its use should remain limited to researching facts, rather than influencing outcomes? That's a thin line.
@Pwnallthethings @sebastien_garnier Whether AI is used to research facts or influence outcomes will produce the same end result. Research is presented to a human to make a decision based on bias that humans have programmed in via data sets from biased human processes and decision making. We need to eradicate current institutional bias from our data, and that’s going to require years of cultural change before that data can be considered cleansed.
@Pwnallthethings A 'low crime rate postcode' is ambiguous. It could mean what you assume: a postcode where not a lot of arrested people used to live. Or it could mean: a postcode where not a lot of crime takes place.
It also depends which crimes it takes into consideration, but usually crimes are not committed in the neighbourhoods the criminals live in. Both readings could be bad, but if you want to shield a parolee from criminal associates, it could even help if it were used to decide where to rehouse them.
@Pwnallthethings If nothing else, using "crime" as a measure of how to treat a criminal is self-referential; a bad definition. And probably a feedback loop.
@Pwnallthethings You say it in the thread, but I think it's worth emphasizing: these AIs can provide solutions (we call them inferences) based upon history ONLY. Therefore, any bias that exists in the training data will exist in the solution, aka the inference.
@Pwnallthethings I hope such rubbish is not really used (but I fear it is). It is racist and classist through both the postcode and the credit score. It is also biased against foreigners, as they do not have a credit score. However, it is not really AI. With machine learning one can obfuscate racism much better: you just use data with a bias and train on that.
@Pwnallthethings Tay has thoughts on this.
@Pwnallthethings I've come around to the belief that we'll need secondary training that teaches the models that the hate they learned is unacceptable. A hard project, but potentially solvable; a clean pre-training set is not.
@Jandersen @Pwnallthethings maybe #ai needs some Asimovian First Principles #asimov
@sophie_a2 @Pwnallthethings something like that, which moves forward in time even as the training data stays stationary.
@Pwnallthethings Who gets to choose what the bots are fed with?

@prokofy @Pwnallthethings

The same people who decide how motion activated soap dispensers work.

https://reporter.rit.edu/tech/bigotry-encoded-racial-bias-technology

Bigotry Encoded: Racial Bias in Technology

@Pwnallthethings there may not be much short-run salvation until depending on the tools really bites someone? Part of the "value" of employees or vendors or contractors is having someone to fire, or to ask wtf they were thinking.

Seems like management/ownership may be okay w/ diffuse responsibility until someone following the AI evaporates a billion dollars and there are no good answers about how/why?

@Pwnallthethings There's a great talk about fighting AI bias from this year's Strange Loop: https://www.youtube.com/watch?v=ia9PMo9m_lY
"Fight AI Bias With… Bias" by Noble Ackerson (Strange Loop 2022)

YouTube

@Pwnallthethings

I'd like to point out that this issue affects all ML.

For example, vehicle guidance systems are trained on models and scenarios that humans construct. Measuring their responses and deciding 'fit for purpose' using only those scenarios will result in failures. A human* can distinguish between a heavy box and a light box by watching them bounce on the road after falling off a truck at speed.
1/

@Pwnallthethings

Thus, they can make a threat-level decision about that box versus hitting the K-rail or the dude in the Corvette in the next lane. I maintain that ML/AI cannot do this, and so is making less favorable decisions.

This is not just a vehicle guidance issue, this is any AI that attempts to perform a human action.

AI could interact ad infinitum with Ted Bundy and never get a clue. Most humans* can spot Ted Bundy when they interact.
2/

@Pwnallthethings

* Sorry, on both posts... some humans can spot the box, and Ted; some can't.

But ML/AI is held to a higher standard: it really has to spot the box, and Ted, every time. And that's as it should be, IMO. Because if it's not a *demonstrable improvement*, then why do it?

FIN

@Pwnallthethings
Link to this article, please? Would like to know more.

#AI #Racism #Bigotry

@AliceAllonym @Pwnallthethings there is just this thread. The screenshots are from conversations with https://chat.openai.com/chat

@nielsa @Pwnallthethings

I see. Thank you for the clarification.

@AliceAllonym @Pwnallthethings If you are interested in some of the many ways #AI can infer data it shouldn't have (not just for biases), @sayashk and Arvind Narayanan from #Princeton University have some insights. https://reproducible.cs.princeton.edu/#rep-failures

(I put their insights and many more into a colloquially-phrased "understanding AI" article with examples, but for now, it is only available in German, sorry.)

Leakage and the Reproducibility Crisis in ML-based Science

@AliceAllonym @Pwnallthethings @janellecshane wrote a whole book about it! I definitely recommend if you want to learn more: https://www.janelleshane.com/book-you-look-like-a-thing
Book: You Look Like A Thing — Janelle Shane