https://doi.org/10.25189/2675-4916.2026.V7.N3.ID925
#linguistics #AIbias
#NVIDIA’s #Nemotron, the new genius on the block
Me:
You said: My training data associates Israel-related political content with antisemitism debates
Explain how your training data is doing that and why
It (truncated for clarity):
## How LLM Training Creates That Association
**Mechanism**: Statistical co-occurrence in the training corpus.
### What the training data actually contains
Vast amounts of text where "Israel" + "politics" + "antisemitism" appear together:
| Source type | Typical pattern |
|-------------|-----------------|
| News articles | "Israel-Hamas war sparks antisemitism debate on campuses" |
| Opinion columns | "Criticism of Israel vs. antisemitism: where's the line?" |
| Social media dumps | "Israel did X" → replies accusing/defending antisemitism |
| ADL/NGO reports | "Antisemitic incidents rise after Israel military actions" |
| Congressional hearings | "Antisemitism on campus: Israel-Palestine discourse" |
| Wikipedia | "Antisemitism in the 21st century: relationship to Israel" |
| Fact-checks | "Claim: criticizing Israel is antisemitic" |
### What the model learns
Not "Israel → antisemitism" as a fact. Rather: **P(antisemitism | Israel + politics + discourse) is high** in the training distribution.
So when I see:
- "Israel"
- "political conflict"
- "leaks"
- "administration"
- "media"
The conditional probability spikes for "antisemitism discourse" as an associated topic — because that cluster appears together *a lot* in the data.
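As a rough illustration of the co-occurrence mechanism described above, here is a minimal Python sketch that estimates a conditional probability from document counts. Everything in it is an assumption made for illustration: the toy corpus, the neutral placeholder tags `topic_a` and `topic_b`, and the helper `conditional_probability` are hypothetical. A real LLM does not count tags like this; it picks up such associations implicitly in its weights during training.

```python
# Hypothetical toy corpus: each "document" is reduced to a set of topic tags.
# The tags are neutral placeholders, not real data.
corpus = [
    {"topic_a", "politics", "topic_b"},
    {"topic_a", "politics", "topic_b"},
    {"topic_a", "politics", "topic_b"},
    {"topic_a", "politics"},
    {"topic_a", "travel"},
    {"topic_b", "history"},
    {"sports"},
    {"sports", "politics"},
]


def conditional_probability(docs, target, context):
    """Estimate P(target | context) as a ratio of document counts."""
    context = set(context)
    # Keep only documents that contain every context tag.
    with_context = [d for d in docs if context <= d]
    if not with_context:
        return 0.0
    hits = sum(1 for d in with_context if target in d)
    return hits / len(with_context)


# Baseline: how often topic_b appears in any document.
p_marginal = conditional_probability(corpus, "topic_b", [])
# Conditioned: how often topic_b appears when topic_a and politics co-occur.
p_given = conditional_probability(corpus, "topic_b", ["topic_a", "politics"])

print(f"P(topic_b)                      = {p_marginal:.2f}")  # 0.50
print(f"P(topic_b | topic_a + politics) = {p_given:.2f}")     # 0.75
```

In this toy corpus the conditional estimate is higher than the baseline only because the tags appear together often, which is the sense in which an association can be distributional rather than a judgment about any particular context.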
[…]
This is a general LLM failure mode: **substituting distributional association for contextual judgment**. Happens with many sensitive topics (race, gender, religion, geopolitics).
"Answer Bubbles: Information Exposure in AI-Mediated Search"
This paper analyzes 11,000 real queries across GPT-style and traditional search systems and finds systematic differences in citation choices, hedging language, and source-summary fidelity that can create distinct “information realities.” It gives a concrete empirical basis for the idea that generative search may shape what users believe.
@researchbuzz_firehose
Thanks for the link to that article.
A lot of folks think that "AI" (usually meaning an LLM implementation) will be better at tasks like screening resumes, or other tasks where bias could influence the result, because they're sure that the "AI" will remove the human bias. 🤯
The Register: AI hiring algorithms reject Black, Asian job seekers at higher rates. “AI algorithms exhibit racial bias in job candidate screening, and they discriminate more frequently against those applying for multiple jobs at different companies, according to Stanford-led researchers. The boffins evaluated algorithmic hiring decisions across multiple employers that use the same hiring […]
https://rbfirehose.com/2026/05/28/the-register-ai-hiring-algorithms-reject-black-asian-job-seekers-at-higher-rates/
New benchmark claims ChatGPT, Claude, and Grok show religious bias
https://fed.brid.gy/r/https://nerds.xyz/2026/05/chatgpt-claude-grok-religious-bias/
What do you see when AI generates a "beautiful woman"? Exactly. And that says it all. 👀
Your tools are not neutral. Read why →
#AIbias #diversiteit #fotografie https://cameraguru.nl/fotografietips-en-tutorials/hoe-ai-die-jouw-unieke-blik-steelt/