So let's note: LLMs of the current design will keep hallucinating and - who would have thought it, apart from everyone with a normal understanding of trust in arbitrary data - are very easy to poison ☠️

https://www.heise.de/news/Data-Poisoning-bei-LLMs-Feste-Zahl-Gift-Dokumente-reicht-fuer-Angriff-10764834.html

Data Poisoning in LLMs: A Fixed Number of Poison Documents Is Enough for an Attack

A new study refutes an old security assumption. It is not the percentage share of the training data but a small, fixed number of poisoned documents that compromises LLMs.

heise online

I hadn't heard of 'data poisoning' before. Are there already digital freedom fighters using it against LLMs? Then again, they aren't really needed - it already happens inherently within the system anyway.

Is there a term (a single word) for a system that destroys itself? (Like capitalism, for example.)

KI = Kranke Informationstechnologie ("sick information technology") 🙃
AI = Anfällige Informationstechnologie ("vulnerable information technology") 🤔

#llm #ki #ai #anthropic #data #datapoisoning

@synapsenkitzler How about "MAD"?
@Katecrawford recently wrote: "...#AI systems degenerate when they are fed on too much of their own outputs—a phenomenon researchers call #MAD (Model Autophagy Disease). In other words, AI will eat itself, then gradually collapse into nonsense and noise. It happens slowly at first, then all at once. The researchers compare it to mad cow disease." https://www.e-flux.com/architecture/intensification/6782975/eating-the-future-the-metabolic-logic-of-ai-slop
btw: really recommend her book "Atlas of AI" https://katecrawford.net/atlas
Intensification - Kate Crawford - Eating the Future: The Metabolic Logic of AI Slop

AI slop isn’t invested in the order of events or even looking like reality. The slop is not the territory: it just smothers it in synthetic goop. It’s flooding the zone with AI shit.

e-flux
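For anyone who wants to see that "eating itself" dynamic without training an actual model, here is a minimal toy sketch in Python (my own illustration, not the experiment behind the MAD term): a trivial "generative model" - just a Gaussian fitted to its data - is refit each generation only on samples from the previous generation, with no fresh real data, and its diversity collapses.

```python
# Toy sketch (an illustration, not the MAD paper's actual experiment): a
# "generative model" that is refit, generation after generation, only on its
# own samples. The model here is simply a Gaussian fitted to the current data;
# without fresh real data mixed in, the fitted spread tends to shrink over the
# generations and the outputs collapse toward a narrow, repetitive
# distribution - a crude analogue of "AI eating itself".
import numpy as np

rng = np.random.default_rng(42)

n_samples = 50                              # small per-generation "training set" makes the drift visible
data = rng.normal(0.0, 1.0, n_samples)      # generation 0: real data

for generation in range(1, 201):
    mu, sigma = data.mean(), data.std()         # "train" the model on the current data
    data = rng.normal(mu, sigma, n_samples)     # next generation: purely synthetic data
    if generation % 40 == 0:
        print(f"generation {generation:3d}: fitted std = {sigma:.4f}")

# Typical output: the fitted std decays from ~1.0 toward 0, i.e. diversity is lost.
```

Mixing in a fixed share of real data each generation slows or stops the collapse in this toy setup, which is essentially the point the quote is making about synthetic outputs flooding the training pool.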
@bkastl

Yep. Now I'm genuinely surprised
🤣
@bkastl The link leads to a different paper; this one should be the right one: https://arxiv.org/abs/2510.07192
Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples

Poisoning attacks can compromise the safety of large language models (LLMs) by injecting malicious documents into their training data. Existing work has studied pretraining poisoning assuming adversaries control a percentage of the training corpus. However, for large models, even small percentages translate to impractically large amounts of data. This work demonstrates for the first time that poisoning attacks instead require a near-constant number of documents regardless of dataset size. We conduct the largest pretraining poisoning experiments to date, pretraining models from 600M to 13B parameters on Chinchilla-optimal datasets (6B to 260B tokens). We find that 250 poisoned documents similarly compromise models across all model and dataset sizes, despite the largest models training on more than 20 times more clean data. We also run smaller-scale experiments to ablate factors that could influence attack success, including broader ratios of poisoned to clean data and non-random distributions of poisoned samples. Finally, we demonstrate the same dynamics for poisoning during fine-tuning. Altogether, our results suggest that injecting backdoors through data poisoning may be easier for large models than previously believed as the number of poisons required does not scale up with model size, highlighting the need for more research on defences to mitigate this risk in future models.

arXiv.org
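A quick back-of-the-envelope sketch in Python of why that finding breaks the percentage-based threat model. Only the 250 documents and the 6B/260B-token corpus sizes come from the abstract; the tokens-per-poisoned-document figure is my own assumption for illustration.

```python
# Back-of-the-envelope arithmetic for the abstract's claim: 250 poisoned
# documents stay constant while the clean corpus grows from 6B to 260B tokens
# (the Chinchilla-optimal range for 600M to 13B parameter models). The
# tokens-per-document figure below is an assumption, not from the paper; the
# point is only how vanishingly small the poisoned *fraction* becomes.
POISON_DOCS = 250
TOKENS_PER_POISON_DOC = 1_000            # assumed for illustration
poison_tokens = POISON_DOCS * TOKENS_PER_POISON_DOC

for params, corpus_tokens in [("600M", 6e9), ("13B", 260e9)]:
    frac = poison_tokens / corpus_tokens
    print(f"{params:>5} model, {corpus_tokens / 1e9:5.0f}B tokens: "
          f"poison share ≈ {frac:.2e} ({frac * 100:.5f} %)")

# The fixed 250 documents shrink from roughly 0.004 % of the smallest corpus
# to roughly 0.0001 % of the largest - so "the attacker controls X % of the
# data" is the wrong threat model if a near-constant count suffices.
```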
@bkastl One more reason to boycott AI and all the products that shove it in your face. ✊🏻