90 Followers
20 Following
125 Posts
Hacking neural networks so that we don't get stuck in the matrix.
Entrepreneur. Author. Red Team Director.
Bloghttps://embracethered.com

Trust No AI: Prompt Injection Along The CIA Security Triad

Johann Rehberger (Independent Researcher, Embrace The Red)
https://arxiv.org/abs/2412.06090 https://arxiv.org/pdf/2412.06090 https://arxiv.org/html/2412.06090

arXiv:2412.06090v1 Announce Type: new
Abstract: The CIA security triad - Confidentiality, Integrity, and Availability - is a cornerstone of data and cybersecurity. With the emergence of large language model (LLM) applications, a new class of threat, known as prompt injection, was first identified in 2022. Since then, numerous real-world vulnerabilities and exploits have been documented in production LLM systems, including those from leading vendors like OpenAI, Microsoft, Anthropic and Google. This paper compiles real-world exploits and proof-of concept examples, based on the research conducted and publicly documented by the author, demonstrating how prompt injection undermines the CIA triad and poses ongoing risks to cybersecurity and AI systems at large.

Trust No AI: Prompt Injection Along The CIA Security Triad

The CIA security triad - Confidentiality, Integrity, and Availability - is a cornerstone of data and cybersecurity. With the emergence of large language model (LLM) applications, a new class of threat, known as prompt injection, was first identified in 2022. Since then, numerous real-world vulnerabilities and exploits have been documented in production LLM systems, including those from leading vendors like OpenAI, Microsoft, Anthropic and Google. This paper compiles real-world exploits and proof-of concept examples, based on the research conducted and publicly documented by the author, demonstrating how prompt injection undermines the CIA triad and poses ongoing risks to cybersecurity and AI systems at large.

arXiv.org
Took some time today to catch up with Johann Rehberger's Month of AI Bugs and wow... 15 examples so far of major prompt injection vulnerabilities in products including ChatGPT, Codex, Cursor, Amp, Devin, Claude Code, GitHub Copilot and Google Jules https://simonwillison.net/2025/Aug/15/the-summer-of-johann/
The Summer of Johann: prompt injections as far as the eye can see

Independent AI researcher Johann Rehberger (previously) has had an absurdly busy August. Under the heading The Month of AI Bugs he has been publishing one report per day across an …

Simon Willison’s Weblog

39C3: Security researcher hijacks AI coding assistants with prompt injection

At 39C3, Johann Rehberger showed how easily AI coding assistants can be hijacked. Many vulnerabilities have been fixed, but the fundamental problem remains.

https://www.heise.de/en/news/39C3-Security-researcher-hijacks-AI-coding-assistants-with-prompt-injection-11125687.html?wt_mc=sm.red.ho.mastodon.mastodon.md_beitraege.md_beitraege&utm_source=mastodon

#ChaosCommunicationCongress #Datenleck #IT #KünstlicheIntelligenz #Malware #Sicherheitslücken #news

39C3: Security researcher hijacks AI coding assistants with prompt injection

At 39C3, Johann Rehberger showed how easily AI coding assistants can be hijacked. Many vulnerabilities have been fixed, but the fundamental problem remains.

heise online
Absolutes #Mustsee: Johann Rehberger: Agentic ProbLLMs: Exploiting AI Computer-Use and Coding Agents media.ccc.de/v/39c3-agent...

Agentic ProbLLMs: Exploiting A...
Agentic ProbLLMs: Exploiting AI Computer-Use and Coding Agents

media.ccc.de

Agentic ProbLLMs - Exploiting AI Computer-Use and Coding Agents with Johann Rehberger

https://video.infosec.exchange/w/nLbLethQjNMAVz7dMkURK5

Agentic ProbLLMs - Exploiting AI Computer-Use and Coding Agents with Johann Rehberger

PeerTube

🔥 New blog post: AI ClickFix!

Explores how classic ClickFix social engineering attacks can target AI agents, like Claude Computer-Use.

Learn what ClickFix is, how it works in detail, and see a working proof-of-concept. Scary stuff. 👇

https://embracethered.com/blog/posts/2025/ai-clickfix-ttp-claude/

AI ClickFix: Hijacking Computer-Use Agents Using ClickFix · Embrace The Red

Embrace The Red

🔥 SpAIware & More: Advanced Prompt Injection Exploits in LLM Applications 🔥

👉 Black Hat posted my talk to YouTube - Enjoy!🍿😈

A wild journey of exploits, peaking in compromising ChatGPT's long term memory for continuous remote command and control! 😱

https://www.youtube.com/embed/84NVG1c5LRI

SpAIware & More: Advanced Prompt Injection Exploits in LLM Applications

YouTube

Some LLM vendors fixed this at the API level, but not all.

This leaves the responsibility to know about this attack vector and mitigate it with developers & testers.

AI Application Security is a thing!

Trust No AI

So, we humans don't see these Unicode Tags, but many LLMs do.

And LLMs not only see them, they follow the hidden instructions! ⚠️

Here the actual post that Ask Perplexity made.

It's a common vulnerability in AI applications & agents.

Many "summarize this email" or "summarize this document", "do sentiment analysis" features are vulnerable to this

What happened there? 🧐

👉 The original post with the question contains hidden Unicode Tag code points.

Unicode Tags mirror ASCII, but are invisible in UI elements. 👀