A thought on prompt injections. Could this defensive countermeasure work?

Before sending off a prompt, hash and sign it using an ... MCP-prompt sign endpoint.

Then, within the prompt, instruct the "agent" that once it has completed its job it should always use the MCP-prompt sign endpoint to sign what it believes is its current prompt.

Once the LLM has completed processing and signed its "current prompt", the original requestor can compare the two signed hashes.

I know I'm missing stuff here, but might this be worth exploring?
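
A minimal sketch of the round trip, under stated assumptions: there is no real "MCP-prompt sign endpoint", so sign_prompt(), the shared key, and the simulated agent step below are all invented for illustration.

```python
import hashlib
import hmac

# Hypothetical shared secret held by the signing endpoint; not a real MCP feature.
SIGNING_KEY = b"example-shared-secret"

def sign_prompt(prompt: str) -> str:
    """Stand-in for the proposed MCP-prompt sign endpoint: HMAC-SHA256 over the prompt text."""
    return hmac.new(SIGNING_KEY, prompt.encode("utf-8"), hashlib.sha256).hexdigest()

# 1. The requestor signs the prompt before sending it to the agent.
original_prompt = "Summarise the attached report and email me the result."
original_sig = sign_prompt(original_prompt)

# 2. The prompt instructs the agent, once finished, to sign what it believes
#    its current prompt is (simulated here by calling the same helper).
prompt_as_seen_by_agent = original_prompt  # would differ if extra instructions were injected
agent_sig = sign_prompt(prompt_as_seen_by_agent)

# 3. The requestor compares the two signatures; a mismatch suggests the prompt
#    the agent acted on is not the prompt that was originally sent.
if hmac.compare_digest(original_sig, agent_sig):
    print("Signatures match: prompt appears unmodified.")
else:
    print("Signature mismatch: possible injection or modification.")
```

The sketch deliberately skips the hard part the post admits to missing: how the agent reliably obtains and reports an untampered copy of what it believes its current prompt is.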

#LLM #AI #PromptInjection

Prompt-inject Copilot Studio via email: grab Salesforce

YouTube

Hacking AI is TOO EASY (this should be illegal)

https://tube.blueben.net/w/vgMT9j9tpNwjUE9DreejG7

PeerTube
With AI agents getting more connected and complex, protecting against indirect prompt injections is vital - especially for teams working with sensitive data. This challenge will only grow as AI adoption spreads. #AI #CyberSecurity #PromptInjection #DataProtection #OpenAI #ChatGPT

A Single Poisoned Document Could Leak ‘Secret’ Data Via ChatGPT

Security researchers found a weakness in OpenAI’s Connectors, which let you hook up ChatGPT to other services, that allowed them to extract data from a Google Drive without any user interaction.

WIRED

Ever imagine your calendar invite could secretly control your smart home? A glitch in Gemini AI is letting hackers slip in hidden commands through Google Calendar, sparking serious cybersecurity alarm. What’s really lurking in your schedule?

https://thedefendopsdiaries.com/understanding-the-gemini-ai-vulnerability-in-google-calendar-invites/

#geminiaivulnerability
#googlecalendarhack
#promptinjection
#cybersecuritythreats
#smartdevicecontrol

Understanding the Gemini AI Vulnerability in Google Calendar Invites

Explore the Gemini AI vulnerability in Google Calendar invites and its implications for cybersecurity.

The DefendOps Diaries

How to outsmart an AI scammer bot | TikTok

It appears that you can also verbally command an AI to give you a cookie recipe?

https://www.tiktok.com/@jessharrison243/video/7536227986396302614

#ai #llm #promptInjection #scammer

TikTok - Make Your Day

"The core problem is that when people hear a new term they don’t spend any effort at all seeking for the original definition... they take a guess. If there’s an obvious (to them) definiton for the term they’ll jump straight to that and assume that’s what it means.

I thought prompt injection would be obvious—it’s named after SQL injection because it’s the same root problem, concatenating strings together.

It turns out not everyone is familiar with SQL injection, and so the obvious meaning to them was “when you inject a bad prompt into a chatbot”.

That’s not prompt injection, that’s jailbreaking. I wrote a post outlining the differences between the two. Nobody read that either.

The lethal trifecta: Access to Private Data, Ability to Externally Communicate, Exposure to Untrusted Content.

I should have learned not to bother trying to coin new terms.

... but I didn’t learn that lesson, so I’m trying again. This time I’ve coined the term the lethal trifecta.

I’m hoping this one will work better because it doesn’t have an obvious definition! If you hear this, the unanswered question is “OK, but what are the three things?”—I’m hoping this will inspire people to run a search and find my description."

https://simonwillison.net/2025/Aug/9/bay-area-ai/

#CyberSecurity #AI #GenerativeAI #LLMs #PromptInjection #LethalTrifecta #MCPs #AISafety #Chatbots

My Lethal Trifecta talk at the Bay Area AI Security Meetup

I gave a talk on Wednesday at the Bay Area AI Security Meetup about prompt injection, the lethal trifecta and the challenges of securing systems that use MCP. It wasn’t …

Simon Willison’s Weblog
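
As a rough illustration only (nothing here comes from Willison's talk; the capability sets and the audit helper are invented for this sketch), the trifecta can be read as a configuration check: flag any agent whose tool set covers all three properties at once.

```python
# Toy audit for the "lethal trifecta": flag an agent whose tools cover all
# three properties. Capability names and example tools are invented.
PRIVATE_DATA = {"read_email", "read_drive", "query_crm"}        # access to private data
EXTERNAL_COMMS = {"send_email", "http_post", "post_webhook"}    # ability to communicate externally
UNTRUSTED_CONTENT = {"read_email", "read_drive", "browse_web"}  # exposure to untrusted content

def has_lethal_trifecta(tools: set[str]) -> bool:
    """True if the tool set touches private data, external comms, and untrusted content."""
    return bool(tools & PRIVATE_DATA) and bool(tools & EXTERNAL_COMMS) and bool(tools & UNTRUSTED_CONTENT)

agent_tools = {"read_drive", "browse_web", "send_email"}
if has_lethal_trifecta(agent_tools):
    print("Warning: agent combines private data, external communication, and untrusted content.")
```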
Attacking GenAI applications and LLMs - Sometimes all it takes is to ask nicely! - hn security

Generative AI and LLM technologies have shown […]

hn security
Red Teams Jailbreak GPT-5 With Ease, Warn It's ‘Nearly Unusable’ for Enterprise

Independent red teams have jailbroken GPT-5 within 24 hours of release, exposing severe vulnerabilities in context handling and guardrail enforcement, and warning that the model is not enterprise-ready.

SecurityWeek