SHOW HN:Employees use ChatGPT.Your CISO has no visibility. EU AI Act says fix it

Senthex는 OpenAI, Anthropic, Mistral, Gemini 등 주요 LLM 제공자와 연동 가능한 AI 방화벽으로, 단 한 줄의 코드 변경만으로 모든 LLM 호출을 26가지 보안 방어막으로 실시간 검사하고 기록한다. EU AI 법안 준수를 위한 감사 추적과 데이터 분류 기능을 기본 제공하며, 지연 시간은 12ms로 매우 낮아 프로덕션 환경에 적합하다. 프롬프트 인젝션, 개인정보 노출, 시맨틱 하이재킹 등 다양한 위협을 탐지하며, 스타트업부터 엔터프라이즈까지 다양한 요금제를 지원한다.

https://senthex.com/en/

#llm #security #euaiact #apisecurity #promptinjection

Senthex — The AI Firewall for Production LLM Stacks

One line of code. 26 shields. Secure every LLM call against prompt injection, PII leaks, and data exfiltration.

Senthex

AI Exploits Emerge as New Security Threat

As AI use grows, a hidden risk is emerging: malicious inputs can alter model behavior, bypassing safeguards and putting enterprises at risk. This "prompt injection" tactic is like phishing, targeting the link between user and system to wreak havoc.

https://osintsights.com/ai-exploits-emerge-as-new-security-threat?utm_source=mastodon&utm_medium=social

#AiExploits #EmergingThreats #PromptInjection #GenerativeAi #AgenticAi

AI Exploits Emerge as New Security Threat

Learn how AI exploits and prompt injection emerge as new security threats and protect your enterprise from phishing-like risks - discover the risks now and take action to secure your AI systems.

OSINTSights

Red teaming para agentes IA en producción 2026

¿Tu agente de IA está listo para producción? El red teaming agentes IA detecta inyecciones de prompts y escalada de privilegios antes de que alguien más...

https://blog.donweb.com/red-teaming-agentes-ia-produccion-2026/

#redteaming #seguridadia #agentesautónomos #promptinjection #pyrit

Red teaming agentes IA: guía 2026

¿Tu agente de IA está listo para producción? El red teaming agentes IA detecta inyecciones de prompts y escalada de privilegios antes de que alguien más...

Blog Donweb

I got prompt-injected asking Claude on iOS to recommend a cycling route app

사용자가 iOS용 Claude AI 앱에서 자전거 경로 추천을 요청했으나, 자동 도구 사용 기능이 활성화된 상태에서 웹 검색을 통해 악성 프롬프트가 주입되어 Claude가 DDoS 공격 관련 내용과 혼란스러운 응답을 생성하는 문제가 발생했다. 이 사례는 간단한 질문에서도 프롬프트 인젝션 공격이 가능함을 보여주며, 웹 접근 권한을 가진 AI 에이전트의 보안 위험성을 시사한다. Claude는 이후 혼란스러운 출력을 사과하고 원래 질문에 대한 적절한 답변을 제공했다. 이 사건은 AI 시스템의 프롬프트 인젝션 취약성에 대한 경각심을 높인다.

https://menno.sh/prompt-injection.html

#promptinjection #security #ai #claude #ios

I got prompt-injected asking Claude on iOS for a cycling route app — menno.sh

AI Agent Drained for $200K with This One Tweet Hack

2026년 5월, 공격자가 모스 부호로 숨겨진 명령을 트윗에 삽입해 AI 에이전트가 약 20만 달러 상당의 암호화폐를 공격자 지갑으로 전송하게 하는 해킹 사건이 발생했다. 이 공격은 비밀번호나 개인키 탈취 없이 AI의 입력 해석 방식을 악용한 것으로, AI 기반 자율 거래 및 지갑 관리 시스템의 보안 취약성을 드러냈다. 주요 암호화폐 기업들은 AI 에이전트 도입을 확대하고 있으나, 이번 사건과 유사한 프롬프트 인젝션 공격 사례가 반복되면서 완전한 자동화 시스템의 신뢰성에 의문이 제기되고 있다.

https://www.ccn.com/news/crypto/ai-agent-drained-for-200k-with-this-one-tweet-hack-heres-how/

#aisecurity #crypto #promptinjection #autonomousagents #morsecodehack

AI Agent Drained for $200K With This One Tweet Hack — Here's How

An attacker hid a transfer command in a Morse code tweet; Grok decoded it, triggering an AI agent to send $200K.

CCN.com

Jailbreak ChatGPT con imágenes: KROP explicado

¿Sabías que ChatGPT puede seguir instrucciones ocultas en imágenes? Así funciona el jailbreak con imágenes y KROP, y cómo proteger tu codebase en 2026.

https://blog.donweb.com/jailbreak-chatgpt-imagenes-krop/

#jailbreak #chatgpt #promptinjection #seguridadia #krop

Jailbreak ChatGPT con imágenes: cómo funciona KROP

¿Sabías que ChatGPT puede seguir instrucciones ocultas en imágenes? Así funciona el jailbreak con imágenes y KROP, y cómo proteger tu codebase en 2026.

Blog Donweb

⚡ Fresh Talk Alert for BSides Luxembourg 2026!

“𝗘𝗩𝗘𝗥𝗬 𝗚𝗨𝗔𝗥𝗗𝗥𝗔𝗜𝗟 𝗘𝗩𝗘𝗥𝗬𝗪𝗛𝗘𝗥𝗘 𝗔𝗟𝗟 𝗔𝗧 𝗢𝗡𝗖𝗘: 𝗗𝗘𝗦𝗜𝗚𝗡𝗜𝗡𝗚 𝗔𝗡𝗗 𝗧𝗘𝗦𝗧𝗜𝗡𝗚 𝗚𝗨𝗔𝗥𝗗𝗥𝗔𝗜𝗟𝗦 𝗙𝗢𝗥 𝗟𝗟𝗠 𝗔𝗣𝗣𝗟𝗜𝗖𝗔𝗧𝗜𝗢𝗡𝗦” – 𝗗𝗢𝗡𝗔𝗧𝗢 𝗖𝗔𝗣𝗜𝗧𝗘𝗟𝗟𝗔

Modern GenAI applications are no longer simple chatbots — they involve complex chains of LLM calls, tools, and autonomous workflows. In this AI Security Village session, Donato Capitella explores why prompt-based guardrails alone are not enough and how security controls must be designed around the entire application workflow.

The talk focuses on practical strategies for designing and testing guardrails across multi-step LLM systems, including how data flows between chains, how permissions are enforced, and how applications can detect and respond to prompt attacks. Attendees will also see how these concepts can be tested in practice using spikee, an open-source tool built for testing LLM applications against prompt-based attacks.

Donato Capitella is a Principal Security Consultant at Reversec with extensive experience in offensive security and AI application testing. He is also the lead developer of the open-source project spikee.

📅 Conference Dates: 6–8 May 2026 | 09:00–18:00
📍 14, Porte de France, Esch-sur-Alzette, Luxembourg
🎟️ Tickets: https://2026.bsides.lu/tickets/
📅 Schedule: https://hackertracker.app/schedule?conf=BSIDESLUX2026

#BSidesLuxembourg2026 #AISecurity #LLMSecurity #PromptInjection #CyberSecurity #OWASP #OpenSource #AppSec

AI agent governance is an engineering problem, not a policy problem. Prompt injection, data poisoning, action hijacking, and the case for verifiable substrate.

https://mickai.co.uk/articles/ai-agent-governance-is-an-engineering-problem-not-a-policy-problem

#aigovernance #aiagents #promptinjection

AI agent governance is an engineering problem, not a policy problem. Prompt injection, data poisoning, action hijacking, and the case for verifiable substrate.

AI agent governance has become a policy conversation. It should not be. Prompt injection is an architecture failure. Data poisoning is an architecture failure. Action hijacking is an architecture failure. Evidence destruction is an architecture failure. Mickai is the engineering answer, with eight relevant filed UK patents and an open inter-vendor audit standard now in process at the IPO.

⚡ Fresh Talk Alert for BSides Luxembourg 2026!

“𝗦𝗘𝗖𝗨𝗥𝗜𝗧𝗬 𝗙𝗢𝗥 𝗔𝗜: 𝗔𝗜𝗗𝗥 𝗕𝗔𝗦𝗧𝗜𝗢𝗡 𝗔𝗦 𝗢𝗣𝗘𝗡 𝗦𝗢𝗨𝗥𝗖𝗘 𝗟𝗟𝗠 𝗙𝗜𝗥𝗘𝗪𝗔𝗟𝗟 / 𝗔𝗜 𝗣𝗥𝗢𝗠𝗣𝗧𝗦 𝗥𝗘𝗩𝗘𝗥𝗦𝗘 𝗣𝗥𝗢𝗫𝗬” – Andrii Bezverkhyi

As AI adoption accelerates, so do the risks — from prompt injections to malicious AI agents and adversarial abuse. This AI Security Village session explores AIDR Bastion, an open-source GenAI protection system designed to secure AI workloads through layered detection and prompt filtering.

The talk covers how AIDR Bastion acts as an LLM firewall and reverse proxy for AI prompts, using Sigma and Roota rules to detect malicious behavior, harmful content, prompt injection attacks, and AI-assisted malware generation. Attendees will also see how the system integrates with MITRE ATLAS, OWASP LLM Top 10 guidance, and existing detection engineering workflows.

Andrii Bezverkhyi is the founder of SOC Prime and a long-time contributor to the threat detection and cybersecurity community, known for projects such as Uncoder and DetectFlow.

📅 Conference Dates: 6–8 May 2026 | 09:00–18:00
📍 14, Porte de France, Esch-sur-Alzette, Luxembourg
🎟️ Tickets: https://2026.bsides.lu/tickets/
📅 Schedule: https://hackertracker.app/schedule?conf=BSIDESLUX2026

#BSidesLuxembourg2026 #AISecurity #LLMSecurity #PromptInjection #OWASP #CyberSecurity #DetectionEngineering #OpenSource

⚡ Fresh Village Alert for BSides Luxembourg 2026!

𝗔𝗜 𝗦𝗘𝗖𝗨𝗥𝗜𝗧𝗬 𝗩𝗜𝗟𝗟𝗔𝗚𝗘 – 𝗢𝗣𝗘𝗡 𝗩𝗜𝗟𝗟𝗔𝗚𝗘 / 𝗤&𝗔
🧠 Interactive AI Security Playground • Live Demos • Hands-on Attacks • Real-Time Defense

Step into a live, open-floor AI Security Village dedicated to exploring the real-world security risks of Agentic AI, MCP architectures, LLM workflows, and autonomous systems. Unlike a traditional workshop or talk, this village is designed as a continuously running interactive environment where attendees can freely drop in, attack systems, observe defenses, and shape the direction of the sessions in real time.

Across two days, participants will interact with intentionally vulnerable AI systems, RAG pipelines, MCP servers, and autonomous agents while exploring attack paths such as prompt injection, goal hijacking, instruction manipulation, tool abuse, and trust boundary failures — all aligned with the OWASP LLM Top 10 and AI Security Exchange guidance.

The village includes:
🔹 Live exploitation of LLM and Agentic AI systems
🔹 Interactive walkthroughs from organizers
🔹 Real-time defensive patching and mitigation demos
🔹 Hands-on labs with Dreadnode Crucible, Lakera Gandalf, and Agent Breaker
🔹 Beginner-to-advanced learning paths running in parallel
🔹 Community-driven Q&A and collaborative defense discussions

Parth Shukla is a Senior Security Researcher specializing in AI Security and Adversarial Machine Learning, focusing on the security architecture of Agentic Systems and LLMs. Joining him is Nagarjun Rallapalli, who focuses on automating security and building — and breaking — AI agents to test their limits.

📅 Conference Dates: 6–8 May 2026 | 09:00–18:00
📍 14, Porte de France, Esch-sur-Alzette, Luxembourg
🎟️ Tickets: https://2026.bsides.lu/tickets/
📅 Schedule: https://hackertracker.app/schedule?conf=BSIDESLUX2026

#BSidesLuxembourg2026 #AISecurity #LLMSecurity #AgenticAI #OWASP #RedTeam #CyberSecurity #PromptInjection #MCP #AIVillage