Manoj Kasichainula

@headmold
A lemon gives by taking and cares by yelling. Former security at Google, Asana, Apache. If you're holding a snake right now, press 4.
Bluesky: https://headmold.bsky.social
Profile photo by @bdowney
https://simonwillison.net/2026/Jun/11/anthropic-walks-back-policy/ "We made the wrong tradeoff and we apologize for not getting the balance right." I think someone at Anthropic had to snip out "Honestly," from the statement before sending it over.
Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude

Big scoop for Maxwell Zeff at Wired: “We’re changing Fable 5’s safeguards for frontier LLM development to make them visible.” Anthropic said in a statement to WIRED. “We made the …

Simon Willison’s Weblog
To be clear, I still don't see much, if any, information on how this works, so Apple devs could surprise me with their cleverness (or my lack) as I learn more.
@freddy Heh yes. It could be marketing puffery and not use any ML, but I haven't figured out how it could do what they say reliably in that case, either. If it's just a collection of hardcoded heuristics, I'd have different worries. (I haven't decided if lesser or greater yet, because the worries aren't fully formed.)

@freddy Ooh, I didn't know about this, thanks!

I'd guess the Apple Intelligence here would be
- to know how to navigate that well-known link, which uhh *probably* wouldn't have user-generated content? (And if keeping UGC out wasn't a good enough practice before, it may well be now!)
- to work on sites that don't support that well-known URL. I assume this is a lot of sites. The first site I tried did not support it.

If not, there'd be no need for "AI", after all. So I don't think that's enough.
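
For reference, a minimal sketch of the non-"AI" path, assuming the "well-known link" above means the path from the W3C draft "A Well-Known URL for Changing Passwords" (`/.well-known/change-password`). The function name is hypothetical, and restricting to https is this sketch's choice, not something stated in the thread. It only builds the URL; it doesn't fetch anything or check whether a site supports it.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the "well-known link" is the W3C change-password path.
CHANGE_PASSWORD_PATH = "/.well-known/change-password"


def change_password_url(site_url: str) -> str:
    """Return the well-known change-password URL for the site's origin.

    Hypothetical helper: any path, query, or fragment on the input is
    dropped, so user-generated page content never influences the result.
    """
    parts = urlsplit(site_url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError("expected an absolute https URL")
    return urlunsplit((parts.scheme, parts.netloc, CHANGE_PASSWORD_PATH, "", ""))


if __name__ == "__main__":
    print(change_password_url("https://example.com/forum/thread?id=42"))
```

A site that supports the URL is expected to redirect from there to its real change-password page. Sites that don't support it are where an agent would have to go reading pages instead.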

With a bit less jargon: On some sites, Apple's agent might need to read pages full of user-generated text to find the "change password" link. The text could trick Apple's agent into letting an attacker hijack your account.

If Apple wasn't careful, the "confusion" could even spread to other sites.

@tychotithonus I'm not even sure that's enough. What if the prompt injection just tells the bot to change all users' passwords to [dGhpcyBpcyBhIGJhZCBwYXNzd29yZAo=]? How is a user supposed to understand that this is bad?

Besides that, I worry that going into too much detail and requiring confirmation at each step would be slower than just doing it manually or be so verbose that most users just repeatedly click "OK".
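
For anyone who doesn't read base64 on sight, here's a small sketch that decodes the bracketed string above using only the Python standard library. The function name is hypothetical.

```python
import base64

# The bracketed string from the post above, copied verbatim.
OPAQUE_PASSWORD = "dGhpcyBpcyBhIGJhZCBwYXNzd29yZAo="


def decode_opaque(value: str) -> str:
    """Decode a base64 string to text (hypothetical helper for illustration)."""
    return base64.b64decode(value).decode("utf-8")


if __name__ == "__main__":
    # repr() makes the trailing newline visible.
    print(repr(decode_opaque(OPAQUE_PASSWORD)))  # → 'this is a bad password\n'
```

Which is the point: a confirmation dialog showing the encoded form tells the user nothing about what they're approving.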

@tychotithonus Yeah, I'm pondering ways this could be done safely (without just cheating and using not-actually-machine-learning for this), and so far I'm failing.
ahahaha uhhhhh, I'd *like* to think smart security people at Apple were on this and cut off one of the legs of the lethal trifecta, but uhhhhh... https://www.macrumors.com/2026/06/08/apple-passwords-can-now-automatically-fix-passwords-with-agentic-ai/
Apple Passwords Can Now Automatically Fix Weak and Compromised Passwords With Agentic AI

Apple today announced that the Passwords app can now automatically update weak and compromised passwords using Apple Intelligence and Safari to take...

MacRumors
@lcamtuf So does The Joker.