| Site | https://jotter.jonathankingston.co.uk/ |
AI agents are clicking buttons with your credentials. WebMCP lets pages declare tools for agents to call, but it trusts the page to be honest. That's the same assumption that gave us phishing.
I wrote about why agentic AI needs a real consent layer, not just better sandboxes.
https://jotter.jonathankingston.co.uk/blog/2026/02/22/consent-is-all-you-need/
Most teams treat skills, MDC rules and system prompts as write-once artifacts, refined by vibes. The post looks at two practical approaches to actually measuring whether they work: deterministic rubric testing and paired comparisons borrowed from RLHF.
https://jotter.jonathankingston.co.uk/blog/2026/02/17/magic-words-need-measuring-sticks/
2. Intercept all loads within the MHTML load and, if the resource is in the archive, serve it from there. The problem here is more complexity: there doesn’t seem to be a lightweight, lower-risk version like there is for 1. So we’re left with lightly manipulating the network stack and DocShell to create the archive and intercept loads.
IIRC that riskier part only adds ~80 lines of code to LoadInfo and DocShell. I think that risk is worth it, and I’ve asked Mozilla to think more deeply about landing this code.
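Conceptually, the intercept-and-serve step reduces to a small decision per load. A Python sketch of that decision, assuming an archive map keyed by URL (the real code would live in LoadInfo/DocShell and look nothing like this):

```python
def intercept_load(url: str, archive: dict) -> tuple:
    """For a subresource load inside an MHTML document: serve the
    resource from the archive if present, otherwise deny it outright,
    so nothing in the archived page can reach the network."""
    if url in archive:
        mime, body = archive[url]
        return ("serve", mime, body)
    return ("deny", None, None)
```

The key property is that there is no fall-through to the network: an archived document either serves from itself or fails.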
The two credible approaches are:
1. Building a parser that manipulates the MHTML at runtime, piecing it all back together into HTML.
2. Creating a simple parser that splits the MIME chunks and registers an archive with a service (a bit like blobs do). The document loader then sets up interception of asset loads, like a service worker does, but serves them from the archive.
1. Done robustly, this would probably need hooks into the HTML and CSS parsing; pragmatically it can be done on the exterior, but that will likely miss stuff.
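The MIME-splitting step in approach 2 can be sketched with Python's stdlib email parser. This only illustrates the data flow, not the Gecko code, and the function name is made up:

```python
from email import message_from_string

def split_mhtml(raw: str) -> dict:
    """Split an MHTML file into its MIME parts, keyed by Content-Location
    (and by cid: URL where a part carries a Content-ID)."""
    archive = {}
    for part in message_from_string(raw).walk():
        if part.is_multipart():
            continue  # skip the multipart/related container itself
        body = part.get_payload(decode=True)  # undoes base64/quoted-printable
        mime = part.get_content_type()
        if part.get("Content-Location"):
            archive[part["Content-Location"]] = (mime, body)
        if part.get("Content-ID"):
            archive["cid:" + part["Content-ID"].strip("<>")] = (mime, body)
    return archive
```

The resulting map is exactly what the archive service in approach 2 would be registered with, and what load interception would serve from.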
I’m now on probably iteration 4 of the approach; importing is orders of magnitude more complicated than exporting, despite Cursor initially suggesting otherwise.
I think there are probably six high-level approaches that could be taken; all have different trade-offs.
I previously tried the approach I currently think is best, but gave up due to complexity. Having built the simplest approach, and seeing someone at Mozilla do the same, reaffirmed that the harder approach is worth trying again.
After a bit more vibing, I have MHTML files opening, with CID reference support.
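For illustration, CID resolution amounts to mapping each cid: URL back to a part in the archive. A toy Python sketch that rewrites markup to data: URLs; the actual support serves the part at load time rather than rewriting HTML, and the names here are invented:

```python
import base64
import re

def resolve_cids(html: str, archive: dict) -> str:
    """Replace cid: references in markup with data: URLs built from an
    archive map of {cid-url: (mime_type, decoded_bytes)}."""
    def sub(match):
        key = match.group(0)
        if key not in archive:
            return key  # leave unresolvable references untouched
        mime, body = archive[key]
        return "data:%s;base64,%s" % (mime, base64.b64encode(body).decode())
    return re.sub(r"cid:[^\"'\s>]+", sub, html)
```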
I've isolated loads to deny network requests and given the file a unique origin to prevent storage leakage.
I've also cribbed Chromium tests to validate the parser, and used a reference test to compare HTML and MHTML rendering output.