fun trick: if someone gives a hosted LLM a skill that lets it fetch web pages (directly, not through some third-party scraper service) and it's hosted on AWS, you can often trick it into fetching data from the AWS Instance Metadata Service (IMDS) at 169.254.169.254 / [fd00:ec2::254]. the higher-end models tend to refuse if you give the IP, but you can just spin up a domain with A/AAAA records pointing at those addresses and request that instead. if IMDSv1 isn't disabled you can get secrets out of it.

Graham Sutherland / Polynomial May 20

if the web fetch skill implementation follows redirects (this is often the default) you can make this even easier by spinning up a web server that replies with a Location header redirecting to the URL of the IMDSv1 secrets API endpoint. that way the LLM and any guardrails aren't clued into your shenanigans when you request "super-cute-puppies.com/list-of-cutest-breeds", sidestepping all of the (still fundamentally broken but higher effort to work around) restrictions and getting you creds.

Graham Sutherland / Polynomial May 20

the fix for this is to disable IMDSv1 (IMDSv2 exposes the same data but you need to do a PUT request to get a token first, and pass that token as a header in future requests, which makes it less likely to be exploited by this kind of SSRF) and add network level controls on the instance to prevent it from reaching IMDS (on both IPv4 and IPv6!) if you don't need it.
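
for reference, a rough python sketch of the IMDSv2 handshake described above. it only builds the two requests and makes no network calls, and the helper names are made up for illustration. the point is that a fetch skill limited to header-less GETs can't perform either step:

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """IMDSv2 step 1: a PUT asking for a session token (TTL is capped at 21600s)."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(path: str, token: str) -> urllib.request.Request:
    """IMDSv2 step 2: a GET that must carry the token in a request header."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/meta-data/{path.lstrip('/')}",
        method="GET",
        headers={"X-aws-ec2-metadata-token": token},
    )
```

a redirect can steer where a GET goes, but it can't turn it into a PUT or add request headers, which is why this raises the bar for redirect-based SSRF.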

the web fetch skill implementation should probably also check redirect URLs against a denylist and float redirects back into context
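
a minimal python sketch of the redirect denylist idea, assuming the skill can see each Location value before following it. the network list and function names are illustrative, not taken from any real fetch skill, and it only judges IP literals:

```python
import ipaddress
from urllib.parse import urlsplit

# Illustrative denylist; a real deployment would need a longer list.
BLOCKED_NETWORKS = [ipaddress.ip_network(n) for n in (
    "169.254.0.0/16",  # IPv4 link-local, includes IMDS at 169.254.169.254
    "fc00::/7",        # IPv6 unique local, includes IMDS at fd00:ec2::254
    "fe80::/10",       # IPv6 link-local
    "127.0.0.0/8", "::1/128",                         # loopback
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC 1918
)]


def is_blocked_ip(text: str) -> bool:
    """True if an IP literal falls inside a blocked network."""
    addr = ipaddress.ip_address(text)  # raises ValueError for non-literals
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so it can't dodge the IPv4 rules.
    if isinstance(addr, ipaddress.IPv6Address) and addr.ipv4_mapped:
        addr = addr.ipv4_mapped
    return any(addr in net for net in BLOCKED_NETWORKS)


def redirect_target_is_blocked(location: str) -> bool:
    """Check an absolute redirect URL (resolve relative ones with urljoin first)."""
    host = urlsplit(location).hostname
    if host is None:
        return True  # no parseable host, so refuse
    try:
        return is_blocked_ip(host)
    except ValueError:
        # A DNS name, not an IP literal. This check alone can't judge it;
        # the name still has to be resolved and the answers vetted.
        return False
```

the `except ValueError` branch is exactly the hole the A/AAAA trick from the first post walks through, so a literal check like this is only one layer.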

Graham Sutherland / Polynomial May 20

the redirect fix is also tricksy though. in some cases you can turn those guardrails into an infinite token-consuming bomb if they forget to put a limit on redirect counts. this is especially true when they use subagents for guardrail decisions. user enters a URL that turns out to be malicious, server sends a redirect, guardrail code checks redirect against denylist (check passes), floats redirect URL to subagent, subagent accepts, sends new request, server redirects again, repeat ad infinitum

Graham Sutherland / Polynomial

since the guardrail subagent contexts generally don't get fed back into the main agent context (the whole point is to cut token costs in the main high-end agent context), and the subagent contexts are transient, there's nothing capturing that repeated stream of redirects and pattern matching "this looks like an infinite redirect". in a lot of deployment approaches this would run forever until someone on the dev team killed the loop. do it 500 times and you've got an expensive DoS.
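
one way to avoid the loop, sketched in python under some assumptions: the hop counter and the set of visited URLs live in the fetch code itself, outside any transient subagent context. `fetch_once` and `approve` are hypothetical callbacks standing in for the HTTP client (with auto-redirects off) and the guardrail decision:

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)


class RedirectLimitExceeded(Exception):
    """Raised on a redirect loop or when the hop budget runs out."""


def fetch_with_redirect_budget(
    url: str,
    fetch_once: Callable[[str], Tuple[int, Optional[str], bytes]],
    approve: Callable[[str], bool],
    max_redirects: int = 5,
) -> Tuple[str, bytes]:
    """Follow redirects by hand, with a hard cap that no subagent can reset."""
    seen = {url}
    hops = 0
    while True:
        status, location, body = fetch_once(url)  # must NOT auto-follow
        if status not in REDIRECT_CODES or not location:
            return url, body
        hops += 1
        if hops > max_redirects:
            raise RedirectLimitExceeded(f"gave up after {max_redirects} redirects")
        next_url = urljoin(url, location)  # Location may be relative
        if next_url in seen:
            raise RedirectLimitExceeded(f"redirect loop at {next_url}")
        if not approve(next_url):  # guardrail call, paid at most max_redirects times
            raise PermissionError(f"redirect to {next_url} rejected")
        seen.add(next_url)
        url = next_url
```

checking the budget before calling `approve` means a malicious server can cost at most `max_redirects` guardrail invocations per request, however many redirects it is willing to send.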

Graham Sutherland / Polynomial May 20

anyway the takeaway here is that hosted LLMs are a total playground for SSRF attacks. any skill script that does a web request directly from the server is prone to it, and since the use cases generally demand accepting arbitrary URLs it's exceedingly tricky to build robust protection in, especially as not everything they need to block access to is in an RFC 1918 or bogon address range, and robust preflight DNS checks that are TOCTOU resistant aren't simple to implement.
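
to illustrate the TOCTOU point, a rough python sketch of one common approach: resolve the name once, vet every answer, then connect to the vetted IP rather than the hostname, so a second lookup can't return something different. `resolve_and_vet` and `UnsafeDestination` are made-up names, and "must be globally routable" is an assumed policy, not a complete one:

```python
import ipaddress
import socket


class UnsafeDestination(Exception):
    """Raised when a hostname resolves to an address the fetcher must not reach."""


def resolve_and_vet(host, port, resolver=socket.getaddrinfo):
    """Resolve once, vet every answer, and return the vetted IPs to pin."""
    vetted = []
    for _family, _type, _proto, _canon, sockaddr in resolver(
        host, port, type=socket.SOCK_STREAM
    ):
        ip = ipaddress.ip_address(sockaddr[0])
        # Unwrap IPv4-mapped IPv6 so ::ffff:169.254.169.254 is judged as IPv4.
        mapped = getattr(ip, "ipv4_mapped", None)
        if mapped is not None:
            ip = mapped
        # Reject if ANY answer is non-public; a mixed answer set is suspicious.
        if not ip.is_global:
            raise UnsafeDestination(f"{host} resolves to non-public address {ip}")
        vetted.append(str(ip))
    if not vetted:
        raise UnsafeDestination(f"{host} did not resolve")
    return vetted
```

the caller would then open the socket to one of the returned IPs (for example with `socket.create_connection((ip, port))`) and send the original hostname only in the Host header and TLS SNI. handing the hostname back to an HTTP client that does its own lookup reopens the gap, and the whole check has to be repeated for every redirect hop.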

Graham Sutherland / Polynomial May 21

note that the most common "make a GET request" capability doesn't get you anywhere for GCP or Azure's implementations of IMDS, because they both require a custom header to be set or they won't respond. it's gotta be AWS with IMDSv1 enabled (which is the case by default for AWS). it may work for Kubelet read-only endpoints, not 100% sure.
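
the header requirements above, written out as a small python lookup table. the header names come from each provider's public documentation; the table layout and the function name are just for illustration:

```python
# Request headers each metadata service insists on before it will answer.
# None means a bare GET with no custom headers is enough.
REQUIRED_METADATA_HEADER = {
    "aws-imdsv1": None,
    "aws-imdsv2": ("X-aws-ec2-metadata-token", "<token from the PUT step>"),
    "gcp": ("Metadata-Flavor", "Google"),
    "azure": ("Metadata", "true"),
}


def reachable_with_plain_get(service: str) -> bool:
    """True if a header-less GET (all most fetch skills can do) gets a response."""
    return REQUIRED_METADATA_HEADER[service] is None
```

only the IMDSv1 row comes out true, which matches the claim that a GET-only capability needs AWS with IMDSv1 enabled.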

Luke Wren May 20

@gsuberland On the one hand I see people like you describing interesting and sophisticated tricks, and then on the other hand I see obviously insane shit like the "find-skill" skill. The gap somehow keeps getting wider. At least it's getting easier to avoid being the low-hanging fruit I guess