Mastodawn

One of the funniest aspects of LLMs is how we went from "Untrusted input needs a Hazmat suit" to "everything is input data, including the names of the functions in the tool's source code"

Show thread

Jordan Petridis 1d ago

You need to read this thread, its legit hilarious.

https://neuromatch.social/@jonny/116324676116121930

Show thread

Felicitas Pojtinger 🌅1d ago

@alatiera The funniest one to me is when people say "oh just use a sandbox for the personal assistants" completely ignoring that as soon as you fetch any relevant 3rd party data into the context as "parameters" those parameters can now make the app and anything consuming the return values do basically anything

"What if we made SQL injections a first-class feature in every app" etc.

Doesn't matter for anything that's not like OpenClaw but still, it's crazy how unaware people are of this

Show thread

Jordan Petridis 1d ago

@pojntfx These things should only be allowed to run on airgap systems that are legit shredded and melted down afterwards. I swear

Show thread

Felicitas Pojtinger 🌅1d ago

@alatiera Lmao

It's kind of a bummer that this is like a fundamental design flaw of them. I have a few friends who demoed what's essentially client-side search engines built on top of this stuff and it's incredible. You can give it queries like "find that one issue with IPv6 in GNOME OS and find what's been changed since the last time I visited" and it will actually find it. You can even have it ping you when things change. But the second you give it any kind of access token you've (1/2)

Show thread

Felicitas Pojtinger 🌅1d ago

@alatiera allowed basically anyone to do anything with your account by pasting a random string in a GitLab comment.

I've seen people add "read-only email access" to their personal assistants, completely forgetting that yk every password reset flow uses the email so it's actually effectively a write permission to every account 🙃

Show thread

Jordan Petridis 1d ago

@pojntfx The NLP aspects of these things are impressive, and even more so when you realize they are working this well by pure accident and word-association.

Show thread

Felicitas Pojtinger 🌅1d ago

@alatiera Yeah exactly

One of the better uses I've recently discovered is the new search on Vercel (terrible company otherwise but yk). You press cmd-k with any query, like "add work account", and it drops you to the right place in the settings in the right project etc.

No mention of LLMs anywhere either, it's just a simple "no keywords found" → fallback to LLM which works very nicely because gosh is finding these things otherwise annoying

Show thread

Felicitas Pojtinger 🌅1d ago

@alatiera Looks like this

Show thread

Felicitas Pojtinger 🌅1d ago

@alatiera In e.g. GNOME settings those kinds of flows are way more frustrating if you don't know where to look

Show thread

pan 1d ago

@pojntfx @alatiera I don't think you need an ai for that tho, unless you wanna go the "hello i would like to change my username please" route

Show thread

Felicitas Pojtinger 🌅1d ago

@pan @alatiera Better tags could help too, yeah, but there is the whole "how fuzzy do you want to be" question. E.g. "change username" - do you show both "change" and "username"? That's going to show you "change computer name", "change wallpaper" etc. as well or just ... nothing, which is what it does rn

Show thread

Felicitas Pojtinger 🌅1d ago

@pan @alatiera Actually, in this case that's because I entered an English query on a French system it turns out haha

Show thread

Felicitas Pojtinger 🌅1d ago

@pan @alatiera Although if you do make it an actual natural language "search" it still breaks. Just "user change" works