@xgranade the worst part?
It occurred to me that we can already easily tokenize code, and know if a string of tokens is valid.
So they could just have "start json" and "end json" tokens and refuse to pick invalid tokens in the middle.
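a rough sketch of what i mean by "not pick invalid tokens" — this is just constrained decoding: before sampling, filter the candidate tokens down to the ones that keep the output a possible prefix of valid JSON. the prefix check here is a deliberately dumb toy (it tries closing the open structures and parsing), not anything a real decoder would use:

```python
import json

def is_json_prefix(s: str) -> bool:
    """Toy check: could `s` still be extended into valid JSON?
    Tries appending a few plausible closing suffixes and parsing."""
    for suffix in ("", '"', '"}', "}", "]", '"]', "0}", "0]"):
        try:
            json.loads(s + suffix)
            return True
        except json.JSONDecodeError:
            continue
    return False

def constrained_pick(partial: str, candidates: list[str]) -> list[str]:
    """Keep only the candidate tokens that leave the output
    a plausible JSON prefix; the sampler would choose among these."""
    return [t for t in candidates if is_json_prefix(partial + t)]

# only the token that continues valid JSON survives
print(constrained_pick('{"name', ['": ', '}', ' hello']))
```

a real implementation would mask logits against an actual JSON grammar instead of brute-forcing suffixes, but the shape of the idea is the same.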
So the reason that Claude Code is capable of outputting valid JSON seems to be that, if the prompt text suggests the output should be JSON, it enters a special loop in the main query engine that validates the output against a JSON schema (it looks like the schema just validates that the thing is in fact an object and that its keys are strings), and then feeds the data, along with the error message, back into itself until it is valid JSON or a retry limit is reached.

This code is so eye-wateringly spaghetti that I am still trying to see if this is true, but this seems to be how it not only returns JSON to the user, but how it handles *all* LLM-to-JSON, including internal output from its tools. There appears to be an unconditional hook where, if the JSON output tool is present in the session config at all, then every tool call must be followed by the "force into JSON" loop. If that's true, that's just *mind-blowingly expensive*.

edit: please note that unless I say otherwise, all evaluations here are just from my skimming through the code on my phone and have not been validated in any way that should cause you to be upset with me for impugning the good name of Anthropic

edit2: this is both much worse and not as bad as i thought on first read - https://neuromatch.social/@jonny/116326861737478342
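for the curious, the loop described above would look roughly like this. everything here is a guess reconstructed from my skim — `query_model`, the function name, and the feedback wording are all made up, only the shape (parse, check it's an object, feed the error back, cap the retries) comes from what i described:

```python
import json

def force_into_json(query_model, prompt: str, max_retries: int = 3) -> dict:
    """Hypothetical sketch of the validate-and-retry loop: ask the model,
    try to parse the reply as JSON, and on failure feed the parse error
    back in until it validates or the retry limit is reached."""
    feedback = ""
    for _ in range(max_retries + 1):
        text = query_model(prompt + feedback)
        try:
            obj = json.loads(text)
            # the "schema" check described above: top level must be an
            # object (JSON object keys are strings by definition)
            if isinstance(obj, dict):
                return obj
            error = "top-level value is not an object"
        except json.JSONDecodeError as e:
            error = str(e)
        feedback = f"\n\nPrevious output was invalid ({error}); try again."
    raise ValueError("retry limit reached without valid JSON")

# demo with a fake model that fails once, then succeeds
replies = iter(['{"a": ', '{"a": 1}'])
print(force_into_json(lambda p: next(replies), "give me json"))
```

note the cost implication: every failed parse is a whole extra model round-trip, which is why hooking this after *every* tool call would be so expensive.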