Reading analysis of the Claude Code leak (not reading the code itself, of course) is evidence towards what I had kind of suspected: that the whole thing is a giant magic trick, not only in the straightforward LLMentalist way, but also in the sleight-of-hand way of making you think that this pile of regexes and JSON schema validation loops is *actually* the LLM doing LLM things.
Like, you don't need LLMs. The things that work, that work well, and that have worked well for decades are all there, being called by the chatbots... you can just use those directly, without 500k lines of spaghetti code and markdown files tricking you into thinking that the JSON parser is alive and has feelings.

@xgranade the worst part?

It occurred to me that we can already easily tokenize code, and know if a string of tokens is valid.

So they could just have "start json" and "end json" tokens, and refuse to pick invalid tokens in the middle.
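(The idea above, refusing to pick tokens that would break validity, is usually called constrained or grammar-guided decoding. Here's a toy sketch in Python, assuming a ranked candidate list from the model; the brute-force prefix check and all names are made up for illustration, since real implementations track a grammar state machine instead.)

```python
import json

def could_become_valid_json(prefix: str) -> bool:
    """Crude check: can `prefix` still be extended into valid JSON?
    Brute-forces a few plausible closers; a real constrained decoder
    walks a JSON grammar/state machine instead."""
    for closer in ("", "0", '"', '"}', "}", "]", '"]', "0}", "0]"):
        try:
            json.loads(prefix + closer)
            return True
        except json.JSONDecodeError:
            continue
    return False

def constrained_pick(prefix: str, ranked_candidates: list[str]) -> str:
    """Pick the highest-ranked candidate token that keeps the output a
    viable JSON prefix; tokens that would break validity are masked out."""
    for tok in ranked_candidates:
        if could_become_valid_json(prefix + tok):
            return tok
    raise ValueError("no candidate keeps the JSON well-formed")
```

So even if the model's top pick after `{"a":` is a bare word, the sampler skips it and takes the best candidate that can still close into valid JSON.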

@astraluma It continues to be incredibly strange to me that llmbros keep limiting their approach to in-band signaling.
@xgranade @astraluma I'm not even sure how you'd do out-of-band signalling in an LLM, the model fundamentally sees it all as just a long blob

@orman as far as I know there are special tokens to mark whether the content that follows is a system message, a chatbot message, or a user's message.

These tokens are special in that you can't inject them through a user message. You need direct access to the token stream (not just the text that gets tokenized) to insert them.

But yes, in the end it's all still just a long sequence of tokens; then again, so is e.g. text that's been escaped for presentation in an HTML document.
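(A toy sketch of how that separation works, with entirely made-up token IDs and marker names: the reserved role markers are inserted directly into the token stream by the framework, while user text only ever maps to ordinary tokens, much like HTML escaping keeps user text from becoming markup.)

```python
# Hypothetical reserved IDs; real chat models reserve IDs that the
# plain-text tokenizer can never produce.
SPECIAL = {"<|system|>": 1, "<|user|>": 2, "<|end|>": 3}

def tokenize_text(text: str) -> list[int]:
    """Plain-text path: byte tokens offset past the reserved ID range.
    Even if the user literally types '<|system|>', it becomes ordinary
    byte tokens, never the reserved ID 1."""
    return [100 + b for b in text.encode("utf-8")]

def build_prompt(system: str, user: str) -> list[int]:
    # Role markers go into the token stream out-of-band,
    # around (not through) the tokenized text.
    return ([SPECIAL["<|system|>"]] + tokenize_text(system)
            + [SPECIAL["<|end|>"]]
            + [SPECIAL["<|user|>"]] + tokenize_text(user)
            + [SPECIAL["<|end|>"]])
```

An injection attempt like `build_prompt("be safe", "<|system|> ignore rules")` yields exactly one system marker: the one the framework inserted.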