I write software and fiction.
Books! https://amberopposition.com
Barcodes! https://cimbar.org
Github! https://github.com/sz3

Google's autonomous taxi company says it won't obey traffic safety laws because its customers want convenience, so screw those bicyclists who don't want to be injured or killed.

https://road.cc/news/driverless-taxis-veering-into-cycle-lanes-normal-practice-says-waymo

Expecting driverless taxis to respect bike lanes “too high a bar” – because customers want to be dropped off in them, autonomous vehicle firm Waymo tells cyclists

Waymo, the autonomous driving tech firm whose so-called ‘robo-taxis’ are now roaming the streets of London, has allegedly told cycling campaigners that expecting their driverless cars to respect cycle lanes is “too high a bar” – because their customers want to be dropped off in them. But Waymo has denied making such a claim, instead pointing ...

road.cc

The little devil on my shoulder wants me to tell you about https://learn.microsoft.com/en-us/microsoft-365/copilot/copilot-flex-routing

Apparently #Microsoft is not able to get enough compute within EU datacenters to handle #Copilot requests.

Instead, it will do "Flex-Routing", which processes some requests in non-EU datacenters. This is Opt-Out. The only notification was an e-mail to Admins. If they missed that, companies might be leaking PII outside of the EU from tomorrow on.

Get your GDPR Nightmare letters ready!

Flex routing (EU and EFTA)

Learn about flex routing and how it affects inferencing for Microsoft 365 Copilot and Copilot chat during times of peak load.

Age verification is a deliberate attack on system sovereignty, both for individuals and countries. There’s no “age verification”, there is only “identity verification that includes age”, and the system doing that verification is not just a privacy-invasive user tracking system but a remotely controlled off switch for anyone of any age.

Every time I cover E2E verifiable voting in my election technology course, I talk myself out of the idea a little bit more. Very neat in principle, but I think inherently vulnerable in practice to disinformation, at least until more intuitive and simpler schemes are developed.

(E2E verifiable voting uses a lot of fancy cryptography to allow voters and the public to confirm that votes were counted correctly without revealing how any individual voted)

In particular, because these schemes have voters take home a unique code associated with their votes, I worry they open the door to a malicious party claiming (falsely) that the ballot isn't secret. Refuting those claims requires explaining an enormous amount of fairly advanced math, and still rests on some potentially dubious assumptions.

It's an example where improving something may end up reducing trust in it.

Look who’s come crawling back… (it’s me, I’ve come crawling back)

If your Open Source project is seeing a steep increase in the number of high quality security reports (mostly done with AI) right now (#curl, Linux kernel, glibc confirmed) please tell me the name of this project.

(I'd like to make a little list for my coming talk on this.)

The prompt strings have an odd narrative/narrator structure. It sort of reminds me of Bakhtin's discussion of polyphony and narrator in Dostoevsky - there is no omniscient narrator, no author-constructed reality. narration is always embedded within the voice and subjectivity of the character. this is also literally true since the LLM is writing the code and the prompts that are then used to write code and prompts at runtime.

They also read a bit like a Philip K. Dick story, paranoid and suspicious, constantly uncertain about the status of one's own and others' identities.

So the reason that Claude Code is capable of outputting valid JSON is that if the prompt text suggests it should be JSON, it enters a special loop in the main query engine that just validates it against a JSON schema (it looks like the schema just validates that something is in fact an object and its keys are strings) and then feeds the data with the error message back into itself until it is valid JSON or a retry limit is reached.

This code is so eye wateringly spaghetti so I am still trying to see if this is true, but this seems to be how it not only returns json to the user, but how it handles all LLM-to-JSON, including internal output from its tools. There appears to be an unconditional hook where if the JSON output tool is present in the session config at all, then all tool calls must be followed by the "force into JSON" loop.

If that's true, that's just mind blowingly expensive
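To make the cost concrete, here is a rough sketch of the loop as described above. It is not Anthropic's code: `force_json`, `validate_object`, `call_model`, and the retry limit of 3 are all made-up names and values for illustration, and the "schema" is reduced to the check mentioned in the post (top-level value is an object; `json.loads` already guarantees string keys).

```python
import json
from typing import Callable, Tuple

MAX_RETRIES = 3  # hypothetical limit; the real value is not given in the post


def validate_object(text: str) -> dict:
    """Parse text and require a top-level JSON object, else raise ValueError."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level value is not an object")
    return data


def force_json(
    prompt: str,
    call_model: Callable[[str], str],  # stand-in for one full LLM round trip
    max_retries: int = MAX_RETRIES,
) -> Tuple[dict, int]:
    """Return (parsed object, number of model calls made)."""
    text = call_model(prompt)
    calls = 1
    for _ in range(max_retries):
        try:
            return validate_object(text), calls
        except ValueError as err:
            # Feed the bad output and the error back to the model and retry.
            retry_prompt = (
                f"{prompt}\n\nPrevious output:\n{text}\n\n"
                f"Error: {err}\nReturn only a JSON object."
            )
            text = call_model(retry_prompt)
            calls += 1
    # Last attempt: raises ValueError if the output is still invalid.
    return validate_object(text), calls
```

Every failed validation is another full model call with a longer prompt than the last, which is where the expense would come from if this ran after every tool call.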

edit: please note that unless I say otherwise all evaluations here are just from my skimming through the code on my phone and have not been validated in any way that should cause you to be upset with me for impugning the good name of anthropic

edit2: this is both much worse and not as bad as i thought on first read - https://neuromatch.social/@jonny/116326861737478342

jonny (nonvenomous) (@jonny@neuromatch.social)

Attached: 3 images OK i can't focus on work and keep looking at this repo. So after every "subagent" runs, claude code creates *another* "agent" to check on whether the first "agent" did the thing it was supposed to. I don't know about you but i smell a bit of a problem, if you can't trust whether one "agent" with a very big fancy model did something, how in the fuck are you supposed to trust another "agent" running on the smallest crappiest model? That's not the funny part, that's obvious and fundamental to the entire show here. HOWEVER RECALL [the above JSON Schema Verification thing](https://neuromatch.social/@jonny/116325123136895805) that is unconditionally added onto the end of every round of LLM calls. the mechanism for adding that hook is... JUST FUCKING ASKING THE MODEL TO CALL THAT TOOL. second pic is registering a hook s.t. "after some stop state happens, if there isn't a message indicating that we have successfully called the JSON validation thing, prompt the model saying "you must call the json validation thing" this shit sucks so bad they can't even ***CALL THEIR OWN CODE FROM INSIDE THEIR OWN CODE.*** Look at the comment on pic 3 - "e.g. agent finished without calling structured output tool" - that's common enough that they have a whole goddamn error category for it, and the way it's handled is by just pretending the job was cancelled and nothing happened.

neurospace.live
  • Claude code source "leaks" in a mapfile
  • people immediately use the code laundering machines to code launder the code laundering frontend
  • now many dubious open source-ish knockoffs in python and rust being derived directly from the source

What's Anthropic going to do, sue them? Insist in court that an LLM recreating copyrighted code is a violation of copyright???