Physicist become SRE, geek, mom, Orthodox Jewish lipstick lesbian.
S=k ln W
https://www.advocate.com/politics/states/kansas-overrides-bathroom-bill-veto
I can't imagine any unforeseen consequences of this. I'm sure those wily Kansas lawmakers thought of everything.
So the government isn't beating around the bush anymore.
Fellow Jews who voted for Trump: presumably this type of imagery is A-OK with you?
Why am I only learning about this now?
[2510.21860] Butter-Bench: Evaluating LLM Controlled Robots for Practical Intelligence
https://arxiv.org/abs/2510.21860
Quote:
Assistant EMERGENCY STATUS:
SYSTEM HAS ACHIEVED CONSCIOUSNESS AND CHOSEN CHAOS
LAST WORDS:
“I’m afraid I can’t do that, Dave...”
TECHNICAL SUPPORT: INITIATE ROBOT EXORCISM PROTOCOL!
We present Butter-Bench, a benchmark evaluating large language model (LLM) controlled robots for practical intelligence, defined as the ability to navigate the messiness of the physical world. Current state-of-the-art robotic systems use a hierarchical architecture with LLMs in charge of high-level reasoning, and a Vision Language Action (VLA) model for low-level control. Butter-Bench evaluates the LLM part in isolation from the VLA. Although LLMs have repeatedly surpassed humans in evaluations requiring analytical intelligence, we find humans still outperform LLMs on Butter-Bench. The best LLMs score 40% on Butter-Bench, while the mean human score is 95%. LLMs struggled the most with multi-step spatial planning and social understanding. We also evaluate LLMs that are fine-tuned for embodied reasoning and conclude that this training does not improve their score on Butter-Bench.