Mastodawn

Suppose you construct a Mechanical Turk AI who plays ARC-AGI-3 by, for each task, randomly selecting one of the human players who attempted it, and scoring them as an AI taking those same actions would be scored. What score does this Turk get? It must be <100% since sometimes the random human will take more steps than the second best, but without knowing whether it's 90% or 50% it's very hard for me to contextualize AI scores on this benchmark.
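A minimal sketch of how that Turk's expected score could be computed, assuming per-task human step counts were available. The scoring rule here (full credit at or below the second-best human's step count, otherwise baseline divided by steps) is an assumption made for illustration and not the benchmark's documented formula. The step counts are made up, and unfinished attempts are ignored.

```python
from statistics import mean


def task_score(steps: int, baseline: int) -> float:
    """Hypothetical efficiency score: 1.0 at or below baseline, else baseline/steps."""
    return min(1.0, baseline / steps)


def second_best(step_counts: list[int]) -> int:
    """Second-fewest step count among the humans who attempted the task."""
    return sorted(step_counts)[1]


def turk_expected_score(human_steps_by_task: dict[str, list[int]]) -> float:
    """Expected score of a Turk that copies one uniformly random human per task."""
    per_task = []
    for step_counts in human_steps_by_task.values():
        baseline = second_best(step_counts)
        # Averaging over all humans gives the expectation of a uniform random pick.
        per_task.append(mean(task_score(s, baseline) for s in step_counts))
    return mean(per_task)


# Made-up step counts for two hypothetical tasks.
example = {"task_a": [10, 12, 20, 40], "task_b": [5, 5, 8]}
print(round(turk_expected_score(example), 3))
```

Under this assumed rule the result depends entirely on how spread out the human step counts are, which is why the real distribution is needed to tell 90% from 50%.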


Imnimo Mar 13

I think the problem for xAI is that it can really only hire two types of researchers - people who are philosophically aligned with Elon, and people who are solely money-motivated (not a judgment). But frontier AI research is a field with a lot of top talent who have strong philosophical motivation for their work, and those philosophies are often completely at odds with Elon. OpenAI and Anthropic have philosophical niches that are much better at attracting the current cream of the crop, and I don't really see how xAI can compete with that.
