First person to prompt-inject virtual Zuck into giving them a billion dollar bonus gets a billion dollars!
Oh, what's that, Meta doesn't trust virtual Zuck to handle real money? Why is that, too menial for a superintelligence? 🤔
RE: https://esq.social/@D_J_Nathanson/116420731934704356
Today's #AIIsGoingGreat: It's easy to imagine a product like this "fixing" the hallucinated citation problem by using non-AI code to look up citations in Westlaw's database, and nagging the model to fix ones that don't check out. Which will get you valid citations, but unfortunately for Westlaw and the #ChatGPTLawyer in this case, verifying the citation actually supports the thing it's cited for is an entirely different and much harder problem
https://mastodon.social/@D_J_Nathanson@esq.social/116420732051053223
I on the other hand predict that a great deal of entertainment will ensue!
(for outsiders watching the trainwreck unfold)
https://mastodon.social/@gamingonlinux/116453942800480300
"To reach that bare minimum of 7 percent, Gartner forecasts that large AI companies would need to earn cumulatively close to $7 trillion in AI-driven revenue through 2029" - It's OK, I'm sure some banner ads and horny chatbots will cover it
Google's search for in-the-wild prompt injection involved some regexes and… feeding the content to an #LLM? 🤨 "These candidates were then processed by Gemini to classify the intent of the suspicious text, and to understand whether they were part of the overall document narrative or suspiciously out of place"
and no, they do not discuss whether Gemini was successfully prompt-injected by any of the content it examined
More seriously, they assess that most of what they found was jokey and/or low sophistication but there's no discussion of whether they encountered anything likely to succeed against commonly used AI tools
https://security.googleblog.com/2026/04/ai-threats-in-wild-current-state-of.html
Today's #AIIsGoingGreat (HT everyone) is of course the sloperator who vibe-deleted their prod database… and also somehow took at face value the "confession" of the bot which (per their story, at least) deleted their database. Notably they quote the "confession" verbatim but not the prompt that triggered it
"frontier models fail to accurately predict their own token usage (with weak-to-moderate correlations, up to 0.39) and systematically underestimate real token costs" - Approaching parity with human programmers' cost/schedule estimation!

The wide adoption of AI agents in complex human workflows is driving rapid growth in LLM token consumption. When agents are deployed on tasks that require a significant amount of tokens, three questions naturally arise: (1) Where do AI agents spend the tokens? (2) Which models are more token-efficient? and (3) Can agents predict their token usage before task execution? In this paper, we present the first systematic study of token consumption patterns in agentic coding tasks. We analyze trajectories from eight frontier LLMs on SWE-bench Verified and evaluate models' ability to predict their own token costs before task execution. We find that: (1) agentic tasks are uniquely expensive, consuming 1000x more tokens than code reasoning and code chat, with input tokens rather than output tokens driving the overall cost; (2) token usage is highly variable and inherently stochastic: runs on the same task can differ by up to 30x in total tokens, and higher token usage does not translate into higher accuracy; instead, accuracy often peaks at intermediate cost and saturates at higher costs; (3) models vary substantially in token efficiency: on the same tasks, Kimi-K2 and Claude-Sonnet-4.5, on average, consume over 1.5 million more tokens than GPT-5; (4) task difficulty rated by human experts only weakly aligns with actual token costs, revealing a fundamental gap between human-perceived complexity and the computational effort agents actually expend; and (5) frontier models fail to accurately predict their own token usage (with weak-to-moderate correlations, up to 0.39) and systematically underestimate real token costs. Our study offers new insights into the economics of AI agents and can inspire future research in this direction.
A technology so compelling and transformative it needs a billionaire-backed astroturf campaign to promote it
#AIIsGoingGreat who could have predicted that an always-on sycophantic delusion machine could send vulnerable people into delusional spirals?
"The White House has asked a group of tech companies to answer a set of questions this week about how to ward off digital attacks that frontier artificial intelligence tools could soon enable" - which sounds like an invitation to burn taxpayer billions on AI cyber snake oil but… "The four people said some industry representatives were confused by the questions they received, several of which were seen as vague"
https://www.politico.com/news/2026/04/30/white-house-ai-cyber-threats-mythos-00902045
#AIIsGoingGreat "Banks are hunting for new ways to offload risks tied to a glut of data centre debt as the race to build AI infrastructure stretches financing limits among the largest global lenders … Lenders, including JPMorgan and MUFG, have spent more than six months distributing $38bn of construction debt tied to a data centre project leased to Oracle in Texas and Wisconsin"
https://www.ft.com/content/08aba5e4-5834-4e79-a48d-989a2c5bad0f
#AIIsGoingGreat "An acclaimed Canadian fiddle player has launched a $1.5m civil lawsuit against Google, alleging that the online giant defamed him by falsely identifying him as a sex offender in an AI-generated summary of his life and career… he had learned of the inaccurate information when the Sipekne’katik First Nation cancelled a concert appearance planned for 19 December, after members of the public complained, citing the misinformation they read on Google"
Also:
1) The Sipekne’katik First Nation later issued a public apology to MacIsaac, saying: “Decisions were based on incorrect information generated through an AI-assisted search, which mistakenly associated you with offenses unrelated to you. We deeply regret the harm this caused to your reputation and livelihood.”
2) MacIsaac’s lawsuit alleges that Google had never contacted him or offered an apology over the error
Given the fairly irrefutable concrete harm involved, I predict Google will settle, and the "can tech megacorps be held liable for the things their BS machines say" question will be kicked down the road again
RE: https://infosec.exchange/@agreenberg/116533336872355044
Democratizing Software Development™ is going great
https://mastodon.social/@agreenberg@infosec.exchange/116533336935148460
RE: https://mastodon.online/@AstroMikeHudson/116532732772011276
Also good from this piece: "What AI companies want is the financial upside of mass adoption without the ordinary obligations that come with selling something that malfunctions"
https://mastodon.social/@AstroMikeHudson@mastodon.online/116532732761820735
'In a statement provided to Ars, [OpenAI spokesperson] Drew Pusateri, described Nelson’s death as a “heartbreaking situation” and expressed that “our thoughts are with the family.” However, Pusateri also emphasized that the ChatGPT model implicated is “no longer available” and suggested that current models are safer' - Ah yes, I'm sure that will be a huge comfort to the parents whose kid died following the advice of the old model
Supplemental #AIIsGoingGreat (ht @deborahh¹) Ontario auditor general examines AI medical transcription bots and finds most of those approved for use in the province produce “incorrect information, AI hallucinations and incomplete information” such as "recorded a different drug than what was prescribed" and "fabricated information and made suggestions to patients’ treatment plans"
¹ https://mastodon.social/@deborahh@cosocial.ca/116563784830411901
The test was part of what appeared to be a largely pro forma prequalification to allow vendors to be on the approved vendor list. Accuracy only amounted to 4% of the score, and there was no minimum score for the accuracy component. The auditor dinged Supply Ontario for the latter two. Nothing suggests any vendors were required to make changes, and as we all know, the industry does not know how to fix the underlying problems
https://www.auditor.on.ca/en/content/specialreports/specialaudits/en2026/AR_2026_AI_EN.html
The auditor also dinged Supply Ontario for not actually requiring the vendors to demonstrate the product. They just sent a recording and required vendors to pinky swear the transcript was from the product
(OTOH, the terrible accuracy and lack of penalty for it may be a good indication that most of the vendors didn't cheat)
https://www.auditor.on.ca/en/content/specialreports/specialaudits/en2026/AR_2026_AI_EN.html
A technology so transformative and compelling that curators of one of the most successful repositories of scientific knowledge say "use it once and you're outta here"
(slight exaggeration: it's actually only for slop-related misconduct like fake citations or things that were obviously not proofread by a human, not use per se)
https://www.404media.co/new-arxiv-rules-ai-generated-papers-ban/
I think arXiv's policy is good, but I predict there will be disputes over what constitutes "incontrovertible evidence" and at least one butthurt sloperator will file a (completely meritless) "but muh free speech" lawsuit
https://www.404media.co/new-arxiv-rules-ai-generated-papers-ban/
A technology so transformative and compelling…
Fresh #ChatGPTLawyer
Shot: 'In a 2025 blog discussing the case, founder Marc Trent confirmed that the firm had “utilized our tech team to draft” the initial complaint. He boasted that the “evolved” firm uses “everything related to AI now,” suggesting that “even Meta can’t beat us”'
Chaser: 'a senior circuit judge for the [7th circuit], wrote that the three-judge panel agreed that “this is a relatively rare appeal in which sanctions appear to be appropriate.”'
Client appears to be a total scumbag, so I'm not really seeing much downside here: "[Nikko] D’Ambrosio legal fight started when a woman whom he briefly dated … blocked his number, and he persisted in sending a menacing text by using an alternate number … [the woman] posted a screenshot of the text in a thread where more than two dozen women started sharing photos of D’Ambrosio and criticizing him … D’Ambrosio failed to allege any concrete harm caused by the post"
In today's #AIIsGoingGreat (ht @jalefkowit*) NYT finds Steven Rosenbaum's book on AI "The Future of Truth" is packed with slop… His response? it "serves as a warning about the risks of A.I.-assisted research and verification, that is why I wrote the book. These A.I. errors do not, in fact, diminish the larger questions that the book raises about truth, trust and A.I. and its impact on society, democracy and editorial"
* https://mastodon.social/@jalefkowit@vmst.io/116602112873534348
"You can imagine, for example, how a question about black holes in space could lead to an interactive visual that brings the concept to life, Reid said, adding that users can then ask follow-up questions and see Google respond with brand-new visuals in real time" - Oh yeah, I'm sure the machine that has trouble counting the number of times a letter appears in words will accurately portray relativistic physics
https://techcrunch.com/2026/05/19/google-search-as-you-know-it-is-over/
Anyway, a big chunk of the world population having one of their primary information sources replaced by a blender full of BS will probably go great
https://techcrunch.com/2026/05/19/google-search-as-you-know-it-is-over/
Today's #AIIsGoingGreat (ht @gregeganSF*) is a great illustration of an LLM producing an analysis-shaped-thing. It sounds like the kind of result one could get and fits popular stereotypes, but as it turns out, has no basis in the data provided
https://kucharski.substack.com/p/real-signals-or-artificial-stereotypes
https://mastodon.social/@gregeganSF@mathstodon.xyz/116606390899220818
"Rosenbaum said he had recently asked an AI tool to extract his 'no changes, verbatim' speaker’s notes out of a slide deck so he could use them for an upcoming presentation. He was about to print those extracted notes when he realized that the LLM had actually rewritten his words despite his 'very clear instructions for the robot.'" - Guy managed to write a whole book* on AI and is surprised it made shit up?
* or most of it https://mastodon.social/@reedmideke/116604415316787515
"On the Diia web portal, the AI assistant generates income certificates. In the app, users can chat with an AI agent to obtain residence certificates for adults or children and pay traffic fines in minutes" 😬 Digitization is generally good, but it sure seems like there's a lot of ways that could go wrong if "AI" is substantively involved in those processes

When the world watches a country enduring unprecedented historical trials, the expectation is usually that every effort will be consumed by basic survival – that the machinery of government will be running on fumes just to hold itself together. In Ukraine, we are challenging that expectation by, in the midst of the largest armed conflict in Europe since World War II, not merely keeping our institutions afloat, but fundamentally reimagining what they are – making them more agile and more respons
Bet if we put this one up on the shelf to age, we'll end up with some primo vintage #AIIsGoingGreat: "Robinhood unveiled tools on Wednesday that let AI agents trade stocks and make purchases on users’ behalf"