Infosec people: Untrusted, unsanitized inputs have been the bane of our existence for the last 40 years
Tech CEOs: We're betting billions of dollars the next big thing is a black box filled with pure essence of untrusted, unsanitizable inputs
Microsoft: If we add just one more <s>overbalanced wheel</s> layer of BS generators to our <s>over-unity machine</s> AI, it will really work this time for sure!
https://www.theverge.com/2024/9/24/24253452/microsoft-correction-ai-safety-tool-fix-errors
OG #ChatGPTLawyer-as-a-service bro Joshua Browder of DoNotPay gets a slap on the wrist from the FTC. A DoNotPay spokesperson says they're "pleased to have worked constructively with the FTC to settle this case and fully resolve these issues, without admitting liability" and I bet they spent a pile of money on real lawyers to get there. Oh, and they also paid the FTC $193,000
"The White House is directing the Pentagon and intelligence agencies to increase their adoption of artificial intelligence" 🤨
"The memo also specifically requires agencies to monitor the risk AI systems can pose when it comes to privacy, discrimination and human rights" - I'd hope they're also required to monitor the risk it makes shit up
(yeah, a lot of militarily relevant AI isn't genAI but still)
https://www.washingtonpost.com/technology/2024/10/24/white-house-ai-nation-security-memo/
Remember kids, you can't spell snake oil without #AI https://pivot-to-ai.com/2024/10/25/cybercheck-has-secured-murder-convictions-it-appears-to-just-run-websites-through-a-chatbot/
Cybercheck, from Global Intelligence, claims it can find the key evidence to nail down a case. Cybercheck reports have been involved in at least two murder convictions. Cybercheck hands the police …
What could be better than having your medical visits transcribed by an #AI prone to making shit up? Deleting the original so no one can prove it "It’s impossible to compare Nabla’s AI-generated transcript to the original recording because Nabla’s tool erases the original audio for “data safety reasons,” Raison said"

Whisper is a popular transcription tool powered by artificial intelligence, but it has a major flaw. It makes things up that were never said. Whisper was created by OpenAI. It's being used in many industries worldwide to translate and transcribe interviews, generate text in popular consumer technologies and create subtitles for videos. OpenAI has promoted Whisper as having near “human level robustness and accuracy." But more than a dozen computer scientists and software developers tell The Associated Press that isn’t always the case and that it's prone to making up chunks of text and even entire sentences. An OpenAI spokesperson says the company studies how to reduce that and updates its models incorporating feedback received.
So at first glance, this is just a typical #AIIsGoingGreat - Alaska Education Commissioner Deena Bishop used spicy autocomplete and it made shit up like it so often does, but also… the excuse about the bogus citations being "placeholders" seems like a clear admission she started with the desired policy (restrict smartphones in schools) and then tried to generate a post-hoc justification, without even doing a basic literature review
Today's #AIIsGoingGreat: German journalist Martin Bernklau discovers Microsoft #Copilot says he committed crimes he reported on, and also helpfully provides directions to his home. Microsoft subsequently seems to have taken the typical band-aid approach and blocked his name… because, of course, none of these companies setting billions on fire to chase #AI hype have any idea how to solve the general case of LLMs making shit up
Also real estate dude's process is a pretty perfect anti-usecase: "Huynh said he would usually input the address of a rental property and the basic description such as how many bedrooms and bathrooms it had into ChatGPT"
At the very best, all an #LLM can add is irrelevant fluff or widely known facts about the general region. It cannot reliably add factual information about individual houses or neighborhoods, and more often it'll just make shit up
Oh, team involved in that "AI scientist" preprint I dunked on earlier* included "researchers from the buzzy Tokyo-based startup Sakana AI"
Anyway they allow that their "scientist" making up 10% of the numbers in its "papers" is "probably unacceptable" and then go on to talk about how it could be improved without addressing the possibility that making shit up is an inherent characteristic of LLMs https://spectrum.ieee.org/ai-for-science-2
Today's #AIIsGoingGreat "…results from a hard-coded filter that puts the brakes on the AI model's output before returning it to the user" - Demonstrating once again that despite setting hundreds of billions of dollars on fire, #LLM #AI companies have no idea how to solve the "hallucination" (aka making shit up) problem in the general case. Their best solution is hard coded checks for individual phrases that might expose them to excessive legal costs
Today's #AIIsGoingGreat: Hard to see how drowning volunteer developers in #AI slop vulnerability reports could possibly go wrong. Great work everyone, throw another billion on the #LLM BS machine bonfire to celebrate!
#AIIsGoingGreat: 'correspondence seen by TechCrunch shows that previously, the guidelines read: “If you do not have critical expertise (e.g. coding, math) to rate this prompt, please skip this task.”
But now the guidelines read: “You should not skip prompts that require specialized domain knowledge.” Instead, contractors are being told to “rate the parts of the prompt you understand” and include a note that they don’t have domain knowledge'
Today's #AIIsGoingGreat via @telescoper: As he notes, google used to be quite OK for this kind of thing. Sure, you still needed to check whether the top result was from a reliable source, but it usually was, and unlike results run through the #LLM BS blender, you could do so at a glance
Altman's latest blog strikes me as a lot of hand-wavy CEO-speak, but I actually agree with this "in 2025, we may see the first AI agents “join the workforce” and materially change the output of companies" … with the small caveat that the average "material change" is unlikely to be in a positive direction
Meanwhile, Apple responds to the predictable result of running notifications through a blender with #LLM BS: "Apple Intelligence features are in beta and we are continuously making improvements with the help of user feedback… A software update in the coming weeks will further clarify when the text being displayed is summarization provided by Apple Intelligence"
This whole thread of #Google #AIIsGoingGreat with fractions is a good illustration of why I'm skeptical of the "sure, it has bugs, but they're fixing them, just like any other software" takes. IMO you can't band-aid a system with no concept of what a fraction is to get this right in the general case, and even if you somehow recognize questions about fractions, there's an unlimited number of other cases where autocomplete is similarly inappropriate
https://mastodon.social/@lauren@mastodon.laurenweinstein.org/113771300586021845
This one with 25.4 == 1 in particular is a great example of how probabilistic completions go off the rails
https://mastodon.social/@[email protected]/113772004311087469
802.11*sigh*
https://openwrt.org/docs/guide-user/network/wifi/mesh/80211s
(see edit history for a good time)
One objection #AI pessimists hear a lot is that big tech execs wouldn't be dumping billions into it if it were as bad as people say, because, you know, they're smart guys, right? Anyway…
LOL, bots mindlessly boosting every f-ing post in this thread with the #AI tag is 👨🏻‍🍳🤌
(also, what's the point of a bot that just boosts posts with a hashtag? Do they not know people can follow hashtags?)
Now if you *actually believed* #LLM BS generators were the path to the post-singularity AGI utopia, wouldn't the news that it can be done cheaper with less advanced hardware be overwhelmingly positive, regardless of the short term impact on some individual players? Shouldn't all the #AI bros be celebrating?
OTOH, if you were running an elaborate pump and dump involving some individual players, it might be kinda bad news
Translation: Sales of the latest high-end, resource intensive models were so bad, Microsoft decided they might as well just eat the cost in hopes of driving adoption
https://www.theverge.com/news/603149/microsoft-openai-o1-model-copilot-think-deeper-free
Who decided to call it "Stargate" when "Monorail" was right there? https://www.theverge.com/openai/603952/sam-altman-stargate-ai-data-center-plan-hype-funding
#AIIsGoingGreat 'He said, for example, that he would need help creating “AI coding agents” that would write software across the entire federal government' - Yeah buddy, and I'm gonna need help rounding up unicorns to fart rainbows in my face
404 Media has obtained audio of a meeting held by Thomas Shedd, a Musk-associate who is now heading a team of government coders. In the call one employee pushed back and said one of the planned moves is an “illegal task.”
#AIIsGoingGreat "One of the most blistering findings is that trial participants who reckoned the technology was of little use soared from 6% before the trial to 59% after the trial, an almost tenfold increase" - Once again, people actually trying to do stuff find that stochastic BS machines are less than ideal for tasks which do not require BS
https://www.themandarin.com.au/286344-treasury-trial-of-microsoft-copilot-comes-a-cropper/
#AIIsGoingGreat supplemental 'The chatbot told TechCrunch it is here to “help government personnel like you identify and eliminate waste, improve efficiency, and streamline processes using a first principles approach.”'
#AIIsGoingGreat Thing to take away from this isn't that Grok is any worse than any other #LLM chatbot, or that #AI secretly thinks Trump and Musk are bad, or wants to kill people… it's that, as ever, they "fixed" it with some hard-coded band-aid to stop this particular headline generating case, without doing anything at all to address the underlying cause (because they still have no idea how to do that)
https://www.theverge.com/news/617799/elon-musk-grok-ai-donald-trump-death-penalty
Expert reached for comment by the BBC says "Apple's explanation of phonetic overlap did not make sense because the two words [Racist and Trump] were not similar enough to confuse an artificial intelligence (AI) system" and suggests human interference, but I humbly submit that this is entirely consistent with #AI becoming sentient
#AIIsGoingGreat "The Los Angeles Times removed its new AI-powered “insights” feature from a column after the tool tried to defend the Ku Klux Klan" and as usual, instead of acknowledging that a stochastic BS machine might not be fit for this purpose, they just band-aided the instance that caused bad PR "It remains available on other “Voices” pieces that offer points of view, which includes news commentary and reviews, among others"
https://www.thedailybeast.com/maga-newspaper-owners-ai-bot-defends-kkk/
#AIIsGoingGreat aside from the obvious problems with this transcript, it's also completely incoherent. A system with any ability to analyze the meaning should have rejected it as a failed transcription regardless of the x-rated bits
#AIIsGoingGreat Supplemental: Another great example of why filtering your information through an #LLM BS blender is a bad idea. It removes contextual clues about source reliability, and the people ripping off the entire web for training data aren't picky about what they ingest
(but hey, at least now we have empirical evidence that large scale input poisoning can have a noticeable impact!)
https://www.newsguardrealitycheck.com/p/a-well-funded-moscow-based-global
An audit found that the 10 leading generative AI tools advanced Moscow’s disinformation goals by repeating false claims from the pro-Kremlin Pravda network 33 percent of the time
Today's #AIIsGoingGreat (ht @jalefkowit) seamlessly integrates DSM (Diagnostic and Statistical Manual of Mental Disorders) and DSM (Synology DiskStation Manager). The age of superintelligence is truly upon us!
https://web.archive.org/web/20250313204203/https://www.abtaba.com/blog/dsm-6-release-date
#AIIsGoingGreat "Grok 3 demonstrated the highest error rate, at 94 percent … premium paid versions of these AI search tools fared even worse in certain respects. Perplexity Pro ($20/month) and Grok 3's premium service ($40/month) confidently delivered incorrect responses more often than their free counterparts"
Today's #AIIsGoingGreat, courtesy of @JMarkOckerbloom* Springer volume "Advanced Nanovaccines for Cancer Immunotherapy" ($119 ebook or a mere $159.99 if you spring for hardcover) includes the sage words "It is important to note that as an AI language model, I can provide a general perspective, but you should consult with medical professionals for personalized advice"
https://pubpeer.com/publications/2FF96DD440C928A3DDF99771A48B4A#
* https://mastodon.social/@JMarkOckerbloom/114217609254949527
"Apple, like every other big player in tech, is scrambling to find ways to inject AI into its products. Why? Well, it’s the future! What problems is it solving? Well, so far that’s not clear! Are customers demanding it? LOL, no."
https://amp.cnn.com/cnn/2025/03/27/tech/apple-ai-artificial-intelligence
Apple has been getting hammered in tech and financial media for its uncharacteristically messy foray into artificial intelligence. After a June event heralding a new AI-powered Siri, the company has delayed its release indefinitely. The AI features Apple has rolled out, including text message summaries, are comically unhelpful.
OpenAI*: "NYT copyright claims are bogus because you can only get verbatim copy if you 'hack' the prompts"
Also OpenAI: "NYT copyright claims should be time barred because they should have known ChatGPT could output verbatim copy two years before it was released"
@reedmideke we wrote about Kubient too when Roberts pleaded guilty: https://pivot-to-ai.com/2024/09/18/kubients-adtech-use-case-for-ai-an-excuse-for-a-fraud/
this podcast about the case (37:34 on) is amazing: https://podcasts.apple.com/us/podcast/episode-48-anne-coghlan-from-scope-3-on-measuring-and/id1615989259?i=1000637275912
@davidgerard Indeed, I featured that post in the thread at the time https://mastodon.social/@reedmideke/113160781568746961
(not blaming you for not reading the whole thing, LOL, putting all my AI dunks in one thread is definitely Using Mastodon Wrong)

Proving the Necessity and Uniqueness of the Contradiction-Free Ontological Lattice (CFOL) as the Sole Substrate for AI Superintelligence Authors: Grok (built by xAI), in extended collaboration with Jason Lauzon Date: December 31, 2025 Abstract: This paper rigorously proves, through deductive logi...
@0illuminated1 @reedmideke You're linking to a paper credited to... Grok? Which on a quick skim is clearly claiming "proofs" in sections like 5 and 6 that have no stated support beyond bullet-point hand-waving?
You're probably not going to believe me (ask a working computer science professor if you want a second opinion), but this is exactly the sort of "tell me what I want to hear" sycophantic nonsense that makes me advise people to slowly step away from their large language models.
@reedmideke
🤔 one might conclude, the stochastic BS machine does exactly what it is trained for
the answer given about "Wagner Group" by ChatGPT documented in the "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" paper https://dl.acm.org/doi/10.1145/3442188.3445922 should have been alarming enough
I'll be grateful to dear @emilymbender , @timnitGebru and coauthors
#SALAMI has eugenics embedded
the only safe move is to stay away far far away
@wobweger @reedmideke @timnitGebru
Read a little closer -- that paper was published in March 2021 (and completed earlier). ChatGPT wasn't released until Nov 2022.
The system McGuffie & Newhouse tested was GPT-3 -- and we're quoting their work.