Mastodawn

Reed Mideke Jan 18, 2024

"Now, one crucial disclosure to all this: I wasn't allowed to interact with the voice assistant myself. BMW's handlers did all the talking" yeah, I'm gonna go ahead and reserve judgement on the "solution" ¯\_(ツ)_/¯

Show thread

Reed Mideke Jan 18, 2024

The best* part of this piece is the content farmer who responded to a request for comment by bitching about how poorly his AI garbage content farm performs

* for suitably broad values etc.

https://www.404media.co/email/5dfba771-7226-48d5-8682-5185746868c4/?ref=daily-stories-newsletter

Garbage AI on Google News

404 Media reviewed multiple examples of AI rip-offs making their way into Google News. Google said it doesn't focus on how an article was produced—by an AI or human—opening the way for more AI-generated articles.

404 Media

Show thread

Reed Mideke Jan 20, 2024

I for one am *shocked* that "have an extremely confident bullshitter summarize my search results" was not the killer app Microsoft expected

https://arstechnica.com/ai/2024/01/report-microsofts-ai-infusion-hasnt-helped-bing-take-share-from-google/

Bing Search shows few, if any, signs of market share increase from AI features

Bing's US and worldwide market share is about the same as it has been for years.

Ars Technica

Show thread

Reed Mideke Jan 21, 2024

"Dean.Bot was the brainchild of Silicon Valley entrepreneurs Matt Krisiloff and Jed Somers, who had started a super PAC supporting Phillips" - Were these techbros so high on their own supply they thought a chatbot imitating their candidate was a good idea, or was it just a convenient way to funnel campaign funds into their pals pockets? ¯\_(ツ)_/¯
https://wapo.st/3ObSl0i

#GiftArticle #GiftLink

OpenAI suspends bot developer for presidential hopeful Dean Phillips

It’s the ChatGPT maker’s first known action against the use of its technology in a political campaign.

The Washington Post

Show thread

Reed Mideke Jan 22, 2024

Key comment from NewsGuard's McKenzie Sadeghi in this @willoremus piece "But sites that don’t catch the error messages are probably just the tip of the iceberg" - for every Amazon seller who's too lazy to even check if the item description is an error message, there's gotta be some substantial number who do

I'd still like to see a deeper look at why using #LLM #AI descriptions makes economic sense for these sellers

https://wapo.st/3vFAx7r

#GiftArticle #GiftLink

AI bots are everywhere now. These telltale words give them away.

In Amazon products, X posts and across the web, ChatGPT error messages have emerged as a sure sign that a piece of writing isn’t human.

The Washington Post

Show thread

Reed Mideke Jan 25, 2024

"Sure, I can keep Thesaurus.com open in a tab all the time, but it’s packed with banner ads and annoyingly slow. Having my GPT open is better: there are no ads, and I can scroll up to my previous queries" - Notably, this has nothing to do with GPT being "#AI", it's just the general shittiness of the ad-supported web. A good thesaurus app integrated with the author's editor would appear serve their use case about as well

https://www.theverge.com/24049623/chatgpt-openai-custom-gpt-store-assistants

I love my GPT, but I can’t find a use for anybody else’s

Custom GPTs let users make their own ChatGPT versions, but except for very specific use cases, it’s difficult to find a reason why anyone needs them.

The Verge

Show thread

Reed Mideke Jan 25, 2024

And it wouldn't even need to be free, they're paying for GPT and actual costs are likely subsidized by venture capital "Custom GPTs are a paid product that’s only available to users of ChatGPT Plus, ChatGPT Team, and ChatGPT Enterprise. For now, accessing custom GPTs through the GPT Store is free for paying subscribers… if I wasn’t already paying for ChatGPT Plus, I’d be happy to keep Googling alternative terms"

Show thread

Reed Mideke Feb 7, 2024

#LLM #AI hype and reality collide again
https://www.nature.com/articles/d41586-024-00349-5

#AIIsGoingGreat

‘Obviously ChatGPT’ — how reviewers accused me of scientific fraud

A journal reviewer accused Lizzie Wolkovich of using ChatGPT to write a manuscript. She hadn’t — but her paper was rejected anyway.

Show thread

Reed Mideke Feb 7, 2024

Also, from TFA "I quickly brainstormed how I might prove my case. Because I write in plain-text files [LaTeX] that I track using the version-control system Git, I could show my text change history on GitHub (with commit messages including “finally writing!” and “Another 25 mins of writing progress!”" - excellent - "Maybe I could ask ChatGPT itself if it thought it had written my paper" - Oh no, can we please get the word out LLMs BS about this just like everything else https://www.nature.com/articles/d41586-024-00349-5

‘Obviously ChatGPT’ — how reviewers accused me of scientific fraud

A journal reviewer accused Lizzie Wolkovich of using ChatGPT to write a manuscript. She hadn’t — but her paper was rejected anyway.

Show thread

Reed Mideke Feb 15, 2024

Yes, if you choose to provide an #AI BS machine as a support option on your website, you may in fact be liable for the BS answers it gives to your customers

(also, if you're a multi-billion dollar company, you may avoid reputational harm by not trying to screw a person out of $650 for a ticket to their grandma's funeral ¯\_(ツ)_/¯)
https://bc.ctvnews.ca/air-canada-s-chatbot-gave-a-b-c-man-the-wrong-information-now-the-airline-has-to-pay-for-the-mistake-1.6769454

#AIIsGoingGreat

Air Canada's chatbot gave a B.C. man the wrong information. Now, the airline has to pay for the mistake

Air Canada has been ordered to compensate a B.C. man because its chatbot gave him inaccurate information.

British Columbia

Show thread

Reed Mideke Feb 23, 2024

Seemingly endless parade of #ChatGPTLawyer incidents (HT @0xabad1dea for this one) really goes to show how the #AI hype is landing with the general public, despite disclaimers and cautionary tales.

Lawyers being (at least in theory) a highly educated group who know their careers depend on not putting completely made up nonsense in court filings should be less susceptible than the average person on the street, yet here we are…

https://www.lawnext.com/2024/02/not-again-two-more-cases-just-this-week-of-hallucinated-citations-in-court-filings-leading-to-sanctions.html

#AIIsGoingGreat

Not Again! Two More Cases, Just this Week, of Hallucinated Citations in Court Filings Leading to Sanctions

For all the discussion of how generative AI will impact the legal profession, maybe one answer is that it will weed out the lazy and incompetent lawyers. By now, in the wake of several cases in which...

LawSites

Show thread

Reed Mideke Feb 23, 2024

Admittedly one of those was pro-se with an iffy story about getting it from a lawyer, but the other was a real firm with multiple people involved ¯\_(ツ)_/¯

Show thread

Reed Mideke Feb 24, 2024

Another day, another #ChatGPTLawyer

"The legal eagles at New York-based Cuddy Law tried using OpenAI's chatbot, despite its penchant for lying and spouting nonsense, to help justify their hefty fees for a recently won trial"

The Court "It suffices to say that the Cuddy Law Firm's invocation of ChatGPT as support for its aggressive fee bid is utterly and unusually unpersuasive"

https://www.theregister.com/2024/02/24/chatgpt_cuddy_legal_fees/

#AIIsGoingGreat

Judge slaps down law firm using ChatGPT to justify six-figure trial fee

Use of AI to calculate legal bill 'utterly and unusually unpersuasive'

The Register

Show thread

Reed Mideke Feb 28, 2024

IANAL, but whatever the merit of the other arguments "you only found the verbatim copies of your IP contained in our product because you hacked it" doesn't seem like a very compelling defense https://arstechnica.com/tech-policy/2024/02/openai-accuses-nyt-of-hacking-chatgpt-to-set-up-copyright-suit/

OpenAI accuses NYT of hacking ChatGPT to set up copyright suit

OpenAI “bizarrely” mischaracterizes hacking, NYT lawyer says.

Ars Technica

Show thread

Reed Mideke Feb 28, 2024

So my take on this is Wendy's execs decided "we need an #AI strategy!" and for reasons that remain unclear, it was somehow not immediately shot down with "Sir, this is a Wendy's, we make burgers we don't need a fuckin AI strategy"
https://www.theguardian.com/food/2024/feb/27/wendys-dynamic-surge-pricing

How much is that Frosty? Wendy’s to trial Uber-like surge pricing

Fast-food chain’s CEO announced the plan – which will utilize ‘AI-enabled menu changes’ and suggestive selling – in an earnings call

The Guardian

Show thread

Reed Mideke Mar 2, 2024

"Amazon has sought to stem the tide [of #AI generated schlock books] by limiting self-publishers to three books per day" - Bruh, I know you don't want to deny the starving author toiling away on the next Great American Novel but I think we can set the bar a bit higher than that

https://wapo.st/3UVeYdR

#AIIsGoingGreat #GiftArticle #GiftLink

Tech writer Kara Swisher has a new book. Enter the AI-generated scams.

On Amazon, new books such as Swisher’s memoir now routinely vie with imitators in search results. Some authors are fed up.

The Washington Post

Show thread

Reed Mideke Mar 2, 2024

Like start with an initial limit of one per week and have some kind of reputation threshold. If real people keep coming back to buy your dinosaur erotica or whatever, great, cap lifted, crank out as many as you can, but if you get caught impersonating or listing complete garbage, your account is nuked and you start over

Yeah, there'd be problems with straw buyers and review bombing competitors but it seems like the bar wouldn't have to be very high to make the absolute crap unprofitable

Show thread

Reed Mideke Mar 6, 2024

Inventor of bed shitting machine shocked to discover mountain of turds in own bed https://arstechnica.com/gadgets/2024/03/google-wants-to-close-pandoras-box-fight-ai-powered-search-spam/

#AIIsGoingGreat

Google now wants to limit the AI-powered search spam it helped create

Ranking update targets sites "created for search engines instead of people."

Ars Technica

Show thread

Reed Mideke Mar 7, 2024

WaPo has some great reporters covering the #AI beat. They also inexplicably pay Josh Tyrangiel to vomit up idiotic drivel like this

(it's also amusing they use javascript when they A/B test headlines, so sometimes it switches between the first and second one)

https://www.washingtonpost.com/opinions/2024/03/06/artificial-intelligence-state-of-the-union/

Let AI remake the whole U.S. government (oh, and save the country)

Thanks to AI, Operation Warp Speed was a rare triumph for our federal bureaucracy. Now, it can help us blaze a new path to the shining city on a hill.

The Washington Post

Show thread

Reed Mideke Mar 7, 2024

I ain't gonna waste a gift article on that shit unless someone REALLY wants it but here's a taste after you get past the Palantir hagiography "LLMs can provide better service and responsiveness for many day-to-day interactions between citizens and various agencies. They’re not just cheaper, they’re also faster, and, when trained right, less prone to error or misinterpretation"

Show thread

Reed Mideke Mar 7, 2024

"Some teachers are now using ChatGPT to grade papers"

Seems like fairness would require also allowing them to grade using a ouija board or goat entrails

https://arstechnica.com/information-technology/2024/03/some-teachers-are-now-using-chatgpt-to-grade-papers/

Some teachers are now using ChatGPT to grade papers

New AI tools aim to help with grading, lesson plans—but may have serious drawbacks.

Ars Technica

Show thread

Reed Mideke Mar 15, 2024

Today's #AIIsGoingGreat (HT @ct_bergstrom): Nothing to see here, just a paper in a medical journal which says "In summary, the management of bilateral iatrogenic I'm very sorry, but I don't have access to real-time information or patient-specific data, as I am an AI language model"

https://www.sciencedirect.com/science/article/pii/S1930043324001298

#AI #LLM

Show thread

Reed Mideke Mar 18, 2024

Today's #AIIsGoingGreat continues on the theme of the previous one (via https://twitter.com/wyatt_privilege/status/1769541081006244102)

https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=%22as+an+AI+language+model%22+-LLM+-chatGPT+-artificial&btnG=

warrior cop (@wyatt_privilege) on X

lol https://t.co/8K84UamNGa

X (formerly Twitter)

Show thread

Reed Mideke Mar 18, 2024

Should we expect better from a platform previously noted for indexing lunch menus? ¯\_(ツ)_/¯ https://twitter.com/reedmideke/status/1252450342316339207

Reed Mideke (@reedmideke) on X

@AlexanderRKlotz @seanmcarroll Salad et al picked up 5 citations for this one

X (formerly Twitter)

Show thread

Reed Mideke Mar 19, 2024

Another day, another credulous #AI boosting WaPo opinion piece

"AI could narrow the opportunity gap by helping lower-ranked workers take on decision-making tasks currently reserved for the dominant credentialed elites … Generative AI could take this further, allowing nurses and medical technicians to diagnose, prescribe courses of treatment and channel patients to specialized care"

[citation fucking needed]

https://www.washingtonpost.com/opinions/2024/03/19/artificial-intelligence-workers-regulation-musk/

AI could help ending the dominance of the credentialed classes

The idea that technology inevitably transforms society for the worse stems from a misunderstanding about how technologies become embedded in an economy.

The Washington Post

Show thread

Reed Mideke Mar 19, 2024

The premise is bizarre. What exactly are the non-experts doing when they "take on decision-making tasks" in this scenario? One of the big problems with current #LLM "AI" is you need subject matter expertise to tell when they are bullshitting…

Show thread

Reed Mideke Mar 20, 2024

Somewhat surprised Cohen's #ChatGPTLawyer escapade didn't result in sanctions for him or his lawyers, though they do seem avoided the sort of cover-up attempts that doomed some of the others
https://arstechnica.com/tech-policy/2024/03/michael-cohen-and-lawyer-avoid-sanctions-for-citing-fake-cases-invented-by-ai/?utm_brand=arstechnica&utm_social-type=owned&utm_source=mastodon&utm_medium=social

Michael Cohen loses court motion after lawyer cited AI-invented cases

No punishment, but judge rejects Cohen motion to end his supervised release.

Ars Technica

Show thread

Reed Mideke Mar 22, 2024

Epic Zitron rant "Sam Altman desperately needs you to believe that generative AI will be essential, inevitable and intractable, because if you don't, you'll suddenly realize that trillions of dollars of market capitalization and revenue are being blown on something remarkably mediocre" https://www.wheresyoured.at/peakai/

Have We Reached Peak AI?

Last week, the Wall Street Journal published a 10-minute-long interview with OpenAI CTO Mira Murati, with journalist Joanna Stern asking a series of thoughtful yet straightforward questions that Murati failed to satisfactorily answer. When asked about what data was used to train Sora, OpenAI's app for generating video with AI,

Ed Zitron's Where's Your Ed At

Show thread

Reed Mideke Mar 29, 2024

Can we fucking not? "In a 2019 War on the Rocks article, “America Needs a ‘Dead Hand’,” we proposed the development of an artificial intelligence-enabled nuclear command, control, and communications system to partially address this concern… We can only conclude that America needs a dead hand system more than ever" https://warontherocks.com/2024/03/america-needs-a-dead-hand-more-than-ever/

America Needs a Dead Hand More than Ever - War on the Rocks

In the minutes after a launch detection or nuclear detonation, would America’s nuclear command, control, and communications system enable the president to

War on the Rocks

Show thread

Reed Mideke Mar 29, 2024

The authors offer a lot of vague-to-meaningless handwaving "All forms of artificial intelligence are premised on mathematical algorithms, which are defined as “a set of instructions to be followed in calculations or other operations.” Essentially, algorithms are programming that tells the model how to learn on its own"

Uh… OK?

Show thread

Reed Mideke Mar 29, 2024

"America is no stranger to “fail-fatal” systems either"

Uh yeah, but *some* of us poor simple minded bleeding heart peaceniks may consider "fail-fatal for the entire fucking planet" to be entirely different class of system which raises some unique concerns

Show thread

Reed Mideke Mar 29, 2024

"Keep in mind, where artificial intelligence tools are embedded in a specific system, each function is performed by multiple algorithms of differing design that must all agree on their assessment for the data to be transmitted forward. If there is disagreement, human interaction is required"
Well as long as long as both ChatGPT *and* Claude have to sign off on the global thermonuclear war, it's hard to see how anything could go wrong

Show thread

Reed Mideke Mar 29, 2024

I don't think these guys have much chance of gaining traction in the US, but it would be unfortunate if other nuclear states decided they were at risk of an AI dead hand gap

Show thread

Reed Mideke Mar 30, 2024

Today's #AIIsGoingGreat brought to you by #NYC, who deployed spicy autocomplete to provide advice "on topics such as compliance with codes and regulations, available business incentives, and best practices to avoid violations and fines"

(spoiler: one great way to avoid violations and fines is to not get your legal advice from spicy autocomplete)

https://themarkup.org/news/2024/03/29/nycs-ai-chatbot-tells-businesses-to-break-the-law

NYC’s AI Chatbot Tells Businesses to Break the Law – The Markup

The Microsoft-powered bot says bosses can take workers’ tips and that landlords can discriminate based on source of income

Show thread

Reed Mideke Mar 30, 2024

One potentially informative thing reporters following up on that #NYC #AI #Chatbot story could do is #FOIA (or whatever the NY equivalent is) communications related to the acquisition and deployment. Who pushed for this in the first place? What did #Microsoft promise? What sort of quality / acceptance testing was done? Did anyone, anywhere along the line raise concerns that it would give out bad, potentially illegal advice?

Show thread

Reed Mideke Mar 30, 2024

I'd be pretty surprised if there isn't an email chain somewhere with a technical person going "WTF are you even thinking"

Show thread

Reed Mideke Mar 30, 2024

Bonus #AIIsGoingGreat "Your phone now needs more than 8 GB of RAM to run autocomplete" (and presumably, battery cost somewhat on a par with heavy GPU rendering) https://arstechnica.com/gadgets/2024/03/google-says-the-pixel-8-will-get-its-new-ai-model-but-ram-usage-is-a-concern/

Google says running AI models on phones is a huge RAM hog

Google wants AI models to be loaded 24/7, so 8GB of RAM might not be enough.

Ars Technica

Show thread

Reed Mideke Apr 1, 2024

Today's #AIIsGoingGreat (HT @pluralistic https://mastodon.social/@pluralistic@mamot.fr/112196496077034192)

Tired: Typo squatting
Wired: Hallucination squatting

https://www.theregister.com/2024/03/28/ai_bots_hallucinate_software_packages/

#AI #LLM

AI hallucinates software packages and devs download them – even if potentially poisoned with malware

Simply look out for libraries imagined by ML and make them real, with actual malicious code. No wait, don't do that

The Register

Show thread

Reed Mideke Apr 2, 2024

Seems like you could put your thumb on scale for which (non existent) libraries show up with #LLM training set poisoning attacks (previously https://mastodon.social/@reedmideke/110850376856613599)

Set up a site that, when it detects known AI scrapers, serves up code or documentation that references a non-existent library, along text associating with whatever kind of code and industry you want to target

OTOH, this would leave much more of trail than just observing bogus ones that show up naturally

Show thread

Reed Mideke Apr 2, 2024

In which the gang discovers Amazon Fresh "Just walk out" checkout was powered by Type II #AI https://gizmodo.com/amazon-reportedly-ditches-just-walk-out-grocery-stores-1851381116

Amazon Ditches 'Just Walk Out' Checkouts at Its Grocery Stores

Amazon Fresh is moving away from a feature of its grocery stores where customers could skip checkout altogether.

Gizmodo

Show thread

Reed Mideke

"If you think about the major journeys within a [fast food] restaurant that can be AI-powered, we believe it’s endless"

Sir this a fucking Wendy's and people come here to buy a fucking burger, not "take major journeys" https://arstechnica.com/information-technology/2024/04/ai-hype-invades-taco-bell-and-pizza-hut/

AI hype invades Taco Bell and Pizza Hut

Everything is suddenly "AI" in corporate food marketing, and we may have hit peak buzz.

Ars Technica

Show thread

Reed Mideke Apr 4, 2024

Also uh, can't imagine anything that could possibly go wrong with this: "This enhancement would allow team members to ask the [AI chatbot] app questions like "How should I set this oven temperature?" directly instead of asking a human being"

Show thread

Reed Mideke Apr 5, 2024

Some scientists theorized that after over 30 years of continuous development, it was physically impossible to make Adobe Reader worse, but once again, Adobe engineers have found a way

Show thread

Reed Mideke Apr 5, 2024

I actually kinda wanted to see it summarize the spurious scholar (https://tylervigen.com/spurious-scholar) paper I was reading when it popped up, but… not enough to log in

Spurious Scholar

Spurious research papers based on real correlations with p < 0.05, generated by a large language model.

Show thread

Reed Mideke Apr 5, 2024

Today's #AIIsGoingGreat brought to you by #Ivanti: 'Among the details is the company's promise to improve search abilities in Ivanti's security resources and documentation portal, "powered by AI," and an "Interactive Voice Response system" … also "AI-powered"'

Ah yes, hard to think of any better way to fix a pattern of catastrophic security failures than *checks notes* filtering highly technical, security critical information through a hyper-confident BS machine

https://arstechnica.com/security/2024/04/ivanti-following-years-of-critical-vpn-exploits-pledges-new-era-of-security/

Ivanti CEO pledges to “fundamentally transform” its hard-hit security model

Part of the reset involves AI-powered documentation search and call routing.

Ars Technica

Show thread

Reed Mideke Apr 6, 2024

Today's #AIIsGoingGreat brought to you by the artist formerly known as Twitter https://mashable.com/article/elon-musk-x-twitter-ai-chatbot-grok-fake-news-trending-explore

X's AI chatbot Grok made up a fake trending headline about Iran attacking Israel

The AI-generated false headline was promoted by X in its official trending news section.

Mashable

Show thread

Reed Mideke Apr 10, 2024

Here's a helpful #AI chatbot to assist you with thing that requires domain specific knowledge and has significant real-world consequences for errors… oh, by the way, you'll need to already have that same domain specific knowledge to confirm whether the answers are correct or complete BS

Who thinks this is a good idea?🤔

#AIIsGoingGreat

Show thread

Reed Mideke Apr 10, 2024

Texas Education Agency talks a lot about the supposed safeguards in the don't-call-it-#AI "automated scoring engine" but no mention of any testing to determine whether it is fit for purpose (they do mention training it on 3K manually scored questions). Maybe they did and it just didn't get mentioned, but seems like a very good #FOIA target
https://www.texastribune.org/2024/04/09/staar-artificial-intelligence-computer-grading-texas/

Texas will use computers to grade written answers on this year’s STAAR tests

The state will save more than $15 million by using technology similar to ChatGPT to give initial scores, reducing the number of human graders needed. The decision caught some educators by surprise.

The Texas Tribune

Show thread

Reed Mideke Apr 30, 2024

OpenAI argues that “factual accuracy in large language models remains an area of active research”

…in the sense that Bigfoot and Nessie remain areas of active research?

https://noyb.eu/en/chatgpt-provides-false-information-about-people-and-openai-cant-correct-it

ChatGPT provides false information about people, and OpenAI can’t correct it

noyb today filed a complaint against the ChatGPT maker OpenAI with the Austrian DPA

noyb.eu

Show thread

Reed Mideke May 8, 2024

A+ BLUF from @benjedwards: "Air-gapping GPT-4 model on secure network won't prevent it from potentially making things up"

https://arstechnica.com/information-technology/2024/05/microsoft-launches-ai-chatbot-for-spies/

Microsoft launches AI chatbot for spies

Air-gapping GPT-4 model on secure network won't prevent it from potentially making things up.

Ars Technica

Show thread

Reed Mideke May 9, 2024

Oh hey, remember #AdVon, the definitely-not-an-ai-company caught publishing #AI dreck in Sports Illustrated? (previously https://mastodon.social/@reedmideke/111486230567895424)
Futurism has another update, and it's a doozy
https://futurism.com/advon-ai-content

Meet AdVon, the AI-Powered Content Monster Infecting the Media Industry

Our investigation into AdVon Commerce, the AI contractor at the heart of scandals at USA Today and Sports Illustrated.

Futurism

Show thread

Reed Mideke May 24, 2024

Google+ comparison is very apt, but also that opening example really hits the problem I've been yelling about since the #LLM hype cycle started: The fundamental mismatch between a system that randomly makes shit up and the uses it's being hyped for https://www.computerworld.com/article/2117752/google-gemini-ai.html

Gemini is the new Google+

Google's cutting-edge AI technology has a familiar connection to the past — and in this case, that isn't a good thing.

Computerworld

Show thread

Reed Mideke May 24, 2024

This, right here: "Erm, right. So you can rely on these systems for information - but then you need to go search somewhere else and see if they’re making something up? In that case, wouldn’t it be faster and more effective to, I don’t know, simply look it up yourself in the first place?"

Show thread

Reed Mideke May 25, 2024

Google's current #AIIsGoingGreat moment really checks all the bad #AI boxes. Starting with the dismissive "examples we've seen are generally very uncommon queries and aren’t representative of most people’s experiences" - Sure *sometimes* the answers are complete BS and possibly dangerous, but what about the times they aren't? Checkmate, Luddites!

https://arstechnica.com/information-technology/2024/05/googles-ai-overview-can-give-false-misleading-and-dangerous-answers/?utm_brand=arstechnica&utm_social-type=owned&utm_source=mastodon&utm_medium=social

Google’s “AI Overview” can give false, misleading, and dangerous answers

From glue-on-pizza recipes to recommending "blinker fluid," Google's AI sourcing needs work.

Ars Technica

Show thread

Reed Mideke May 25, 2024

And as always, they insist they are fixing it: "We conducted extensive testing before launching this new experience and will use these isolated examples as we continue to refine our systems overall" - with *zero* indication they have a technical or even theoretical path to solving the general problem that #LLMs don't have any concept of truth

Show thread

Reed Mideke May 25, 2024

And then, the whole thing is made worse by positioning it as a replacement for search, in the top spot with google branding. The "eat rocks" article ranks high in the regular organic search results the same query, but users have a lot more clues that it was a joke

Show thread

Reed Mideke May 25, 2024

Why am I so sure #AI companies have no serious technical or theoretical solution to the underlying problem that #LLMs have no concept of truth? The fact their approach so far is manually band-aiding results that go viral or put them in legal jeopardy is a pretty big hint! https://www.theverge.com/2024/5/24/24164119/google-ai-overview-mistakes-search-race-openai

Google scrambles to manually remove weird AI answers in search

The company confirmed it is ‘taking swift action’ to remove some of the AI tool’s bizarre responses.

The Verge

Show thread

Reed Mideke May 25, 2024

👉 "Gary Marcus, an AI expert and an emeritus professor of neural science at New York University, told The Verge that a lot of AI companies are “selling dreams” that this tech will go from 80 percent correct to 100 percent. Achieving the initial 80 percent is relatively straightforward since it involves approximating a large amount of human data, Marcus said, but the final 20 percent is extremely challenging. In fact, Marcus thinks that last 20 percent might be the hardest thing of all"

Show thread

Reed Mideke May 25, 2024

What they're doing now seems like selling a calculator, and when a screenshot of it saying 2+2=5 goes viral on social media, they add a statement like "if x=2 and y=2 return 4" at the top of the program and say "see, we fixed it!"

Show thread

Reed Mideke May 25, 2024

Straight from Google CEO Sundar Pichai's mouth: 'these "hallucinations" are an "inherent feature" of AI large language models (LLM), which is what drives AI Overviews, and this feature "is still an unsolved problem"'

but they're gonna keep band-aiding until it's good, promise! ""Are we making progress? Yes, we are … We have definitely made progress when we look at metrics on factuality year on year. We are all making it better, but it’s not solved""

https://futurism.com/the-byte/ceo-google-ai-hallucinations

#AIIsGoingGreat

CEO of Google Says It Has No Solution for Its AI Providing Wildly Incorrect Information

Google CEO Sundar Pichai says problems with its AI can't be solved because hallucinations are an inherent problem in these AI tools.

Futurism

Show thread

Reed Mideke May 29, 2024

Just occurred to me Mitchell and Webb predicted our current pizza-gluing, gasoline spaghetti #AIIsGoingGreat moment 16 years ago https://www.youtube.com/watch?v=B_m17HK97M8

Mitchell & Webb: Cheesoid

YouTube