ChatGPT is full of sensitive private information and spits out verbatim text from CNN, Goodreads, WordPress blogs, fandom wikis, Terms of Service agreements, Stack Overflow source code, Wikipedia pages, news blogs, random internet comments, and much more.
"Chat alignment hides memorization" - Note *hides*, not *prevents*
As the authors also note, OpenAI "fixed" this by preventing the particular problematic prompt, but "Patching an exploit != Fixing the underlying vulnerability"
https://not-just-memorization.github.io/extracting-training-data-from-chatgpt.html
Can't be certain without more specifics but color me extremely skeptical that "#AI" producing thousands of targets is doing much more the laundering responsibility
This hilarious in its own right, but it's also a great illustration of how people get tripped up by #LLM #AI bullshitting: One would expect an "AI" to at least know which brand AI it is, but of course, these LLMs don't actually know anything
Also the classic AI vendor response of promising to fix this particular case without any hint of acknowledging the underlying problem
Begging news orgs to stop reporting #AI company pitch decks as fact "Ashley [the bot] analyzes voters' profiles to tailor conversations around their key issues. Unlike a human, Ashley always shows up for the job, has perfect recall of all of Daniels' positions"
"…is now armed with another way to understand voters better, reach out in different languages (Ashley is fluent in over 20)"
Another article on reported Israeli AI targeting greatly hindered by the lack of any specifics (what kinds of intelligence, what kinds of targets, for starters). Not a knock on NPR, obviously little is public
It certainly *sounds* like some of the horrifically bad systems we've seen promoted in other contexts, and the results certainly don't appear to contradict that, but hard to say much beyond that…
Key point IMO in the @willoremus #AI story, after noting Microsoft "fixed" some of the problematic results, one of the researchers says "The problem is systemic, and they do not have very good tools to fix it" - You can't bandaid your way from a BS machine with no concept of truth into a reliable source of information, so the fact that biggest players in the industry keep bandaiding should call the entire #LLM hype cycle into question
Man, link in that post I boosted from @Chloeg (https://mastodon.art/@Chloeg/111620626442103902) is a perfect example of #LLM #AI enshittification. Get a domain, put up a wordpress site with AI generated glop on a some popular topic, run as many garbage ads as possible. Sure it's the information equivalent of dumping raw sewage in the local river, but none of it is illegal or a serious violation of any TOS, and overhead must be extremely low
Archive link https://web.archive.org/web/20231222025203/https://www.learnancientrome.com/did-ancient-rome-have-windows/
Ok so im reading this article on Roman Glazing and slowly I begin to realise that it was written by an AI. Witness the section on “What existed before windows” where it suddenly starts talking about MS-DOS…. https://www.learnancientrome.com/did-ancient-rome-have-windows/
WaPo has done some good #AI reporting, but this opinion piece from Josh Tyrangiel ain't it…
"The most obvious thing is that they’re not hallucinations at all"
Good start…
"Just bugs specific to the world’s most complicated software."
Uh no, literally the opposite of that that, FFS 😬
https://www.washingtonpost.com/opinions/2023/12/27/artificial-intelligence-hallucinations/
"CEOs say generative AI will result in job cuts in 2024"
Will this include said CEOs when their hamfisted attempts to use spicy autocomplete for "banking, insurance, and logistics" predictably go off the rails, or nah? 🤔
https://arstechnica.com/ai/2024/01/ceos-say-generative-ai-will-result-in-job-cuts-in-2024/
"BMW had a compelling solution to the [#LLM #AI bullshitting] problem: Take the power of a large language model, like Amazon's Alexa LLM, but only allow it to cite information from internal BMW documentation about the car" 🤨
Surely this means it'll bullshit subtly about stuff in the manual, not that it won't bullshit?
The best* part of this piece is the content farmer who responded to a request for comment by bitching about how poorly his AI garbage content farm performs
* for suitably broad values etc.
https://www.404media.co/email/5dfba771-7226-48d5-8682-5185746868c4/?ref=daily-stories-newsletter
I for one am *shocked* that "have an extremely confident bullshitter summarize my search results" was not the killer app Microsoft expected
"Dean.Bot was the brainchild of Silicon Valley entrepreneurs Matt Krisiloff and Jed Somers, who had started a super PAC supporting Phillips" - Were these techbros so high on their own supply they thought a chatbot imitating their candidate was a good idea, or was it just a convenient way to funnel campaign funds into their pals pockets? ¯\_(ツ)_/¯
https://wapo.st/3ObSl0i
Key comment from NewsGuard's McKenzie Sadeghi in this @willoremus piece "But sites that don’t catch the error messages are probably just the tip of the iceberg" - for every Amazon seller who's too lazy to even check if the item description is an error message, there's gotta be some substantial number who do
I'd still like to see a deeper look at why using #LLM #AI descriptions makes economic sense for these sellers
"Sure, I can keep Thesaurus.com open in a tab all the time, but it’s packed with banner ads and annoyingly slow. Having my GPT open is better: there are no ads, and I can scroll up to my previous queries" - Notably, this has nothing to do with GPT being "#AI", it's just the general shittiness of the ad-supported web. A good thesaurus app integrated with the author's editor would appear serve their use case about as well
https://www.theverge.com/24049623/chatgpt-openai-custom-gpt-store-assistants
#LLM #AI hype and reality collide again
https://www.nature.com/articles/d41586-024-00349-5
Yes, if you choose to provide an #AI BS machine as a support option on your website, you may in fact be liable for the BS answers it gives to your customers
(also, if you're a multi-billion dollar company, you may avoid reputational harm by not trying to screw a person out of $650 for a ticket to their grandma's funeral ¯\_(ツ)_/¯)
https://bc.ctvnews.ca/air-canada-s-chatbot-gave-a-b-c-man-the-wrong-information-now-the-airline-has-to-pay-for-the-mistake-1.6769454
Seemingly endless parade of #ChatGPTLawyer incidents (HT @0xabad1dea for this one) really goes to show how the #AI hype is landing with the general public, despite disclaimers and cautionary tales.
Lawyers being (at least in theory) a highly educated group who know their careers depend on not putting completely made up nonsense in court filings should be less susceptible than the average person on the street, yet here we are…
For all the discussion of how generative AI will impact the legal profession, maybe one answer is that it will weed out the lazy and incompetent lawyers. By now, in the wake of several cases in which...
Another day, another #ChatGPTLawyer
"The legal eagles at New York-based Cuddy Law tried using OpenAI's chatbot, despite its penchant for lying and spouting nonsense, to help justify their hefty fees for a recently won trial"
The Court "It suffices to say that the Cuddy Law Firm's invocation of ChatGPT as support for its aggressive fee bid is utterly and unusually unpersuasive"
https://www.theregister.com/2024/02/24/chatgpt_cuddy_legal_fees/
"Amazon has sought to stem the tide [of #AI generated schlock books] by limiting self-publishers to three books per day" - Bruh, I know you don't want to deny the starving author toiling away on the next Great American Novel but I think we can set the bar a bit higher than that
Like start with an initial limit of one per week and have some kind of reputation threshold. If real people keep coming back to buy your dinosaur erotica or whatever, great, cap lifted, crank out as many as you can, but if you get caught impersonating or listing complete garbage, your account is nuked and you start over
Yeah, there'd be problems with straw buyers and review bombing competitors but it seems like the bar wouldn't have to be very high to make the absolute crap unprofitable
Inventor of bed shitting machine shocked to discover mountain of turds in own bed https://arstechnica.com/gadgets/2024/03/google-wants-to-close-pandoras-box-fight-ai-powered-search-spam/
WaPo has some great reporters covering the #AI beat. They also inexplicably pay Josh Tyrangiel to vomit up idiotic drivel like this
(it's also amusing they use javascript when they A/B test headlines, so sometimes it switches between the first and second one)
https://www.washingtonpost.com/opinions/2024/03/06/artificial-intelligence-state-of-the-union/
"Some teachers are now using ChatGPT to grade papers"
Seems like fairness would require also allowing them to grade using a ouija board or goat entrails
Today's #AIIsGoingGreat (HT @ct_bergstrom): Nothing to see here, just a paper in a medical journal which says "In summary, the management of bilateral iatrogenic I'm very sorry, but I don't have access to real-time information or patient-specific data, as I am an AI language model"
https://www.sciencedirect.com/science/article/pii/S1930043324001298
Today's #AIIsGoingGreat continues on the theme of the previous one (via https://twitter.com/wyatt_privilege/status/1769541081006244102)
Another day, another credulous #AI boosting WaPo opinion piece
"AI could narrow the opportunity gap by helping lower-ranked workers take on decision-making tasks currently reserved for the dominant credentialed elites … Generative AI could take this further, allowing nurses and medical technicians to diagnose, prescribe courses of treatment and channel patients to specialized care"
[citation fucking needed]
https://www.washingtonpost.com/opinions/2024/03/19/artificial-intelligence-workers-regulation-musk/
Last week, the Wall Street Journal published a 10-minute-long interview with OpenAI CTO Mira Murati, with journalist Joanna Stern asking a series of thoughtful yet straightforward questions that Murati failed to satisfactorily answer. When asked about what data was used to train Sora, OpenAI's app for generating video with AI,
The authors offer a lot of vague-to-meaningless handwaving "All forms of artificial intelligence are premised on mathematical algorithms, which are defined as “a set of instructions to be followed in calculations or other operations.” Essentially, algorithms are programming that tells the model how to learn on its own"
Uh… OK?
"America is no stranger to “fail-fatal” systems either"
Uh yeah, but *some* of us poor simple minded bleeding heart peaceniks may consider "fail-fatal for the entire fucking planet" to be entirely different class of system which raises some unique concerns
Today's #AIIsGoingGreat brought to you by #NYC, who deployed spicy autocomplete to provide advice "on topics such as compliance with codes and regulations, available business incentives, and best practices to avoid violations and fines"
(spoiler: one great way to avoid violations and fines is to not get your legal advice from spicy autocomplete)
https://themarkup.org/news/2024/03/29/nycs-ai-chatbot-tells-businesses-to-break-the-law
Today's #AIIsGoingGreat (HT @pluralistic https://mastodon.social/@pluralistic@mamot.fr/112196496077034192)
Tired: Typo squatting
Wired: Hallucination squatting
https://www.theregister.com/2024/03/28/ai_bots_hallucinate_software_packages/
Seems like you could put your thumb on scale for which (non existent) libraries show up with #LLM training set poisoning attacks (previously https://mastodon.social/@reedmideke/110850376856613599)
Set up a site that, when it detects known AI scrapers, serves up code or documentation that references a non-existent library, along text associating with whatever kind of code and industry you want to target
OTOH, this would leave much more of trail than just observing bogus ones that show up naturally