Excellent article:

"The lesson of OpenClaw isn't that we should stop using AI. It’s that we need to stop using it as a lubricant. In a world where everyone has a gas pedal, the most valuable competitive advantage is having a far better set of brakes."

https://hellotimking.com/brute-force-shipping-blind-code/?AIagents.at

#ai #vibecode #openclaw #ai_fail

Brute-Force Shipping Blind Code

Why your frictionless AI developer gas-pedal needs a better set of brakes.

The Rebel King | Tech, Culture, Work, and Writing

I can imagine #KI (AI) finding use in specific use cases. In the analysis of well-defined data sets, for example. Or in a supporting role in research and development.

With that thought in mind, I gave #KI in the form of #Copilot another chance yesterday (my employer is committed to #Microsoft). I sketched a process on a flipchart, photographed the sketch, and Copilot was supposed to render it in #Powerpoint. Copilot had advertised itself beforehand, claiming it could even do it in our corporate design, and so on.

Result: half of it was missing, every third word was misspelled, there were no connectors between the shapes, and Copilot apparently couldn't find our corporate design.
Maybe I'll try again in six months 🤣
#ai #ai_fail

oh @altbot, there should really be enough AI in there to recognize this image as a meme. The image stands for "Nein! Doch! Oh!" ("No! Yes! Oh!") by Louis de Funès; an image search for that phrase returns exactly this picture as the first three results. A huge wall of text that doesn't even identify the actors is hardly suited to conveying the message. (hmm, I can't tag the operator micr0, are critics being blocked?) #altbot #ai_fail
@abulling

#aifail #ai_fail
My first attempt at using an AI assistant for #programování (programming). Just for fun. And fun it was, because it's nonsense.

me: how do I do such-and-such?
AI: use this function (sends a piece of source code containing a bug)
me: that causes an error, why?
AI: because you're passing a string, but the function expects an object; you can use this other function instead, it accepts both an object and a string
me: that's not true, that function also only accepts an object
AI: you're right, that function also only accepts an object, in that case there's no solution

😆 🤦

@boboseb

Well, we haven't finished cracking up yet. 🤣

#AI #AI_fail #AI_fails

'Friend creator Avi Schiffmann said in a blog post that the device is an "expression of how lonely" he's felt.'

Oh boy.

https://www.macrumors.com/2024/07/30/friend-necklace-ai-companionship/

#tech #ai #ai_fail

New 'Friend' Necklace Offers AI Companionship

AI wearables like the Rabbit R1 and the AI Pin have attempted to capitalize on the popularity of artificial intelligence and have largely flopped,...

MacRumors

If you are interested in LLM reasoning capabilities, this might be something for you:
Nezhurina et al, "Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models"

paper: https://arxiv.org/abs/2406.02061
GitHub: https://github.com/LAION-AI/AIW

#llms #reasoning #ai_fail #ai #AIResearch

Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models

Large Language Models (LLMs) are often described as instances of foundation models that possess strong generalization obeying scaling laws, and therefore transfer robustly across various conditions in a few- or zero-shot manner. Such claims rely on standardized benchmarks that are supposed to measure generalization and reasoning, where state-of-the-art (SOTA) models score high. We demonstrate here a dramatic breakdown of generalization and basic reasoning in all SOTA models claiming strong capability, including large-scale advanced models like GPT-4 or Claude 3 Opus, using a simple, short common-sense math problem formulated in concise natural language and easily solvable by humans (the AIW problem). The breakdown is dramatic in that it manifests on a simple problem as both low average performance and strong performance fluctuations under natural variations of the problem template that change neither the problem's structure nor its difficulty. By testing models on further control problems of similar form, we rule out that the breakdown is rooted in minor low-level issues like natural language or number parsing. We also observe strong overconfidence in the wrong solutions, expressed in the form of plausible-sounding, explanation-like confabulations. Various standard interventions aimed at obtaining the right solution, like chain-of-thought prompting, or urging the models to reconsider their wrong solutions through multi-step re-evaluation, fail. We use these observations to stimulate a re-assessment of the capabilities of the current generation of LLMs as claimed by standardized benchmarks. Such re-assessment also requires common action to create standardized benchmarks that allow proper detection of such deficits in generalization and reasoning, which evidently remain undiscovered by current state-of-the-art evaluation procedures, where SOTA LLMs manage to score high. Code: https://github.com/LAION-AI/AIW

arXiv.org
A bit disappointed to see rnz.co.nz using what appear to be AI-generated images in their recent story on LSD.
Students of chemistry will recognise that hydrogen can form a single covalent bond, while nitrogen can form three.
Compare the Wikipedia entry showing the LSD molecule with what I can only presume is AI-generated junk in the RNZ article. When journalists put out this kind of bunkum, it does nothing for the respect their work deserves.
#AI_fail #RNZ #LSD
https://www.rnz.co.nz/news/national/522324/nz-trial-explores-combining-lsd-with-therapy
NZ trial explores combining LSD with therapy

A study is looking at the clinical possibilities of combining micro doses of LSD with talk therapy for cancer patients.

RNZ