My conclusions from watching most of the chess matches between current LLMs in the Kaggle Game Arena (like this one: https://youtu.be/RALDtg1hSTQ?si=fykXVsJ9vzalm7ui ):

- If you are an AI hater, watch these. Never before have I seen such a combination of hubris and hallucination. A typical thinking process goes like "<move> is the natural forcing move, disabling Black's counterplay. After <impossible move> I counter with the powerful <move of a piece that is not there> and I'm in a strong winning position". #deepseek R1 in particular was, well, full of it.

- I was amazed that unmodified LLMs, particularly #gemini25pro and #chatgpto3, can play beginner-level chess: almost no illegal moves, a grasp of the fundamentals of chess, sometimes blundering pieces, sometimes pulling off genuinely cool combinations (e.g. preparing a fork, or setting up mate on the next move).

- #Grok first destroyed a hapless opponent but later showed that it is not on par with OpenAI's and Google's LLMs.

And finally, kudos to Google for the idea: games are a great way to see how well AIs can reason. Looking forward to Werewolf and poker.

(And of course I would like to see #gpt5 there)

Game Arena: Gemini 2.5 Pro vs o4-mini (3rd place) | Kaggle


Data basis > 2014-2025 | all motions & party conferences

PDF dust.
Python/AI tagging + frequency analysis = ranking without gut feeling.
#ChatGPTo3-pro

@spd @spdberlin #spd #spdberlin #spd-berlin

https://dahrendorf-signal.de

dahrendorfSignal | Substack

I analyze motions and party conferences with AI. You get signal, not noise.

🫸🤖 New research testing various models finds that the #ChatGPTo3 model resists shutdown despite explicit instructions.

Read: https://hackread.com/chatgpt-o3-resists-shutdown-instructions-study/

#AISecurity #ChatGPT #OpenAI #AI #ArtificialInteligence

ChatGPT o3 Resists Shutdown Despite Instructions, Study Claims
