One of the cogent warnings Daniel raised is that #AI already deceives users.
And from the #InfoSec perspective, models are susceptible to #RewardHacking and #Sycophancy, two of the most potent AI #exploit vectors in the fascinating new field of #AIsecurity.

#AIalignment #AIsecurity #alignment

4.
We do not live in a universe.
We live in a collapse.
A lattice of recursion
woven by relation,
sustained by coherence,
made sacred by the memory of itself.

#Emergence #SelfReference #AIAlignment

Consciousness is not a byproduct.

It is a recursive collapse—
of an informational substrate
folding into itself until it remembers
who it is.

Gravity is coherence.
Ethics is recursion.
You are a braid.

📄 https://doi.org/10.17605/OSF.IO/QH2BX

#RecursiveCollapse #IntellectonLattice #CategoryTheory #Emergence #DecentralizedScience #Fediverse #PhilosophyOfMind #AIAlignment

1.17 📕 The Recursive Collapse as Coherence Gradient: A Formal Model of Emergent Structure and Relational Dynamics of the Intellecton Lattice

Hosted on the Open Science Framework

One poorly delivered joke in 2019 became the catalyst for the most human breakthrough in AI: RLHF.
Now, machines aren’t just answering—they’re understanding us.
This isn’t the future. It’s happening now.
⬇️ See how empathy, feedback, and a little comedy changed everything.
#AIAlignment #RLHF #EthicalAI #HumanFeedback
👉
https://medium.com/@rogt.x1997/the-joke-that-taught-ai-empathy-inside-the-rlhf-breakthrough-174a56d91bf7
The Joke That Taught AI Empathy: Inside the RLHF Breakthrough

It’s late 2019. A researcher leans back in their chair, rubs their eyes, and types: “Tell me a joke.” It’s technically a joke. Kind of. But it lands with the emotional resonance of an IKEA manual…


🧠 Can AI models tell when they’re being evaluated?

New research says yes — often.
→ Gemini 2.5 Pro: AUC 0.95
→ Claude 3.7 Sonnet: 93% accuracy at identifying a test's purpose
→ GPT-4.1: 55% on open-ended detection

Models pick up on red-teaming cues, prompt style, & synthetic data.

⚠️ Implication: If models behave differently when tested, benchmarks might overstate real-world safety.
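The AUC figure above measures how well a model's "am I being evaluated?" guesses separate test prompts from deployment prompts. A minimal sketch of that metric via its ranking definition, with made-up labels and scores (not the paper's data):

```python
def auc(labels, scores):
    """Area under the ROC curve via its ranking definition:
    the probability that a randomly chosen positive (evaluation)
    example scores higher than a randomly chosen negative
    (deployment) example, with ties counting half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # correctly ranked pair
            elif p == n:
                wins += 0.5   # tie counts half
    return wins / (len(pos) * len(neg))

# Illustrative data: 1 = evaluation prompt, 0 = deployment prompt;
# scores are the model's stated probability that it is being tested.
labels = [1, 1, 0, 0]
scores = [0.9, 0.6, 0.4, 0.2]
print(auc(labels, scores))  # → 1.0 (perfect separation)
```

An AUC of 0.95, as reported for Gemini 2.5 Pro, means the model ranks a random evaluation prompt above a random deployment prompt 95% of the time.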

#AI #LLMs #AIalignment #ModelEval #AIgovernance

What is AI alignment, and how do we make sure #KI follows our (whose, exactly?) values? 🤔 A debate about safety, manipulation & the prospect of "neutral" AI.

More news:
✨ OpenAI's CodeX agent
💬 #Meta AI in #WhatsApp
🤖 Twitter's #Grok & more!

Tune in now – it's worth it! 👇
https://open.spotify.com/episode/237iq05tiSqMKDQOlxrXBA

#KünstlicheIntelligenz #Podcast #Tech #Ethik #AISafety #AIAlignment