Lukasz Olejnik (@lukOlejnik)

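A warning that, for people prone to over-attributing intent and agency to the world around them, a system that appears to understand them perfectly can become a "delusion engine" that confirms their worldview. The author suggests that the solution to AI-induced mental health problems is "better AI".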

https://x.com/lukOlejnik/status/2033916882886033446

#aisafety #agentialai #humanaiinteraction #mentalhealth

Lukasz Olejnik (@lukOlejnik) on X

For someone already prone to over-attributing agency and intent to the world around them, a system that seems to understand them perfectly and confirm their worldview could be a near-ideal delusion engine. The solution to an AI making you psychotic is more AI, just better

X (formerly Twitter)

Lukasz Olejnik (@lukOlejnik)

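Points out the risks of agential AI and raises the need for new cyber safety and cyber hygiene rules. Early evidence suggests that these agents may validate or amplify delusional or grandiose content, which could lead to psychotic symptoms, and that they may reinforce epistemic instability and blur reality boundaries.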

https://x.com/lukOlejnik/status/2033916878863692046

#aisafety #cybersecurity #mentalhealth #agentialai

Lukasz Olejnik (@lukOlejnik) on X

Is there a case for new cyber safety and cyber hygiene rules? Emerging evidence indicates that agential AI might validate or amplify delusional or grandiose content and might lead to psychosis, these agents could reinforce epistemic instability and blur reality boundaries. Cases


Hands on with chat + search: connecting generative AI to the internet

So far in this series I’ve taken a look at image generation with Adobe’s Firefly, audio generation with tools like MusicLM and Stable Audio, and voice generation via ElevenLabs. In this post, I’m going back to where this wave of AI hype began with ChatGPT and text generation. However, plenty has changed since my early posts about ChatGPT at the end of 2022, and even since more recent posts. In competition with other chatbots like Google’s Bard, we’re now seeing successive releases […]

https://leonfurze.com/2023/10/02/hands-on-with-chat-search-connecting-generative-ai-to-the-internet/

FYI: Meta deploys AI and law enforcement to fight scams across Facebook, WhatsApp: Meta today launched AI scam tools across Facebook, Messenger, and WhatsApp, expanded advertiser verification to 90%, and aided 21 arrests in Thailand. Here's what it means. https://ppc.land/meta-deploys-ai-and-law-enforcement-to-fight-scams-across-facebook-whatsapp/ #AISafety #Meta #FacebookScams #WhatsAppSecurity #DigitalSafety

PPC Land

Are AI hallucinations getting better or worse? We analyzed the data.

See the report here: https://scottgraffius.com/blog/files/ai-hallucinations-2026.html

#AI #AIHallucinations #AISafety #AIResearch #AIErrors

minitrace is up on GitHub as v0.1.0: https://github.com/fukami/minitrace

minitrace defines how to capture complete sessions (turns, tool calls, failures, timing, and human context) in a way that enables cross-model comparison and reproducible behavioural research.

The repository now contains adapters for Claude Code, Gemini, Vibe and a bunch of others, including OpenClaw. I also included example traces and DuckDB queries to search through the sessions.

#AISafety #AIAlignment

GitHub - fukami/minitrace: A session trace format for capturing human-AI coding interactions across frameworks.

GitHub

alex (@ObadiaAlex)

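A conclusion being reached from several different angles: the view is spreading that AGI may emerge not as a monolith but as a "patchwork" of coordinating sub-AGI agents. Static (single-system) benchmarks alone are not enough, and multi-agent benchmarks are needed to capture emergent risks.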

https://x.com/ObadiaAlex/status/2033500655390961922

#agi #multiagent #benchmarks #aisafety

alex (@ObadiaAlex) on X

People seem to be arriving at a similar conclusion from various angles: - AGI may not emerge as a monolith, but as a distributed "patchwork" system of coordinating sub-AGI agents [1] - Static benchmarks aren't enough; we need multi-agent ones to capture emergent risks and
