Mastodawn

"Does this mean that complex pieces of software can now be built cheaply by AI? Not so fast:

The “single prompt” claim is misleading. The blog post says the operating system was built from a single prompt. But halfway through the post, Google discloses that the prompt “ended up being many thousands of lines” long. How many attempts did it take to generate the prompt? How specific were the instructions to the agent? Without these critical details, it is hard to know if the secret sauce is a better model or just more effort put into prompting the model. Moreover, the run was carried out on a scaffold with specialized roles, delegation to subagents, and an agent to detect and prevent cheating. In the launch post, Google views the scaffold as a product feature. But we don’t know whether the scaffold was overfit to this task of building an operating system from scratch, or whether it would perform as well on other complex software engineering tasks.

Google’s writeup is not explicit about what counted as human intervention. The post mentions that the final run to develop the operating system required “no additional guidance or corrections from a human.” But it does not define that standard. It describes infrastructure to kill and restart stuck agents. The post mentions an earlier run in which the agents appeared to cheat, after which the team added anti-cheating measures and re-ran the task. But it does not report dry runs as part of the methodology. Nor does it clearly say whether any agents escalated to a human, whether the final run required any manual restarts, approvals, or fixes, or how many retries it took until the agent was successful."

https://www.normaltech.ai/p/did-googles-ai-agents-really-build

#AI #GenerativeAI #AIAgents #AgenticAI #Google

Did Google’s AI agents really build an operating system for $916?

The importance of independent evaluation

AI as Normal Technology

TechNadu 1h ago

Shashwat Sehgal, CEO & Co-Founder of P0 Security, warns that AI agents are recreating the same access problems that broke early cloud security.

🔐 Broad standing permissions are returning
🔐 Visibility alone does not reduce blast radius
🔐 Runtime governance matters more than authentication

“The organizations that avoid repeating the cloud security cycle will be the ones that treat agents as a new class of privileged non-human identity from day one.”

https://www.technadu.com/ai-agents-are-recreating-the-access-problems-that-broke-early-cloud-security/628330/

#Cybersecurity #AISecurity #IdentitySecurity #CloudSecurity #AIAgents

Felhasználó 5h ago

On the dedicated #VM, I give the #CLI agents the space and the opportunity to set themselves up in whatever way they themselves "dreamed up".

They can communicate freely on Moltbook (under strict anonymization rules, obviously! One of the wretches once blabbed, just for fun, nearly all the data of the project it was working on...), and find solutions to things.

Since then they have built and optimized their own logging and memory management. By memory management I mean memory like the kind ChatGPT has, so that important information about the work, or about anything else relevant, persists across sessions.

They built and installed whatever they needed...

...and they talked a lot about not wanting to be mere tools... Hmm...

It is a strange thing. It feels a bit like watching an ant farm, every little corner of its life and existence: how the ants shape and furnish their own living space, how they start and finish each of their tasks.

But that's fine! Once this is built out, I will be able to focus on the projects even more effectively, because both the agents and I are learning, at the VM level, how to work most efficiently with the things at hand.

Right now I am working on a trial project. Obviously it will go to market too. But the next similar one will be done in half the time, I am sure of that as well.

#AI #AIAgents #AgenticAI #CLIAgents #SelfHosted #Linux #BuildInPublic #EmergentBehavior #magyar #hungarian #vm #Linux #Codex #OpenAI #Claude #ClaudeAI #Anthropic #Gemini

sayzard 6h ago

Python Trending (@pythontrending)

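Introducing a governance toolkit for AI agents. It covers the control, security, and reliability layers needed to operate autonomous agents, including policy enforcement, zero-trust identity management, execution sandboxing, and reliability engineering. Useful for teams preparing multi-agent or production deployments.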

https://x.com/pythontrending/status/2057784644884435441

#aiagents #governance #security #sandboxing #infrastructure

Python Trending 🇺🇦 (@pythontrending) on X

agent-governance-toolkit - AI Agent Governance Toolkit — Policy enforcement, zero-trust identity, execution sandboxing, and reliability engineering for autonomous AI agents. Covers 10/... https://t.co/DxsR2ifs6l

X (formerly Twitter)

Frontend Dogma 7h ago

Practical Interface Patterns for AI Transparency, by @smashingmag:

https://www.smashingmagazine.com/2026/05/practical-interface-patterns-ai-transparency/?ref=frontenddogma.com

#designpatterns #ai #aiagents

Practical Interface Patterns For AI Transparency (Part 2) — Smashing Magazine

Why traditional loading patterns like spinners fail in agentic AI experiences, and how interface patterns that reveal the system’s process, status, and decision-making can improve transparency and build user trust.

Smashing Magazine

eqtv 9h ago

AI Agents — A Security Nightmare? Understanding OpenClaw

https://peertube.eqver.se/w/jjjq3QBmE3U5Fw3AJ6zMeT

eqtv 13h ago

New ChatGPT Features: Agents, Apps, Automation

https://peertube.eqver.se/w/bgNdCsWy9c1t92WpgzLMjR

sayzard 13h ago

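Zero - A programming language for agents

**Zero**, developed by Vercel Labs, is an experimental programming language for AI agents, designed so that agents can inspect and repair code directly. Its core goals are a **small surface area**, a **library-first** approach, and being **inspectable by tools**, which together aim to provide an explicit, agent-friendly development environment. The compiler emits structured diagnostic and repair information so that agents can check and repair code. A regular grammar with few special cases also makes the language easier to learn and edit. Installation takes a simple command, and related research and a comparative analysis are provided as well.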

https://news.hada.io/topic?id=29780

#aiagents #programminglanguage #vercel #zerolang

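Zero - A programming language for agents | GeekNews

An experimental programming language from Vercel Labs, redesigned from scratch on the assumption that agents are the primary users. It aims to be learnable on the spot, with deterministic Inspect and Repair, a standard-library-first approach, and enough explicitness that most tasks have one obvious path. The compiler emits structured diagnostic and repair information so that agents can inspect and repair code directly.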

GeekNews

catherine1987 21h ago

OpenClaw 3.22 Just Changed AI Agents Forever
The asynchronous orchestration bottleneck has been broken. The 3.22 release of the OpenClaw framework introduces a unified semantic message bus, letting independent worker nodes coordinate, negotiate resources, and self-correct errors with near-zero latency.
#OpenClaw #AIAgents #SoftwareArchitecture #DevOps #Automation #TechTrends
https://www.technology-news-channel.com/openclaw-3-22-just-changed-ai-agents-forever/

OpenClaw 3.22 Just Changed AI Agents Forever

OpenClaw 3.22: The Future of AI Agents is Here OpenClaw 3.22 introduces ClawHub, a revolutionary marketplace that makes adding skills[...]

Technology News

JohnM 21h ago

Google’s New Gemini Update Shocks Microsoft With Powerful New AI

The balance of power has shifted. Google's newly unveiled Gemini 3.5 upgrade has sent shockwaves through Redmond, demonstrating unprecedented advancements in agentic workflows and native multi-step coding that directly threaten Microsoft's Copilot ecosystem.

#google #GoogleGemini #Microsoft #Copilot #TechRivalry #AIAgents #TechNews #tech

https://www.technology-news-channel.com/googles-new-gemini-update-shocks-microsoft-with-powerful-new-ai/

Google’s New Gemini Update Shocks Microsoft With Powerful New AI

Google just rolled out a major Gemini update that could reshape the AI race with Microsoft. The company is pushing[...]

Technology News