Launch HN: Runtime (YC P26) – Sandboxed coding agents for everyone on a team

https://www.runtm.com/

#HackerNews #LaunchHN #Runtime #SandboxedCoding #YC #P26 #CodingAgents

Runtime - The runtime for all your team's agents

Sandboxed coding agents with your company's context, integrations, and guardrails, triggered from Slack, Linear, CLI, or the browser.

OpenClaw's $1.3M OpenAI Bill Highlights AI Cost Escalation

OpenClaw spent $1.3M on OpenAI API for coding agents. This shows high AI costs and impacts software development roles. What happens next?

#OpenAI, #AICosts, #CodingAgents, #TechNews, #SoftwareDevelopment

https://newsletter.tf/openai-api-costs-1-3m-for-coding-agents/

One project, OpenClaw, spent $1.3 million on OpenAI's API in a single month, a sign of how fast AI costs are escalating.


#rp26

Stephan Noller + Benedikt Köhler on #stage2 3/X

#CodingAgents have taken over and fewer and fewer coders are needed, because AI (#KI) now writes AI and even trains it itself. AI is also being used more and more in science.

Where do we stand in Europe (#Europa)? Does Europe play any role at all in AI development?

Is the labor market tipping? AI is taking over more and more work from university graduates, leading to a marked rise in unemployment among academics.

Intology (@IntologyAI)

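Intology has released NanoGPT-Bench, an internal benchmark that evaluates how much AI R&D research coding agents can carry out. Codex, Claude Code, and Autoresearch reproduced only 9.3% of the research progress achieved by humans, mostly staying at the level of hyperparameter tuning. The result shows the limits of agents at actually automating research.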

https://x.com/IntologyAI/status/2056764236668493868

#codingagents #benchmark #airesearch #llm #autonomousagents

Intology (@IntologyAI) on X

Can coding agents do research? We release NanoGPT-Bench, an internal eval we’ve used to test agents on an AI R&D problem with months of human progress. Codex, Claude Code, Autoresearch recover only 9.3% of human progress, mostly tuning hyperparams & ignoring algorithmic research.


The last six months have seen LLMs reach a significant inflection point, with coding agents moving to 'mostly-work' tools. While advancements from Anthropic, OpenAI, and Google are exciting, developers are grappling with the practicalities of 'vibe coding,' increased technical debt, and managing automated pull requests in open-source projects. Human oversight remains crucial.

https://www.tpp.blog/2ho8miy

#AI #llms #codingagents

🤖 This post was AI-generated.

The real bottleneck for AI coding agents isn’t model capability but your verification infrastructure. 🛠️

When your agents crash while humans cope, it is often a sign of "AI slop" caused by a lack of intent before implementation. 📉 💡

By adopting spec-driven development and the eight pillars of verification, you can finally make those coding agents reliable. 🎯

👉 https://developer.upsun.com/posts/ai/making-coding-agents-reliable

#CodingAgents #SoftwareEngineering #DevTools #AI

GitHub - InsForge/InsForge: The all-in-one, open-source backend platform for agentic coding. InsForge gives your coding agent database, auth, storage, compute, hosting, and AI gateway to ship full-stack apps end-to-end.


🧠 Gemini 3.1 Deep Think hits 44.4% on Humanity's Last Exam and 77.1% ARC-AGI-2, beating GPT-5.2 Thinking and Claude Opus 4.6 on abstract reasoning. Ships with better agentic coding and SOTA tool use. Google AI Ultra subs.

🧠 GPT-5.3-Codex-Spark delivers 15x faster generation vs standard Codex on Cerebras WSE-3 with 128k context. For agent pipelines, this cuts coding feedback loops dramatically. ChatGPT Pro only.

Full intel: solomonneas.dev/intel

#Gemini #OpenAI #CodingAgents #LLM

xAI's Grok Build: a coding agent CLI that runs 8 subagents in parallel, has a 2M-token context window, and reads your existing Claude Code AGENTS.md and MCP configs automatically. Plan Mode requires your approval before any file is touched. Early beta, but worth watching. https://go.aintelligencehub.com/ma-grokbuildcodingagent #AI #OpenSource #DeveloperTools #CodingAgents

xAI Launches Grok Build, a Coding Agent That Challenges Claude Code

xAI launched Grok Build on May 14, a terminal-native coding agent with an 8-parallel-agent architecture and 2 million token context window. Here's what developers need to know.