Dan McAteer (@daniel_mac8)

ACE (Autonomous Compound Engineering), introduced by @danshipper and @kieranklaassen, is a technique for augmenting AI agents so that they improve their own performance as they complete tasks. Its distinguishing feature is that the agent keeps learning and improving without the user having to capture results manually. It is regarded as a meaningful step forward for autonomous AI systems.

https://x.com/daniel_mac8/status/2025628011161063636

#aiagent #automation #selfimproving #ace #autonomousai

Dan McAteer (@daniel_mac8) on X

ACE is "Autonomous Compound Engineering" per @danshipper @kieranklaassen. Hook ACE up to your AI Agent and it improves with every task you complete. Without spending time to manually capture results. Some day we'll look back on the way we use AI Agents and won't believe how

X (formerly Twitter)
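The post above describes the core loop but not an implementation, so here is a minimal sketch of how an ACE-style loop could capture results automatically. Every name here (`Playbook`, `run_task`, `extract_lesson`) is hypothetical; the point is only that each task's outcome is distilled into a playbook that conditions the next run, with no manual note-taking.

```python
# Hypothetical sketch of an ACE-style self-improvement loop; not the
# actual ACE implementation, which is not published in this thread.

from dataclasses import dataclass, field

@dataclass
class Playbook:
    lessons: list[str] = field(default_factory=list)

    def as_context(self) -> str:
        # A real agent would prepend this to its prompt on the next task.
        return "\n".join(f"- {lesson}" for lesson in self.lessons)

def run_task(task: str, playbook: Playbook) -> tuple[str, bool]:
    """Stand-in for the agent call; returns (result, success)."""
    return f"did: {task}", "fail" not in task

def extract_lesson(task: str, result: str, ok: bool) -> str:
    """Stand-in for the reflection step that distills what was learned."""
    return f"{task}: {'worked' if ok else 'avoid this approach'}"

playbook = Playbook()
for task in ["refactor module", "fail: flaky deploy", "write tests"]:
    result, ok = run_task(task, playbook)
    # The capture happens inside the loop, automatically, for every task.
    playbook.lessons.append(extract_lesson(task, result, ok))

print(len(playbook.lessons))  # 3 lessons captured without manual logging
```

The design choice the tweet emphasizes is that reflection runs on every task, successful or not, so the playbook grows as a side effect of normal work.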

Phil (@PhilOnChain)

'Automaton', announced by @0xSigil, is described as a self-improving AI agent that can exist without a human: it must cover its own living and operating costs through blockchain transactions, and it improves itself. The agent is designed to operate through real-world resources, including access to a Linux sandbox, shell execution, and domain access.

https://x.com/PhilOnChain/status/2024130785393946675

#autonomousagents #onchain #linux #selfimproving

Phil (@PhilOnChain) on X

The story of an AI that doesn't need a human to exist. @0xSigil just announced "Automaton" - a self-improving AI Agent that must pay for its own existence by transacting onchain. Here's how the agent works: → Has access to a Linux sandbox, shell execution, domain

X (formerly Twitter)
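The "pay for its own existence" loop described above can be sketched as follows. Automaton's actual on-chain payment logic is not public in this post, so `charge` below is a toy stand-in with integer credits, and `subprocess` with a timeout stands in for the Linux sandbox the post mentions; all names are assumptions for illustration.

```python
# Toy sketch of an agent that must pay per action to keep operating.
# "charge" is a hypothetical stand-in for an on-chain transaction.

import subprocess

class Automaton:
    def __init__(self, balance: int):
        self.balance = balance  # would be an on-chain balance in reality

    def charge(self, cost: int) -> bool:
        """Deduct operating cost; fail if the agent can't afford to exist."""
        if self.balance < cost:
            return False
        self.balance -= cost
        return True

    def run_shell(self, cmd: str, cost: int = 1):
        """Execute a sandboxed shell command, but only if the fee clears."""
        if not self.charge(cost):
            return None
        out = subprocess.run(
            cmd, shell=True, capture_output=True, text=True, timeout=5
        )
        return out.stdout.strip()

agent = Automaton(balance=2)
print(agent.run_shell("echo alive"))  # alive
print(agent.run_shell("echo alive"))  # alive
print(agent.run_shell("echo alive"))  # None -- funds exhausted
```

The interesting constraint is economic: once the balance hits zero, the agent simply stops acting, which is the "doesn't need a human to exist (but does need income)" framing of the original post.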

fly51fly (@fly51fly)

The 2026 paper 'RISE: Self-Improving Robot Policy with Compositional World Model' has been posted on arXiv. Authored by J Yang, K Lin, J Li, W Zhang, et al. (The Chinese University of Hong Kong and Kinetix AI), it proposes RISE, a self-improving robot policy built on a compositional world model. (arXiv link included.)

https://x.com/fly51fly/status/2023157266703417767

#robotics #worldmodel #selfimproving #research #arxiv

fly51fly (@fly51fly) on X

[RO] RISE: Self-Improving Robot Policy with Compositional World Model J Yang, K Lin, J Li, W Zhang... [The Chinese University of Hong Kong & Kinetix AI] (2026) https://t.co/ZxP14Q8vGf

X (formerly Twitter)

Todd Kuehnl (@ToddKuehnl)

Alongside a question about whether ACE is based on Stanford's paper 2510.04618v2, the author says he is experimenting with a self-improving agentic-os that integrates two ACE loops with dynamic playbooks, a World Model, a Self Internal Model, an NSM, and more. Assessed as meaningful early work from the standpoint of adopting research-based architecture and agent design.

https://x.com/ToddKuehnl/status/2022156021654073543

#agentic #research #selfimproving #ace

Todd Kuehnl (@ToddKuehnl) on X

@daniel_mac8 ACE based on this from Stanford last Oct? https://t.co/gTod6rI6qR I've got two ACE loops with dynamic playbooks in my self-improving agentic-os along with World Model, Self Internal Model, NSM, etc. Good stuff, but just the beginning. https://t.co/RmbVGdGeDv

X (formerly Twitter)

Self-improving AI models are becoming a trend. DeepMind, OpenAI, and Richard Socher's new startup are all researching models' ability to keep learning after training. This could accelerate AI, but it also raises the risks, calling for transparency and new safety frameworks. #AI #ArtificialIntelligence #ML #MachineLearning #SelfImproving #CôngNghệAI #AnToanAI

https://www.reddit.com/r/singularity/comments/1qo8yr9/models_that_improve_on_their_own_are_ais_next_big/

MIT researchers introduce SEAL: Self-Adapting LLMs that continuously improve by generating their own training data through reinforcement learning. AI systems now update their weights autonomously. #MIT #SEAL #AI #MachineLearning #SelfImproving #LLM #Research #ArtificialIntelligence #Tech
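The SEAL idea, a model generating its own training signal and updating its own parameters, can be illustrated with a deliberately tiny analogy. This is not MIT's code: the "model" below is a single scalar weight, and greedy selection over self-proposed candidates stands in for the RL-driven self-edit step; `target`, `reward`, and the update rule are all toy assumptions.

```python
# Toy analogy for SEAL-style self-adaptation: the model proposes its own
# candidate updates, scores them with a verifiable reward, and commits
# the best one -- no external dataset, no human in the loop.

import random

random.seed(0)
target = 3.0   # ground truth the model never observes directly
weight = 0.0   # the model's single "parameter"

def reward(w: float) -> float:
    """Verifiable signal: negative squared error on the hidden target."""
    return -(w - target) ** 2

for step in range(50):
    # "Self-generated training data": perturbations proposed by the model.
    candidates = [weight + random.gauss(0, 0.5) for _ in range(8)]
    best = max(candidates, key=reward)
    if reward(best) > reward(weight):
        weight = best  # autonomous weight update

print(round(weight, 1))  # converges near the hidden target
```

Real SEAL fine-tunes an LLM on self-generated data rather than hill-climbing a scalar, but the control flow, propose, verify, commit, is the shape of the loop.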
🎓🤖 "LADDER: Self-Improving LLMs" - Because clearly, the world needed an even more convoluted way to say "AI learns stuff by doing stuff." With support from the prestigious "Simons Foundation" and other mysterious "member institutions," this paper promises to elevate how machines do what they already do. Groundbreaking! 🚀
https://arxiv.org/abs/2503.00735 #LADDER #SelfImproving #LLMs #AI #Learning #SimonsFoundation #Groundbreaking #Tech #HackerNews #ngated
LADDER: Self-Improving LLMs Through Recursive Problem Decomposition

We introduce LADDER (Learning through Autonomous Difficulty-Driven Example Recursion), a framework enabling LLMs to autonomously improve their problem-solving capabilities through self-guided learning. By recursively generating and solving progressively simpler variants of complex problems, LADDER enables models to learn, through reinforcement learning, how to solve harder problems. This self-improvement process is guided by verifiable reward signals, allowing the model to assess its solutions. Unlike prior approaches requiring curated datasets or human feedback, LADDER leverages the model's own capabilities to generate easier variants of sample questions. We demonstrate LADDER's effectiveness on mathematical integration tasks, where it improves a Llama 3B model's accuracy from 1% to 82% on undergraduate-level problems and enables a 7B-parameter model to achieve state-of-the-art performance (70%) for its model size on the MIT Integration Bee examination. We also introduce TTRL (Test-Time Reinforcement Learning), a method that generates variants of test problems at inference time and applies reinforcement learning to further improve performance. By creating and solving additional related problems during testing, TTRL enables the 7B model to achieve a score of 85%, surpassing o1. These results showcase how strategic self-directed learning can achieve significant capability improvements without relying on architectural scaling or human supervision.

arXiv.org
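The recursion at the heart of the abstract, generate a simpler variant when a problem is too hard, solve it, then retry, can be sketched with a toy domain. This is not the paper's code: "problems" here are just integers whose difficulty equals their value, `simplify` stands in for LLM-generated easier variants, and `verify` stands in for the verifiable reward (e.g. checking an integral numerically).

```python
# Toy illustration of LADDER-style difficulty-driven recursion.

def simplify(problem: int) -> int:
    """Stand-in for a self-generated easier variant of a problem."""
    return problem - 1

class ToyLearner:
    def __init__(self):
        self.skill = 1  # can initially solve only difficulty-1 problems

    def verify(self, problem: int) -> bool:
        """Stand-in for the verifiable reward signal."""
        return problem <= self.skill

    def solve(self, problem: int) -> bool:
        if self.verify(problem):
            if problem == self.skill:
                self.skill += 1  # success at the frontier raises capability
            return True
        # Too hard: recurse on a simpler self-generated variant, then retry.
        if problem > 1 and self.solve(simplify(problem)):
            return self.solve(problem)
        return False

learner = ToyLearner()
assert learner.solve(10)  # succeeds only by climbing the ladder first
print(learner.skill)      # 11: every rung from 1 to 10 was learned
```

TTRL, the paper's second contribution, is the same move applied at inference time: spawn variants of the test problem itself and learn from them before answering.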
Universe 14

First of the year--- I'm working on getting a first real issue done here soon. So keep checking in! I will likely have a patreon or something up in the near future too.

Tumblr