Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning
Skill1 is a framework that unifies a language-model agent's skill selection, utilization, and distillation into a single policy and co-evolves them via reinforcement learning. The approach optimizes all three capabilities simultaneously from a single task-outcome signal, and it outperforms prior skill-based and reinforcement learning methods in the ALFWorld and WebShop environments. The experimental results confirm the co-evolution of the three capabilities and the importance of each credit signal, making Skill1 an effective methodology for an agent's persistent skill-library maintenance and reuse.
https://arxiv.org/abs/2605.06130
#reinforcementlearning #skillaugmentedagents #llmagents #multitasklearning #agentframework

A persistent skill library allows language model agents to reuse successful strategies across tasks. Maintaining such a library requires three coupled capabilities. The agent selects a relevant skill, utilizes it during execution, and distills new skills from experience. Existing methods optimize these capabilities in isolation or with separate reward sources, resulting in partial and conflicting evolution. We propose Skill1, a framework that trains a single policy to co-evolve skill selection, utilization, and distillation toward a shared task-outcome objective. The policy generates a query to search the skill library, re-ranks candidates to select one, solves the task conditioned on it, and distills a new skill from the trajectory. All learning derives from a single task-outcome signal. Its low-frequency trend credits selection and its high-frequency variation credits distillation. Experiments on ALFWorld and WebShop show that Skill1 outperforms prior skill-based and reinforcement learning baselines. Training dynamics confirm the co-evolution of the three capabilities, and ablations show that removing any credit signal degrades the evolution.
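The loop described above can be sketched in a few lines. This is an illustrative assumption, not the paper's implementation: the helper names (`make_query`, `rerank`, `solve`, `distill`) are hypothetical, and the moving-average decomposition stands in for however Skill1 actually separates the low-frequency trend (crediting selection) from the high-frequency variation (crediting distillation) of the task-outcome signal.

```python
from collections import deque

def credit_signals(outcomes, window=4):
    """Split a stream of task outcomes (0/1 rewards) into a
    low-frequency trend (moving average, crediting skill selection)
    and a high-frequency residual (deviation from the trend,
    crediting skill distillation). Illustrative decomposition only."""
    trend, residual = [], []
    recent = deque(maxlen=window)
    for r in outcomes:
        recent.append(r)
        t = sum(recent) / len(recent)  # low-frequency trend
        trend.append(t)
        residual.append(r - t)         # high-frequency variation
    return trend, residual

def episode(policy, library, task):
    """One pass of select -> utilize -> distill, all driven by a
    single policy and a single task-outcome signal (hypothetical API)."""
    query = policy.make_query(task)                  # generate a search query
    candidates = library.search(query)               # retrieve skill candidates
    skill = policy.rerank(task, candidates)          # re-rank and select one
    trajectory, outcome = policy.solve(task, skill)  # act conditioned on the skill
    library.add(policy.distill(trajectory))          # distill a new skill
    return outcome

# Example of the credit split on a toy outcome stream:
trend, residual = credit_signals([0, 0, 1, 1, 1, 0, 1, 1])
```

With a window of 4, the trend rises from 0.0 toward 0.75 as successes accumulate, while each residual isolates how much a single episode deviated from that trend.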
