Mastodawn

Research-Driven Agents: When an agent reads before it codes

https://blog.skypilot.co/research-driven-agents/

Research-Driven Agents: What Happens When Your Agent Reads Before It Codes

Coding agents working from code alone generate shallow hypotheses. Adding a research phase — arxiv papers, competing forks, other backends — produced 5 kernel fusions that made llama.cpp CPU inference 15% faster.

SkyPilot Blog

simlevesque 1d ago

I've been making skills from arXiv papers for a while. I have one for multi-object tracking, for example. It has a SKILL.md describing all the important papers (over 30) on the subject and a folder with each paper's full content as reStructuredText.

To feed arXiv papers to LLMs, I found that RST gives the best token count/fidelity ratio. Markdown lacks precision. LaTeX is too verbose. I have a script with the papers' URLs, names and dates that downloads the LaTeX zips from arXiv, extracts them, transforms them to RST and then adds them to the right folder. Then I ask an LLM to make a summary from the full text, then I give other LLMs the full paper again with the summary and ask them to improve on it and proofread it. While this goes on I read the papers myself, and at the end I read the summaries and, if I approve them, I add them to the skill. For each paper I also add info on how well the algorithms described do in common benchmarks.
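A minimal sketch of the download, extract and convert steps described above, not the commenter's actual script. It assumes new-style arXiv IDs (for example 2401.00001), that the source bundle served at /e-print/ is a gzipped tar archive (some papers ship a single gzipped .tex file, which this does not handle), and that pandoc is installed and on the PATH. The function names and the src/ and papers/ folder layout are hypothetical.

```python
import io
import subprocess
import tarfile
import urllib.request
from pathlib import Path


def source_url(arxiv_id: str) -> str:
    """arXiv serves a paper's source bundle at /e-print/<id>."""
    return f"https://arxiv.org/e-print/{arxiv_id}"


def extract_source(blob: bytes, dest: Path) -> None:
    """Unpack the regular files of a tar bundle into dest.

    Members whose path would land outside dest are skipped.
    """
    dest.mkdir(parents=True, exist_ok=True)
    root = dest.resolve()
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:*") as tar:
        for member in tar.getmembers():
            target = (root / member.name).resolve()
            if not member.isfile() or not target.is_relative_to(root):
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(tar.extractfile(member).read())


def find_main_tex(src_dir: Path) -> Path:
    """Return the first .tex file that declares a document class."""
    for tex in sorted(src_dir.rglob("*.tex")):
        if "\\documentclass" in tex.read_text(errors="ignore"):
            return tex
    raise FileNotFoundError(f"no main .tex file under {src_dir}")


def pandoc_command(main_tex: Path, out_rst: Path) -> list[str]:
    """Build the pandoc call; run it from main_tex's folder so \\input resolves."""
    return ["pandoc", "-f", "latex", "-t", "rst",
            "-o", str(out_rst.resolve()), main_tex.name]


def fetch_paper(arxiv_id: str, skill_dir: Path) -> Path:
    """Download one paper's source and write it as RST inside the skill folder."""
    with urllib.request.urlopen(source_url(arxiv_id)) as resp:
        blob = resp.read()
    src_dir = skill_dir / "src" / arxiv_id          # hypothetical layout
    extract_source(blob, src_dir)
    main_tex = find_main_tex(src_dir)
    out_rst = skill_dir / "papers" / f"{arxiv_id}.rst"
    out_rst.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(pandoc_command(main_tex, out_rst),
                   cwd=main_tex.parent, check=True)
    return out_rst
```

Calling fetch_paper once per entry in the URL list would fill the folder; the summarizing and proofreading passes still happen afterwards, with a human reading the result.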

I highly recommend doing something similar if you're working in a cutting-edge domain. Also I'd like to know if anyone has recommendations to improve what I do.

satvikpendem 1d ago

Does that even fit in the context? It seems like 30 papers' worth of content would just overflow it.

ctoth

For each paper, have your agent extract a three-sentence description and create a description.md, then concat those with the paper names into an INDEX.md, which it should consult to find appropriate papers. Also: have your agent tag papers, then autogenerate your tagged collection on the filesystem. Then you get nice things like https://github.com/ctoth/Qlatt/tree/master/papers/tagged

Then something in your {CLAUDE,AGENTS}.md that says: when working on something with relevant context supplied by papers, read the papers before doing the work. You can find all papers plus their descriptions in ./papers/INDEX.md and papers by tag in ./papers/tagged
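One way the INDEX.md and the tagged view could be generated, as a rough sketch rather than what the linked repo actually does. It assumes each paper lives in ./papers/<name>/ with a description.md and an optional tags.txt holding one tag per line; the tags.txt file and both function names are hypothetical. The tagged view is built from relative symlinks, so it needs a filesystem that supports them.

```python
import os
from pathlib import Path


def build_index(papers_dir: Path) -> str:
    """Concatenate every paper's short description under its folder name."""
    entries = []
    for desc in sorted(papers_dir.glob("*/description.md")):
        entries.append(f"## {desc.parent.name}\n\n{desc.read_text().strip()}\n")
    return "# Paper index\n\n" + "\n".join(entries)


def read_tags(paper_dir: Path) -> list[str]:
    """Read one tag per line from tags.txt (hypothetical file name)."""
    tags_file = paper_dir / "tags.txt"
    if not tags_file.exists():
        return []
    return [t.strip() for t in tags_file.read_text().splitlines() if t.strip()]


def link_tagged(papers_dir: Path) -> None:
    """Create papers/tagged/<tag>/<paper> symlinks pointing back at each paper.

    Existing links are left alone; stale links are not cleaned up here.
    """
    tagged = papers_dir / "tagged"
    for desc in sorted(papers_dir.glob("*/description.md")):
        paper = desc.parent
        for tag in read_tags(paper):
            link = tagged / tag / paper.name
            link.parent.mkdir(parents=True, exist_ok=True)
            if link.is_symlink() or link.exists():
                continue
            # Relative target: tagged/<tag>/<paper> -> ../../<paper>
            os.symlink(os.path.join("..", "..", paper.name), link,
                       target_is_directory=True)


if __name__ == "__main__":
    papers = Path("papers")
    if papers.is_dir():
        (papers / "INDEX.md").write_text(build_index(papers))
        link_tagged(papers)
```

The agent then only has to read INDEX.md, which stays small, and opens a full paper when its description looks relevant.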

Qlatt/papers/tagged at master · ctoth/Qlatt

Explainable WebAudio Klatt formant synthesizer with declarative TTS frontend and WASM-backed AudioWorklet DSP - ctoth/Qlatt

GitHub