
A couple of weeks ago I used #gemini to whip up a prototype of WFxT support for #qemu: https://patchew.org/QEMU/2026022412101[email protected]/

This week I posted my hand-written series doing the same thing: https://patchew.org/QEMU/2026032013060[email protected]/

Compare and contrast the approaches. While #genai can get you a working prototype pretty quickly, the result was hard to review, missed an important source of events, and ended up with a sub-optimal implementation. This might not matter for one-shot code, but for production it missed the mark.


penguin42 2d ago

@stsquad Did it have access to the ARM ARM and other docs as well?


Alex 1d ago

@penguin42 Only what was in its training data. I have been experimenting with NotebookLM, which can have whole PDFs loaded into it, so maybe for the next experiment I could get it to create a reference sheet for the agent first?

However, the first thing worth solving would be getting agents to actually follow the instructions in AGENTS.md to use small, discrete commits.


penguin42 1d ago

@stsquad I think loading the whole ARM ARM into the context would blow the poor thing's context window; I'm not sure if there's a more subtle way of doing it. As for it not following AGENTS.md, I've seen a setup where another instance of the AI reviews the patches against the rules, and the two loop around until the reviewer is happy.
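
A rough sketch of that kind of review loop, in case it helps picture it. Everything here is made up for illustration: `propose` and `review` are hypothetical callables standing in for the two AI instances, not any real agent API, and the toy versions just fake the behaviour.

```python
from typing import Callable, List, Tuple


def review_loop(
    propose: Callable[[List[str]], str],
    review: Callable[[str], List[str]],
    max_rounds: int = 3,
) -> Tuple[str, List[str]]:
    """Ask for a patch, check it against the rules, retry with the feedback.

    `propose` and `review` are hypothetical stand-ins for the coding agent
    and the reviewing agent. Returns the last patch and any complaints
    that are still outstanding (an empty list means the reviewer is happy).
    """
    feedback: List[str] = []
    patch = ""
    for _ in range(max_rounds):
        patch = propose(feedback)   # coding agent sees earlier complaints
        feedback = review(patch)    # reviewer returns a list of rule violations
        if not feedback:
            break                   # nothing left to complain about
    return patch, feedback


# Toy stand-ins: the "agent" only splits its work up after being told to.
def toy_propose(feedback: List[str]) -> str:
    return "small commit" if feedback else "one huge commit"


def toy_review(patch: str) -> List[str]:
    return [] if patch.startswith("small") else ["split into small commits"]


patch, outstanding = review_loop(toy_propose, toy_review)
```

The `max_rounds` cap is a design choice in this sketch: it stops the pair going round forever if the reviewer is never satisfied.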


Alex 1d ago

@penguin42 Apparently ECA supports sub-agents, but I haven't worked out how to use them yet. I do worry that it's yet another thing burning tokens, and I'd like to figure out how to stop the main agent occasionally going into a loop, repeating slight variations on the same "thought" expansion.


penguin42

@stsquad There's an article somewhere about how someone set it up as a little team: one AI that dealt with the plan, another that reviewed the plan, then the programmer, then a code reviewer. I think they gave the reviewer role to the pricier AI.