I do AI and think about things.
https://stationaryprocess.com/
https://twitter.com/clean_utensils
This is a great idea! I saw a similar (inverse) idea the other day for pooling compute (https://github.com/michaelneale/mesh-llm). What are you doing for compute in the backend? Are you locked into a cohort from month to month?

It's now common to improve models in agentic systems "in the loop" with reinforcement learning. Anthropic is [very likely] doing this on the backend to systematically improve their models' performance with their own tools. I did this with Goose at Block using more classic post-training approaches, since it was before RL really hit the mainstream as an approach for this.
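
For a concrete (if toy) picture of what "in the loop" means here, a minimal sketch: a policy samples a tool per task, a verifier-style reward scores the rollout, and a REINFORCE update shifts the policy toward rewarded choices. The tools, tasks, and reward below are invented placeholders; a real setup swaps in LLM rollouts and a real grader.

    import math
    import random

    random.seed(0)

    TOOLS = ["search", "calculator", "shell"]
    # Hypothetical tasks: (task id, index of the tool that actually solves it).
    TASKS = [(0, 0), (1, 1), (2, 2), (3, 1)]

    # One logit per (task, tool) pair -- a toy stand-in for model weights.
    logits = {(t, a): 0.0 for t, _ in TASKS for a in range(len(TOOLS))}

    def policy(task):
        """Softmax over tool choices for a given task."""
        zs = [math.exp(logits[(task, a)]) for a in range(len(TOOLS))]
        total = sum(zs)
        return [z / total for z in zs]

    LR = 0.5
    for step in range(500):
        task, correct = random.choice(TASKS)   # sample a task
        probs = policy(task)
        action = random.choices(range(len(TOOLS)), weights=probs)[0]  # rollout
        reward = 1.0 if action == correct else 0.0  # verifier stand-in
        # REINFORCE: d log pi(action) / d logit_a = 1[a == action] - probs[a]
        for a in range(len(TOOLS)):
            logits[(task, a)] += LR * reward * ((a == action) - probs[a])

    # After training, the policy should prefer the correct tool per task.
    for t, correct in TASKS:
        best = max(range(len(TOOLS)), key=lambda a: logits[(t, a)])
        print(t, TOOLS[best], best == correct)

The same shape holds at scale: only the policy (an LLM emitting tool calls) and the reward (a verifier over the full trace) get heavier.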

If you want to look at some of the tooling and process for this, check out verifiers (https://github.com/PrimeIntellect-ai/verifiers), hermes (https://github.com/nousresearch/hermes-agent) and accompanying trace datasets (https://huggingface.co/datasets/kai-os/carnice-glm5-hermes-t...), and other open source tools and harnesses.
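
To make the "verifier" idea concrete, here's a schematic reward over an agent trace. The trace format and scoring rule are invented for illustration; the repos above define their own environment and rubric abstractions.

    from dataclasses import dataclass

    @dataclass
    class ToolCall:
        name: str
        args: dict
        output: str

    @dataclass
    class Trace:
        prompt: str
        tool_calls: list
        final_answer: str

    def reward(trace, expected):
        """Score one rollout: full credit for the right final answer,
        a little shaping credit for at least attempting a tool call."""
        if trace.final_answer.strip() == expected.strip():
            return 1.0
        return 0.1 if trace.tool_calls else 0.0

    t = Trace("What is 2+2?",
              [ToolCall("calculator", {"expr": "2+2"}, "4")],
              "4")
    print(reward(t, "4"))  # 1.0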