Damn those Mythos benchmarks seem very promising
@pojntfx have you actually seen qwen perform this well? or are you basing that comment on benchmarks?
i think the mythos benchmarks only have to be "some amount better" at finding 0days than the current public models to justify them waiting on ga... quite a few maintainers are already swamped.
@pojntfx nod. it does have me thinking hard about other forms of baked-in safety. i'll admit this is the first point in my career where i've ever taken elixir seriously.
(well, ok, not really... @abnv ran a team at nilenso that did some amazing work with it for an quiz app that ran in parallel to a tv show. but i've never previously been tempted to learn it.)