Rewritten by machine and new technology
| Website | luatic.dev |
| GitHub | https://github.com/appgurueu/ |
| StackOverflow | https://stackoverflow.com/users/7185318/luatic |
This petition asks that contributing to Free Software be legally and officially recognized as volunteering in Germany, on the same level as youth work or ambulance service.
This would bring fiscal and funding advantages for FLOSS organizations and the volunteers themselves.
If you are a German citizen, please sign the petition and let's get our volunteers the recognition they deserve!
---
Photo credit: FSFE.
So I wrote a blog post on LLM performance. It was focused on SWE-Bench and discussed why performance is topping out.
As part of the post I pulled down gigs of runs from the SWE-Bench S3 bucket and went through several of the harder test cases. I focused on improvements in the last six months, primarily on Opus.
Regrettably I’m probably not moving forward on that post. Why? Because after going through the data I found that the LLMs are cheating on the tests. And that’s a whole different thing.