A new breed of analyzers

(See how I cleverly did not mention AI in the title!) You know we have seen more than our fair share of slop reports sent to the curl project so it seems only fair that I also write something about the state of AI when we get to enjoy some positive aspects of this technology. … Continue reading A new breed of analyzers →

daniel.haxx.se
@bagder I am glad you had the patience not to throw this out at the first mention of AI! With all the ridiculous reports you've had to deal with it would be very understandable if you had ignored these
@bagder if they can send you reports, I hope they soon come with fixes too! AI should be able to do both.
@vinifortuna @bagder Fixes are harder for the same reason that generation has seen terrible results with regard to curl and other open source projects. I'm skeptical that's coming in a short timeframe :P
@bagder The question is: what do they mean by "AI"?
@bagder How does it feel to single-handedly keep the AI bubble from imploding? ;)

@bagder This article is an example of what I have been thinking about AI as a whole... It is currently in a state where it can be tooled for domain-specific applications. However, these are still tools, and they are only as good as the person(s) operating them and validating their output.

We are still a very, very long way from AGI. I doubt we will see AGI in any reasonable form in my lifetime.

They are also still in very, very serious need of optimization. The resources required to handle domain-specific tasks (such as analyzing the curl source code) are very much outsized compared to the results they are producing.

IMO -- when these applications can be scaled to taking something like the curl codebase and analyzing it on a single computer like a Ryzen Strix Halo machine (which uses ~150 watts of power) with reasonable speed and accuracy, then we will be getting to a point where these tools become worthwhile.

But the hype from the Sam Altmans and other AI-Bro types is actually doing damage to the field of AI research. I read an article from last March where AI researchers were surveyed at a conference, and the majority of them felt that current AI research was going in the wrong direction. I wish the AI-Bro types would listen to what the researchers are saying, and dial things back to make certain they get this work right.

@bagder A large part of the benefit of formal methods is that they force you to make an independent description of what the code should do, in a formalism that can later be checked (or refined).

These LLM-based tools seem to be happy to work with natural language specifications (such as code comments or the Telnet specification), which is a major step forward in usability!

@bagder this is the first time I have read anything genuinely positive about AI. But how much of the bug reports' quality is due to the tool, and how much is due to the human filtering its output? So far it looks like all reports came through 2 individuals, so it is probably a combination of a better tool and someone with the skills to understand the output and throw away the hallucinations (or maybe the tool has a feedback loop to test that the bug is actually real).
I can completely understand the moral dilemma you mention; that is one of the reasons why I haven't yet started using any of these tools myself.

@bagder I'm happy to see actual usefulness, that's great. (As an aside, what code editor or IDE are you productive in?)

Edit: Ah, as of 2009, looks like emacs :) https://daniel.haxx.se/blog/2009/01/08/emacs/

emacs!

I haven't said it here before, but I feel I really should. I've been an avid emacs user since I started to learn it back in 1991 on emacs 18. I worked at IBM with their RS/6000 machines at the time and I learned C on AIX with emacs as my editor. To me there … Continue reading emacs! →

daniel.haxx.se

@bagder
> Fixing several hundred bugs per release is a normal rate for us.

That means nothing other than several hundred bugs written per release at some point in the past, or thousands still waiting their turn. Large code base = exponentially more bugs per line of code.

@bagder very interesting! Thank you for sharing
@bagder @kboyd I want to know how much, in terms of GPU, CPU, RAM, disk, electricity and cooling these analyzers used to generate these reports. And how many results the humans filtered out. And how much that’s going to cost when the VCs demand their 100x returns.