What if you could make your fuzzer ask an LLM about the correct structure and order of protocol messages as specified in hundreds of pages of RFC?

🎉 Accepted @ NDSS'24
📝 https://mpi-softsec.github.io/papers/NDSS24-chatafl.pdf
🧑‍💻 https://github.com/ChatAFLndss/ChatAFL

Led by Ruijie Meng w/ Martin Mirchev and Abhik Roychoudhury

@mboehme

Correct?

But LLMs are random!

Correctness still needs to be checked conventionally.

"LLMs prove to be effective in
enriching initial seeds" - that sounds pretty different to me. Finding potential faults is necessary, but it doesn't give a correct structure.

@AdeptVeritatis Oh, with "prove to be effective" we just mean "our experiments demonstrate". We just use the capability of an LLM to translate specific aspects of human-written RFCs into machine-readable format. There are self-consistency checks to reduce hallucination, and whatever hallucination remains might even help the fuzzing process (by slightly corrupting messages).
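To make the self-consistency idea concrete, here is a minimal sketch (not the actual ChatAFL implementation): query the model several times for the fields of a message type, then keep only the fields that a majority of the responses agree on. The field names and the RTSP-like example are hypothetical.

```python
# Hypothetical self-consistency voting over repeated LLM grammar extractions.
# Fields that only appear in a minority of responses are treated as likely
# hallucinations and dropped.
from collections import Counter

def self_consistent_fields(responses, threshold=0.5):
    """Keep protocol fields named in more than `threshold` of the responses.

    `responses` is a list of field-name lists, one per LLM query.
    """
    n = len(responses)
    counts = Counter(field for fields in responses for field in set(fields))
    return {field for field, c in counts.items() if c / n > threshold}

# Three imaginary extractions for an RTSP-like request message:
responses = [
    ["method", "uri", "version", "CSeq"],
    ["method", "uri", "version", "CSeq", "Transport"],  # extra, likely hallucinated
    ["method", "uri", "CSeq"],
]
print(sorted(self_consistent_fields(responses)))
# → ['CSeq', 'method', 'uri', 'version']  ("Transport" is voted out)
```

The same voting trick applies to message ordering: ask for the valid sequence of message types several times and keep the transitions the answers agree on.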

@mboehme

With this year's models.

But those models will be (economically) worthless next year, so they will be replaced with new models. And these new models will already contain today's hallucinations.

And then they are baked in, and you can never do anything about it, because the hallucinations have become part of the inner reality of these models. AND YOU ARE NOT IN CONTROL OF THE MODELS.