I know nobody gives a fuck, but this is my next research topic for this year: Finding #bugs & #vulnerabilities by #diffing binaries against sources. It sounds much harder than it actually is.

#ProgramDiffing #VulnDev #VulnResearch #VulnerabilityDevelopment #VulnerabilityResearch #ReverseEngineering
#Compilers #CompilerOptimizations #CompilersBugs #Miscompilations

The summary: given compilable source code and a binary built from it, find the code added by the compiler that doesn't correspond to anything in the actual source code, also find the code that is in the source code *but* was optimized away by the compiler, and then apply some basic rules to determine what smells like a bug or a vulnerability.
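
A minimal sketch of that idea, assuming the calls made by each function have already been extracted from both the source and the binary (that extraction is the hard part and is not shown). All names here (`diff_calls`, `suspicious`, `SENSITIVE_CALLS`, the sample functions) are hypothetical and for illustration only. The single rule applied is one well-known smell: a `memset` that exists in the source but is gone from the binary, typically because the compiler treated it as a dead store.

```python
from dataclasses import dataclass

# Hypothetical rule list: calls whose disappearance from the binary is worth a look.
SENSITIVE_CALLS = {"memset", "bzero"}


@dataclass
class FunctionDiff:
    name: str
    removed: set  # called in the source, absent from the binary
    added: set    # called in the binary, absent from the source


def diff_calls(source_calls, binary_calls):
    """Compare per-function call sets from the source and from the binary."""
    diffs = []
    for name in sorted(source_calls.keys() | binary_calls.keys()):
        src = set(source_calls.get(name, ()))
        bin_ = set(binary_calls.get(name, ()))
        removed, added = src - bin_, bin_ - src
        if removed or added:
            diffs.append(FunctionDiff(name, removed, added))
    return diffs


def suspicious(diffs, sensitive=SENSITIVE_CALLS):
    """Basic rule: report sensitive calls that the compiler removed."""
    return [(d.name, call) for d in diffs for call in sorted(d.removed & sensitive)]


# Made-up input data standing in for real extraction results.
source_calls = {
    "check_password": ["read_password", "strcmp", "memset"],
    "main": ["check_password", "puts"],
}
binary_calls = {
    "check_password": ["read_password", "strcmp", "__stack_chk_fail"],
    "main": ["check_password", "puts"],
}

diffs = diff_calls(source_calls, binary_calls)
findings = suspicious(diffs)
for func, call in findings:
    print(f"{func}: call to {call} is in the source but not in the binary")
```

Here `__stack_chk_fail` shows up as compiler-added code (stack protector), which is expected noise, while the missing `memset` is the kind of thing a rule would flag. A real tool would need far richer features than call names.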

How Hard Can It Be (TM)?

Optionally, if I have enough time and it proves to be really useful: use #symbolic #execution to determine if #decompiled code corresponds to the original source code. It doesn't look trivial at all, as code written by humans tends to be much more verbose, logical, etc., than code generated by compilers.

In summary: it's hard to compare, say, human-written Abstract Syntax Trees against the #AST given by an optimising #decompiler taking as input code optimised by a #compiler.
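
To illustrate why naive AST comparison struggles, here is a small analogy using Python's own `ast` module as a stand-in for C ASTs. It is not the actual approach, only a sketch: `HUMAN` mimics a loop as a person would write it, `OPTIMISED` mimics what comes back after a compiler has unrolled it, and `node_histogram` and `similarity` are hypothetical helpers.

```python
import ast
from collections import Counter

# The loop as a human would write it.
HUMAN = """
def total(a):
    result = 0
    for i in range(4):
        result += a[i]
    return result
"""

# The same computation as it might look after unrolling and decompilation.
OPTIMISED = """
def total(a):
    return a[0] + a[1] + a[2] + a[3]
"""


def node_histogram(code):
    """Count how many AST nodes of each kind the code contains."""
    tree = ast.parse(code)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def similarity(h1, h2):
    """Multiset Jaccard similarity between two node histograms (0.0 to 1.0)."""
    shared = sum((h1 & h2).values())
    union = sum((h1 | h2).values())
    return shared / union if union else 1.0


human_hist = node_histogram(HUMAN)
optimised_hist = node_histogram(OPTIMISED)

print("node kinds only in the human version:",
      sorted(set(human_hist) - set(optimised_hist)))
print("structural similarity:", round(similarity(human_hist, optimised_hist), 2))
```

Both versions compute the same result, yet the loop, the assignments and the call to `range` exist only in the human-written tree. That gap is why something semantic, such as symbolic execution, looks more promising than comparing tree shapes directly.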

Well, let's see how it goes... Also, I hope the #clang #bindings (probably the #Python ones) got significantly better since the last time I used them (~2018).
@joxean sounds like really cool research!
I wonder what it's going to uncover.
@joxean and you get to have fun with preprocessor differences!
@wirepair I hope I don't have to deal with that using the clang bindings with *compilable* source code.
@joxean Good luck! I'm sure you'll find interesting things.
@joxean sounds like a use-case for your Pigaios tool? :)
@grayfox It is similar, but it will require fully compilable source code to prevent false positives.
@joxean That's a good point! Although I can imagine you could get interesting results from source leaks and partially open-sourced code (Apple), too. Even if it is more work to sort out the FPs.
@grayfox To begin with this project, I will focus on compilable source code. I do have the knowledge to try with partially compilable code but... I will try to make it as simple as possible at the beginning.
@joxean makes absolute sense :D I tend to fall into the trap of reaching directly for complex solutions 😅
@grayfox Hahaha, that's a very easy trap for us to fall into.
Finding Undefined Behavior Bugs by Finding Dead Code – Embedded in Academia

@joxean Do you expect this to find bugs? I guess I’m wondering if you’ve found examples of compilers mucking things up and introducing vulnerabilities regularly
@joxean (As opposed to, say, proprietary closed-source modifications to a mostly open source codebase)
@saagar It's not that I expect it to find bugs; I am writing a tool for a "technique" that has been known to work for as long as I can remember.
@joxean actually, I do care and I find the idea interesting :D I wonder how you're going to automate this. Do let us know when you're publishing!