I know nobody gives a fuck, but this is my next research topic for this year: Finding #bugs & #vulnerabilities by #diffing binaries against sources. It sounds much harder than it actually is.

#ProgramDiffing #VulnDev #VulnResearch #VulnerabilityDevelopment #VulnerabilityResearch #ReverseEngineering
#Compilers #CompilerOptimizations #CompilersBugs #Miscompilations

The summary: given compilable source code and the binary built from it, find the code the compiler added that doesn't correspond to anything in the source, find the code that is in the source *but* was optimized away by the compiler, and then apply some basic rules to determine what smells like a bug or a vulnerability.
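A very rough sketch of that idea, under big assumptions: something else (say, the clang bindings on one side and a disassembler on the other) has already produced, for each function, the set of callee names seen in the source and in the binary. The names `diff_calls`, `flag_suspicious` and `SENSITIVE_CALLS`, and the sample data, are all made up for illustration; a real tool would compare far more than call names.

```python
# Hypothetical rule set: calls whose disappearance deserves a closer look,
# e.g. a memset() that wipes a secret and gets removed as a dead store.
SENSITIVE_CALLS = {"memset", "bzero"}


def diff_calls(source_calls, binary_calls):
    """Compare per-function callee sets from the source and from the binary.

    Both arguments map a function name to the set of functions it calls.
    Returns only the functions where the two sides disagree.
    """
    report = {}
    for func in sorted(set(source_calls) | set(binary_calls)):
        in_source = source_calls.get(func, set())
        in_binary = binary_calls.get(func, set())
        removed = in_source - in_binary  # in the source, gone from the binary
        added = in_binary - in_source    # in the binary, not in the source
        if removed or added:
            report[func] = {"removed": removed, "added": added}
    return report


def flag_suspicious(report):
    """Apply one basic rule: a sensitive call present in the source but
    missing from the binary smells like a compiler-introduced bug."""
    return [
        (func, callee)
        for func, delta in sorted(report.items())
        for callee in sorted(delta["removed"] & SENSITIVE_CALLS)
    ]


# Made-up example data, not taken from any real program.
source_calls = {
    "login": {"read_password", "check_password", "memset"},
    "main": {"login"},
}
binary_calls = {
    "login": {"read_password", "check_password", "__stack_chk_fail"},
    "main": {"login"},
}

report = diff_calls(source_calls, binary_calls)
print(report)
print(flag_suspicious(report))
```

In this toy case the added `__stack_chk_fail` is just the stack protector doing its job, while the vanished `memset` is the kind of thing the "basic rules" are meant to catch.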

How Hard Can It Be (TM)?

Optionally, if I have enough time and it proves to be really useful: use #symbolic #execution to determine whether #decompiled code corresponds to the original source code. It doesn't look trivial at all, as code written by humans tends to be much more verbose, logical, etc., than code generated by compilers.

In summary: it's hard to compare, say, human-written Abstract Syntax Trees against the #AST produced by an optimising #decompiler taking as input code optimised by a #compiler.

@joxean and you get to have fun with preprocessor differences!
@wirepair I hope I won't have to deal with that, since I'm using the clang bindings with *compilable* source code.
Silent Bugs Matter: A Study of Compiler-Introduced Security Bugs | USENIX

@freddy @wirepair yes! They cover a part of what I want to do.