📚 “Comparing binaries with radiff2” - a video tutorial by Mohamed Atta Abozaid (Egypt)
👀 video https://youtu.be/RsI8hNhsi_U
👉source https://github.com/ReEng101/Binary-Comparison
This paper looks promising: "SIGMADIFF: Semantics-Aware Deep Graph Matching for Pseudocode Diffing".
https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=9671&context=sis_research
The code is already published (on GitHub), and #Diaphora can now use an already trained model to try to improve binary diffing results (matching). I haven't made a new release yet, as these changes are still considered a bit experimental.
The datasets and tools for training and testing are here: https://github.com/joxeankoret/diaphora-ml
And Diaphora is here: https://github.com/joxeankoret/diaphora
#Diaphora #BinaryDiffing #Bindiffing #ReverseEngineering #MachineLearning
Here are the slides of my "Simple Machine Learning Techniques for Binary Diffing (with Diaphora)" talk given at the @44CON conference last week:
https://github.com/joxeankoret/diaphora-ml/blob/main/docs/diaphora-ml-techniques-44con-final.pdf
#44con #Diaphora #MachineLearning #ReverseEngineering #BinaryDiffing
This is not my own idea at all; it is, basically, the only thing academia researches today: almost every academic paper about binary diffing published in recent years (or, as academia calls it, "Binary Code Similarity Analysis") is based on "machine learning" techniques.
Some popular academic examples: DeepBinDiff or BindiffNN. Don't worry if you don't know them. Nobody uses them. At all.
Any cool bug in Microsoft's February 2024 Patch Tuesday??
It's very sad, but reading academic research about binary diffing (or, as academia calls it, binary code similarity analysis) is always a damn waste of time. It's either all fairy tales that cannot be proved or plainly false and/or wrong.
An example? One paper that I re-read today says that #BinDiff and #Diaphora are mono-architecture and, on that basis, totally discards these tools for the paper. LOL.
Fun Reverse Engineering problem du jour. A compilation unit is a set of functions. Cool. However, a function might belong to one or many compilation units.
For example, in #Diaphora, I used to think that once I have a compilation unit name for a function, that function belongs to just that one CU. However, if a function from, for example, a header file is inlined into another function, which compilation unit does that function belong to?
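One way to picture the problem is a minimal sketch that maps each function to a set of compilation units instead of a single one. Everything here is an assumption for illustration: `cu_map`, `add_function`, and the file and function names are hypothetical, and this is not Diaphora's actual data model.

```python
from collections import defaultdict

# Hypothetical model, not Diaphora's real schema: each function name maps
# to the SET of compilation units in which its code appears.
cu_map = defaultdict(set)

def add_function(func_name, cu_name):
    """Record that func_name's code was found in compilation unit cu_name."""
    cu_map[func_name].add(cu_name)

def compilation_units(func_name):
    """Return the sorted list of CUs a function belongs to (empty if unknown)."""
    return sorted(cu_map.get(func_name, set()))

def is_shared(func_name):
    """True when the function appears in more than one compilation unit."""
    return len(cu_map.get(func_name, set())) > 1

# A regular function defined in exactly one source file.
add_function("parse_packet", "net.c")
# A helper from a header file, inlined into callers in two source files.
add_function("read_u16", "net.c")
add_function("read_u16", "crypto.c")

print(compilation_units("parse_packet"))  # ['net.c']
print(compilation_units("read_u16"))      # ['crypto.c', 'net.c']
print(is_shared("read_u16"))              # True
```

With a one-to-many mapping like this, the inlined header function no longer forces a choice of a single CU; whether that is the right answer for matching purposes is a separate question.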
The support for finding fixed signedness issues in #Diaphora is working (to highlight potentially fixed vulnerabilities):
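To illustrate the general idea of a fixed signedness issue, here is a rough sketch, assuming the patch simply swaps a signed x86 conditional jump for its unsigned counterpart in an otherwise identical function. The function name `find_signedness_fixes` and the input format (plain lists of mnemonics) are hypothetical; this is not how Diaphora itself implements the check.

```python
# Signed x86 conditional jumps and the unsigned jumps testing the same relation.
SIGNED_TO_UNSIGNED = {
    "jl": "jb",    # less          -> below
    "jle": "jbe",  # less or equal -> below or equal
    "jg": "ja",    # greater       -> above
    "jge": "jae",  # greater/equal -> above or equal
}

def find_signedness_fixes(old_mnemonics, new_mnemonics):
    """Return (index, old, new) for each signed jump that became unsigned.

    Both arguments are lists of instruction mnemonics for the same matched
    function, before and after the patch. Assumption: the sequences have the
    same length and are compared position by position; otherwise the result
    is an empty list.
    """
    if len(old_mnemonics) != len(new_mnemonics):
        return []
    fixes = []
    for i, (old, new) in enumerate(zip(old_mnemonics, new_mnemonics)):
        if SIGNED_TO_UNSIGNED.get(old.lower()) == new.lower():
            fixes.append((i, old, new))
    return fixes

old = ["mov", "cmp", "jg", "call", "ret"]
new = ["mov", "cmp", "ja", "call", "ret"]
print(find_signedness_fixes(old, new))  # [(2, 'jg', 'ja')]
```

A signed comparison replaced by an unsigned one in a bounds check is a classic sign of a patched integer signedness bug, which is why such a change is worth highlighting when diffing a patch.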