Let me explain the idea and the problem.
The Idea: Put a function-signature service built on #Diaphora technologies online, or offer a SQLite database file that can be easily downloaded for offline use.
The Problem: GitHub doesn't allow files bigger than 100 MB. The initial signatures database I have is 1.2 GB, and it will grow a bit (I don't think it will ever exceed 10 GB, even after many years).
With that explained, any recommendations on how to solve this problem with my resources?
What is a *cheap* way to host an online service that requires a database that will stay under ~10 GB for a number of years? If it's free, even better.
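One possible workaround for the per-file limit, as a rough sketch only: split the database file into parts below 100 MB and join them again after download. The function names, the 95 MiB part size, and the `.000`, `.001`, ... naming are assumptions made for illustration; none of this is part of Diaphora.

```python
# Hypothetical helper, not part of Diaphora: split a large file into parts
# that each stay below a size limit, and join them back after download.

CHUNK_SIZE = 95 * 1024 * 1024  # assumed part size, kept under the 100 MB limit


def split_file(path, chunk_size=CHUNK_SIZE):
    """Split `path` into numbered parts (path.000, path.001, ...).

    Returns the list of part file names in order.
    """
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            data = src.read(chunk_size)
            if not data:
                break
            part_name = f"{path}.{index:03d}"
            with open(part_name, "wb") as dst:
                dst.write(data)
            parts.append(part_name)
            index += 1
    return parts


def join_files(parts, output_path):
    """Concatenate the parts, in the given order, into `output_path`."""
    with open(output_path, "wb") as dst:
        for part_name in parts:
            with open(part_name, "rb") as src:
                while True:
                    block = src.read(1024 * 1024)  # copy in 1 MiB blocks
                    if not block:
                        break
                    dst.write(block)
```

With a 1.2 GB file and this part size, that would give about 13 parts. It only addresses the per-file limit for the offline download, not the hosting of an online service.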
The code is also already published (on GitHub), and #Diaphora can now use an already trained model to try to improve binary diffing results (matching). I haven't made a new release yet, as these changes are considered a bit experimental for now.
The datasets and tools for training and testing are here: https://github.com/joxeankoret/diaphora-ml
And Diaphora is here: https://github.com/joxeankoret/diaphora
#Diaphora #BinaryDiffing #Bindiffing #ReverseEngineering #MachineLearning
Here are the slides of my "Simple Machine Learning Techniques for Binary Diffing (with Diaphora)" talk given at the @44CON conference last week:
https://github.com/joxeankoret/diaphora-ml/blob/main/docs/diaphora-ml-techniques-44con-final.pdf
#44con #Diaphora #MachineLearning #ReverseEngineering #BinaryDiffing