Later on, I realised this technology could also be used to fuzzily compare texts and determine whether they look similar. I tested it in #Diaphora as a means of deciding whether two pseudo-codes are 'similar' by comparing the 3 fuzzy hashes #DeepToad calculates. It turned out to work much better than the other approaches I tested (and better than expected!), so I finally integrated it into the public version of Diaphora.
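
As a rough illustration of comparing two sets of fuzzy hashes, here is a minimal Python sketch. The function name `hashes_look_similar` and the `min_matches` parameter are hypothetical, and this is not how Diaphora actually combines the three hashes; it only assumes each pseudo-code has already been reduced to the same number of signature strings.

```python
from typing import Sequence


def hashes_look_similar(hashes_a: Sequence[str],
                        hashes_b: Sequence[str],
                        min_matches: int = 1) -> bool:
    """Return True if at least `min_matches` signatures are equal position by position.

    Assumption: both sequences hold the same kinds of hashes in the same order.
    """
    if len(hashes_a) != len(hashes_b):
        raise ValueError("both inputs must contain the same number of hashes")
    # Count the positions where the two signatures are identical.
    matches = sum(1 for a, b in zip(hashes_a, hashes_b) if a == b)
    return matches >= min_matches
```

Raising `min_matches` makes the check stricter, trading missed matches for fewer false positives.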

#FuzzyHashing
#FuzzyMatching
#ApproximateStringMatching

I developed this technology (#DeepToad) back in the day, in my spare time, to help me cluster malware samples when I was doing a lot of malware analysis, both for an antivirus company and for myself as a hobby.

Basically, I used this tool to create clusters of similar-looking samples (at the binary level), so instead of having to analyse, say, 100 malware samples, I only had to analyse one sample per cluster.
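
The clustering idea can be sketched in a few lines of Python. This is not DeepToad's own clustering code: the function `cluster_by_signature`, the 0.8 threshold and the use of `difflib` to score signature similarity are all assumptions made for illustration, and the input is simply a mapping from sample name to an already computed signature string.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def cluster_by_signature(signatures: Dict[str, str],
                         threshold: float = 0.8) -> List[List[str]]:
    """Greedily group sample names whose signatures look alike.

    Each sample joins the first cluster whose representative signature
    is at least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters: List[Tuple[str, List[str]]] = []  # (representative signature, member names)
    for name, sig in signatures.items():
        for rep, members in clusters:
            if SequenceMatcher(None, rep, sig).ratio() >= threshold:
                members.append(name)
                break
        else:
            # No existing cluster was close enough: this sample becomes a representative.
            clusters.append((sig, [name]))
    return [members for _, members in clusters]
```

With clusters like these, only one sample per group (for example the first member) needs manual analysis.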

#MalwareClusterization
#MalwareAnalysis
#ReverseEngineering

Today I realised that the oldest technology of mine integrated into #Diaphora dates from 2009.

In case you are curious, it's #DeepToad, a Python library for fuzzy hashing. This simplistic library calculates a set of 3 different hashes using a configurable block size (as opposed to, say, ssdeep, which doesn't work for this).
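
To give a feel for what a block-based fuzzy hash with a configurable block size looks like, here is a toy Python sketch. It is not DeepToad's algorithm and computes a single signature rather than three; the function `toy_fuzzy_hash`, the alphabet and the use of MD5 per block are all assumptions chosen for illustration.

```python
import hashlib

# 64-character alphabet used to encode one character per block.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def toy_fuzzy_hash(data: bytes, block_size: int = 64) -> str:
    """Emit one signature character per fixed-size block of `data`."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    chars = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Reduce the block digest to a single alphabet character.
        digest = hashlib.md5(block).digest()
        chars.append(ALPHABET[digest[0] % len(ALPHABET)])
    return "".join(chars)
```

Because each character depends on only one block, overwriting a byte in place changes at most one character of the signature, so similar inputs yield similar signatures. A smaller `block_size` gives a longer, more fine-grained signature. Note that this toy version is not robust to inserted or deleted bytes, which shift every later block.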

https://github.com/joxeankoret/deeptoad

#FuzzyHashing
#DeepToad
#Diaphora
#BinDiffing
#ProgramDiffing
#BCSA

GitHub - joxeankoret/deeptoad: DeepToad is a library and a tool to clusterize similar files using fuzzy hashing
