Later on, I realised this technology could also be used to fuzzily compare texts and determine whether they look similar. I tested it in #Diaphora as a means to determine if two pseudo-codes are 'similar' by comparing the 3 fuzzy hashes #DeepToad calculates. It turned out to work much better than the other approaches I tested (and better than expected!), so I finally integrated it into the public version of Diaphora.

#FuzzyHashing
#FuzzyMatching
#ApproximateStringMatching

Today I realised that the oldest technology I developed that is integrated into #Diaphora dates from 2009.

In case you are curious, it's #DeepToad, a Python library for fuzzy hashing. This simplistic library calculates a set of 3 different hashes using a configurable block size (as opposed to, say, ssdeep, which doesn't work for this).
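
To give a rough idea of what block-based fuzzy hashing with a configurable block size looks like, here is a toy sketch. It is NOT DeepToad's algorithm or API: `toy_fuzzy_hash`, `similarity` and the 62-character alphabet are names and choices I made up for illustration, and it produces a single hash rather than 3.

```python
import difflib
import string

# Hypothetical output alphabet (62 symbols); not taken from DeepToad.
ALPHABET = string.ascii_letters + string.digits


def toy_fuzzy_hash(data: bytes, block_size: int = 32) -> str:
    """Map every block of `block_size` bytes to a single character.

    A small local change in the input only affects the character of the
    block it falls into, so similar inputs give similar hashes.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    chars = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        # Very weak per-block digest: byte sum modulo the alphabet size.
        chars.append(ALPHABET[sum(block) % len(ALPHABET)])
    return "".join(chars)


def similarity(hash_a: str, hash_b: str) -> float:
    """Return a 0.0-1.0 similarity ratio between two hash strings."""
    return difflib.SequenceMatcher(None, hash_a, hash_b).ratio()


if __name__ == "__main__":
    original = b"A" * 64 + b"B" * 64
    modified = b"C" + original[1:]  # change only the first byte
    h1 = toy_fuzzy_hash(original, block_size=16)
    h2 = toy_fuzzy_hash(modified, block_size=16)
    print(h1, h2, similarity(h1, h2))
```

A smaller block size gives a longer, more fine-grained hash, while a larger one gives a shorter, coarser hash. That trade-off is the reason a configurable block size is handy for short inputs such as pseudo-code.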

https://github.com/joxeankoret/deeptoad

#FuzzyHashing
#DeepToad
#Diaphora
#BinDiffing
#ProgramDiffing
#BCSA

GitHub - joxeankoret/deeptoad: DeepToad is a library and a tool to clusterize similar files using fuzzy hashing
