So, I recently saw some quiet discussion about a paper where researchers reverse-engineered and disclosed some attacks against PhotoDNA, the very-super-duper-secret algorithm used by tech megacorps to scan for illegal images.

They didn't make any code public, and so... I did: https://github.com/ArcaneNibble/open-alleged-photodna

It's a _complete_ reverse-engineering of the algorithm from publicly-leaked binaries, with a commented Python reimplementation.

This means that studying the algorithm and any potential flaws is now much more accessible.

This took only about two days (once I knew that there even _was_ a leaked binary to compare against), which just goes to show, yet again, that security through obscurity never works.


Adding on to this thread, I now have a harness which can load the leaked PhotoDNA DLL *on Linux and macOS*:

https://github.com/ArcaneNibble/open-alleged-photodna/blob/main/binary-harness.py

It also checks intermediate computations against my re-implementation, so that it's possible to further prove/validate that Alleged-PhotoDNA produces the same results as the binary.
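For anyone wondering what "checking intermediate computations" can look like in practice, here is a minimal sketch, not taken from the repository. It assumes each implementation can dump its per-stage values into a dict; the helper `first_mismatch`, the stage names, and the numbers are all hypothetical.

```python
import math


def first_mismatch(reference_stages, candidate_stages, tol=0.0):
    """Return (stage_name, index) of the first differing value, or None.

    Both arguments map a stage name to a flat list of numbers.
    Stages are checked in the order the reference dict lists them.
    """
    for name, ref_values in reference_stages.items():
        cand_values = candidate_stages[name]
        if len(ref_values) != len(cand_values):
            # Report the first index at which one list has run out.
            return (name, min(len(ref_values), len(cand_values)))
        for i, (r, c) in enumerate(zip(ref_values, cand_values)):
            # tol=0.0 means exact equality; raise it for float stages.
            if not math.isclose(r, c, rel_tol=0.0, abs_tol=tol):
                return (name, i)
    return None


# Made-up stage names and numbers, purely for illustration.
binary_dump = {"stage_a": [1.0, 2.0, 3.0], "stage_b": [10, 20]}
python_dump = {"stage_a": [1.0, 2.0, 3.0], "stage_b": [10, 21]}

print(first_mismatch(binary_dump, python_dump))  # ('stage_b', 1)
```

Comparing stage by stage, rather than only the final hash, narrows a disagreement down to the first step where the two implementations diverge.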

This requires quite a bit of knowledge about platforms and ABIs and similar nonsense in order to understand, but it also just goes to show _how_ something such as Wine could possibly work.
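As a rough illustration of the "platforms" part: a loader for a Windows DLL has to start by reading the PE headers, before it can map sections, apply relocations, resolve imports, and call into the code using the Windows calling convention. The sketch below only covers that first step, is not taken from `binary-harness.py`, and runs against a synthetic header built in memory. `parse_pe_header` and `make_fake_header` are hypothetical names.

```python
import struct

# Two common values of the COFF "Machine" field.
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x86-64"}


def parse_pe_header(data):
    """Read the DOS stub pointer and the COFF file header of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew: offset of the PE signature, stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the signature directly.
    machine, num_sections, timestamp, _, _, opt_size, _ = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4
    )
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "num_sections": num_sections,
        "timestamp": timestamp,
        "optional_header_size": opt_size,
    }


def make_fake_header():
    """Build a tiny synthetic header so the example runs without a real DLL."""
    dos = bytearray(64)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 64)  # PE signature right after the DOS header
    coff = struct.pack("<HHIIIHH", 0x8664, 3, 0, 0, 0, 240, 0x2022)
    return bytes(dos) + b"PE\0\0" + coff


print(parse_pe_header(make_fake_header()))
```

Everything after this point (mapping, relocations, imports, calling conventions) is where most of the ABI knowledge is needed, and is deliberately left out here.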
@r crimes against python /lh
@ww come arrest me then
@r i'm not the python police! i think your crimes are cool :)
@ww heh, that was the goal (to make people realize that such things are even _possible_! slowly creating more "ABI and platforms and systems programming" wizards)
@r it's impressive! idk how much you have to learn to be able to do that, but it's a lot fewer lines of python than i'd expect to be required for that kinda thing

i tried downloading photodna using the bat file linked in the hackerfactor blog article linked in your readme, apparently it's the same dll as yours, but signed an hour later. it's weird there's not just one, but two leaked copies of the same version of the library!

https://www.virustotal.com/gui/file/b91f77124065ae7d7c3cbd382d7cf8ab8283af4a942aff3fd9fdacd55af08091/details
https://www.virustotal.com/gui/file/90b8043030793cd3948ab2c0561511276fec19f6b6d2acacd9548e89f7a48ed6/details
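The long hex strings in those two VirusTotal URLs are SHA-256 digests, so a local copy can be matched against them without uploading anything. A minimal sketch, assuming you already have a file on disk; `sha256_of_file` and `matches_known_copy` are hypothetical helper names.

```python
import hashlib

# SHA-256 digests copied from the two VirusTotal URLs above.
KNOWN_COPIES = {
    "b91f77124065ae7d7c3cbd382d7cf8ab8283af4a942aff3fd9fdacd55af08091",
    "90b8043030793cd3948ab2c0561511276fec19f6b6d2acacd9548e89f7a48ed6",
}


def sha256_of_file(path, chunk_size=1 << 16):
    """Hash a file in chunks so large files don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_known_copy(path):
    """True if the file's digest is one of the two listed above."""
    return sha256_of_file(path) in KNOWN_COPIES
```

Two files that differ only in their signature will still hash differently, which is why both digests are listed.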

@ww this is a really simple case because the DLL doesn't depend on much system functionality

tavis ormandy has a much more complete implementation that was being used to do things like fuzz windows defender at scale

and yeah, i spent much of my teens studying how other teens managed to cheat at MapleStory, which was a good way to learn a lot of this