| Github | https://github.com/angea |
| Github | https://github.com/corkami |
| Pronouns | he/him |
An AI told me today that we’re in 2024, for no obvious reason: I had just asked it to look for past occurrences of specific events.
I didn’t expect it to hallucinate on such a small and yet obvious fact.
Have you ever encountered such a trivially wrong hallucination?
Oh dear, the entire https://www.lyonlabs.org site is offline *and* excluded from archive.org.
It's a massive archive of vintage and modern GEOS and C64 material, a lot of it seemingly not found elsewhere.
To check if a file starts with MZ or GIF, just use file/libmagic.
You don't need AI or Magika for that.
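To show how little is needed for that kind of check, here's a minimal sketch in Python that only compares the first bytes of a buffer against the MZ and GIF signatures. The function name `sniff` and the returned labels are made up for illustration; file/libmagic does far more than this.

```python
def sniff(data: bytes) -> str:
    """Classify a buffer by its leading magic bytes (illustrative only)."""
    # DOS/PE executables start with the two ASCII bytes "MZ".
    if data.startswith(b"MZ"):
        return "MZ executable"
    # GIF files start with "GIF87a" or "GIF89a".
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "GIF image"
    return "unknown"


if __name__ == "__main__":
    print(sniff(b"MZ\x90\x00"))
    print(sniff(b"GIF89a\x01\x00"))
```

A prefix check like this says nothing about whether the rest of the file is valid, which is exactly why it's easy to fool.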
TrID has a lot of heuristics, but a lot of false positives.
Magika is useful in different ways, across binary and source types, and is quite fast. But it's not useful against weird or adversarial files.
Magika is a fast file type identifier that covers many file types, both binary formats and source texts.
It's not made to detect adversarial attacks.
It's useful for different things that classic binary scanning can't do at this speed.
Magika was trained on all the file types with enough available samples.
Weird files are out of scope of Magika. It just wasn't trained on them.
It's trivial to inject some data in a file and keep it functional (w/ my tool Mitra, for example).
So take a JPG, inject a lot of JavaScript data, and... guess what?
Check it out: https://github.com/corkami/mitra
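As a rough illustration of the idea (not how Mitra itself works), here's a sketch that hides arbitrary bytes inside a JPEG comment (COM, `FF FE`) segment, which decoders skip. The function name `inject_jpeg_comment` is hypothetical, and the code assumes the input starts with a normal SOI marker, optionally followed by a JFIF APP0 segment that is kept in first position.

```python
import struct


def inject_jpeg_comment(jpeg: bytes, payload: bytes) -> bytes:
    """Return a copy of `jpeg` with `payload` stored in a COM segment."""
    # Every JPEG starts with the SOI marker FF D8.
    if not jpeg.startswith(b"\xff\xd8"):
        raise ValueError("not a JPEG: missing SOI marker")
    # The 16-bit segment length counts its own two bytes.
    if len(payload) > 0xFFFF - 2:
        raise ValueError("payload too large for a single COM segment")

    pos = 2
    # If a JFIF APP0 segment (FF E0) follows SOI, insert after it.
    if jpeg[pos:pos + 2] == b"\xff\xe0" and len(jpeg) >= pos + 4:
        (seg_len,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        pos += 2 + seg_len

    # COM marker, big-endian length, then the injected data.
    segment = b"\xff\xfe" + struct.pack(">H", len(payload) + 2) + payload
    return jpeg[:pos] + segment + jpeg[pos:]
```

A payload bigger than one segment would need to be split across several COM segments, which this sketch doesn't do.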
Of course, it's possible to create weird files that will fool Magika and other tools.
Polymocks, polyglots...
I made quite a few - check my CCC talk from last year:
https://speakerdeck.com/ange/fearsome-file-formats-18374bc4-b3f2-429f-862e-2177ab4d7aae