Mastodawn

When you visually compare two hashes, how many digits do you check?

The strings hashed below are "retr0id_18b1f814a8e2d9c4fb9c" and "retr0id_1253dea672ebfa240e94", if you want to check for yourself.
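
For anyone who does want to check, here is a minimal Python sketch that hashes the two strings and prints the digests one below the other with a marker under every differing digit. It assumes the strings were hashed as plain ASCII with no trailing newline, which the post does not state explicitly.

```python
import hashlib

# The two inputs quoted in the post (assumed: ASCII, no trailing newline).
inputs = ["retr0id_18b1f814a8e2d9c4fb9c", "retr0id_1253dea672ebfa240e94"]

digests = [hashlib.sha256(s.encode("ascii")).hexdigest() for s in inputs]

# Build a marker line: "^" under each position where the digests differ.
markers = "".join(" " if x == y else "^" for x, y in zip(*digests))
matching = markers.count(" ")

print(digests[0])
print(digests[1])
print(markers)
print(f"{matching} of {len(markers)} hex digits match")
```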

David Buchanan Apr 19, 2023

This partial collision was computed in 10 minutes on a mid-range desktop GPU. There is no weakness in sha256 being exploited, just an application of the Birthday Paradox.
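
As a rough sanity check on the "10 minutes" figure, the birthday bound can be worked out directly. The sketch below uses the standard approximation sqrt(pi/2 * 2^n) for the expected number of hashes before two n-bit values collide, with n = 80 taken from later in this thread. It ignores any overhead of the actual search method, so the implied hash rate is only a ballpark, not a measurement.

```python
import math

def expected_birthday_hashes(bits):
    """Approximate expected number of hashes before two `bits`-bit values collide."""
    return math.sqrt(math.pi / 2 * 2 ** bits)

COLLIDING_BITS = 80        # confirmed further down the thread
SEARCH_SECONDS = 10 * 60   # "10 minutes" from the post

work = expected_birthday_hashes(COLLIDING_BITS)
rate = work / SEARCH_SECONDS

print(f"expected hashes: {work:.3e}")
print(f"implied rate:    {rate:.3e} hashes/s")
```

Under these assumptions the search needs on the order of 10^12 hashes, which is why no weakness in the hash function is required.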

I used the methodology from this paper https://www.cs.csi.cuny.edu/~zhangx/papers/P_2018_LISAT_Weber_Zhang.pdf "Parallel Hash Collision Search by Rho Method with Distinguished Points" - but with a single GPU instead of a university HPC cluster!

By the way, the attack complexity would be the same no matter how long the prefix is. Furthermore, there could be two different prefixes (i.e. "good" vs "evil"), although the attack would get about 50% more expensive on average.
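
A toy version of the same idea fits in a few lines of Python. This sketch is not the method from the paper: it swaps the memory-light rho search with distinguished points for a plain dictionary lookup, and it only matches the first 3 and last 3 hex digits (24 bits) so that it finishes in a moment on a CPU. The prefix `demo_` and the helper names are made up for illustration.

```python
import hashlib
from itertools import count

PREFIX = "demo_"  # hypothetical prefix; any fixed string works
HEAD = 3          # leading hex digits that must match
TAIL = 3          # trailing hex digits that must match

def visible_part(digest_hex, head=HEAD, tail=TAIL):
    """The digits a hurried reader would compare: the start and the end."""
    return digest_hex[:head] + digest_hex[-tail:]

def find_partial_collision(prefix=PREFIX, head=HEAD, tail=TAIL):
    """Birthday search: remember every visible part until one repeats."""
    seen = {}
    for i in count():
        msg = f"{prefix}{i:x}"
        digest = hashlib.sha256(msg.encode()).hexdigest()
        key = visible_part(digest, head, tail)
        if key in seen:
            return seen[key], msg
        seen[key] = msg

a, b = find_partial_collision()
for msg in (a, b):
    print(hashlib.sha256(msg.encode()).hexdigest(), msg)
```

The dictionary grows with every hash tried, which is fine for 24 bits but not for 80. That memory cost is the reason a rho-style search is used at larger sizes.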

Christian Drexler Apr 19, 2023

@retr0id I didn't know that was possible. I usually compare the first and last 6 positions, which wouldn't have helped me with these.

Ange Apr 19, 2023

@retr0id wrong address?

Dougall Apr 19, 2023

@Ange @retr0id Missing "www.": http://www.cs.csi.cuny.edu/~zhangx/papers/P_2018_LISAT_Weber_Zhang.pdf

Ange Apr 19, 2023

@dougall @retr0id thanks!

Wolf480pl Apr 19, 2023

@retr0id
> complexity would be the same no matter how long the prefix is

Even if the prefix is 256 bits long?

David Buchanan Apr 19, 2023

@wolf480pl even if the prefix is 256GB long

Wolf480pl Apr 19, 2023

@retr0id oh you mean prefix of input not prefix of output?

Wolf480pl Apr 19, 2023

@retr0id oh you mean the prefix of the data being hashed, not the colliding prefix of the hash?

David Buchanan Apr 19, 2023

@wolf480pl oh, right, yeah

recursive 🏳️‍🌈 Apr 19, 2023

@retr0id I used to just check the first 3 and last 3 hex digits but I've started to suspect that's a scheme that somebody's going to start to attack soon if not already

(Edit: oh, yeah, I see the image now)

Nicolás Alvarez Apr 19, 2023

@retr0id

David Buchanan Apr 19, 2023

@nicolas17 Did you mean to send that twice lol

Nicolás Alvarez Apr 19, 2023

@retr0id deleted and resent with alt text 👉👈

Eli the Bearded Apr 19, 2023

@retr0id Usually I check, say, five at the start and five at the end. Sometimes I'll go for the whole thing. I realize that birthday-paradox matches should be equally tricky for any N digits, but I figure people rarely aim for only the middle to differ.

AMS Apr 19, 2023

@retr0id First 24b, last 24b, rough shape of the middle, then look closer if they're shaped different. This passed the head and tail, but too many letters in one vs. the other.

Graham Sutherland / Polynomial Apr 19, 2023

@retr0id would probably catch me out if I was tired but the "shape" of the characters in the middle is the biggest giveaway here

Graham Sutherland / Polynomial Apr 19, 2023

@retr0id it'd be cool to see collisions that rely upon common character mutations (d/b, 3/8, 5/6, a/e/c) as a way to make it harder to spot that the middle part is wrong.

David Buchanan Apr 19, 2023

@gsuberland Yeah that's what I'm looking into next, I think I've come up with an efficient way of doing it

genevieve Apr 19, 2023

@retr0id this would totally have gotten me. good point

Yaakov Apr 19, 2023

@retr0id @mattcen not enough 😅

Ignas Kiela Apr 19, 2023

@retr0id I caught it from the front, my mind merged the whole 22590 cluster into one from the first one, and it ended up being 2258 on the other side

I think you could try to generate hashes such that the first few "blocks" that the mind segments the hash into match, though you'd need at least a heuristic for how those blocks would be segmented.

💬 Apr 19, 2023

@retr0id @fincham first 5-ish, last 5-ish, then pick 5-ish from the middle of one and confirm it appears in the middle of the other

robbystk Apr 19, 2023

@retr0id Not enough, evidently.

Ariadne Conill 🐰 Apr 19, 2023

@retr0id @RichiH i don’t. i use sha256sum -c.
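
The point of `sha256sum -c` is that the machine compares every digit, not a person. For situations where only Python is at hand, the sketch below does the equivalent exact comparison for one file. It is not a reimplementation of `sha256sum`, the function names are illustrative, and the temporary file exists only to make the example self-contained.

```python
import hashlib
import os
import tempfile

def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, expected_hex):
    """Exact comparison of all 64 hex digits, ignoring case and whitespace."""
    return sha256_file(path) == expected_hex.strip().lower()

# Demo on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_path = tmp.name

expected = hashlib.sha256(b"abc").hexdigest()
# Flip only the last digit to simulate a near-miss.
tampered_hex = expected[:-1] + ("0" if expected[-1] != "0" else "1")

ok = verify(demo_path, expected)
tampered = verify(demo_path, tampered_hex)
os.remove(demo_path)

print(ok, tampered)  # prints: True False
```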

mirabilos Apr 19, 2023

@retr0id uhm yeah.

The first and last couple, but I also put them under each other so I'd notice a visual disturbance if there's a mismatch in the middle. Sometimes I even use uniq

blackstream #RIPNatenom Apr 19, 2023

@retr0id By arranging the checksums directly below each other, you can spot the differences much more easily:

David Buchanan Apr 19, 2023

@blackstream If you're going to the effort of doing that, why not *actually* compare them?

blackstream #RIPNatenom Apr 19, 2023

@retr0id that would be the best variant indeed.

Dave Holland Apr 19, 2023

@retr0id @blackstream To be fair the original post explicitly asked about visual comparison. But yes this is a good wake-up call that proper comparison is becoming essential.

David Buchanan Apr 19, 2023

@davebiff @blackstream Right, for me at least, I only do visual comparison when there's no convenient alternative - e.g. the hashes are on two different devices, or maybe one is a VM without clipboard integration.

Saagar Jha Apr 19, 2023

@retr0id This is why you can hash the hash,

David Buchanan Apr 19, 2023

@saagar if I know what second-hash you're using I can target that instead :P

Paul Warren Apr 19, 2023

@retr0id So, I was primed by the toot, but after checking the first and last ~6 that I can read in one go, I rely on the sort of shape of the rest. At the 2258 vs. 2259 spot, the shapes differ enough that I checked closer.

But I thought the whole point of sha256sum was to use it computationally, like with sha256 -c or something; it's not meant to be easily human-comparable.

felix (grayscale) 🐺 Apr 19, 2023

@retr0id If the hash is unimportant, I just look at 6-8 digits at the beginning or end, whichever is more convenient. If it's important and passes the sniff test, I check every digit. I know it's extremely unlikely to be a near-collision, and it's tedious to do, but the exercise feels important. And if I keep doing it, I'll get better at it, or find better ways of doing it.

Nicolas SAPA Apr 19, 2023

@retr0id It is easier to check every digit when the hashes are vertically aligned: (command_that_generates_checksum); echo expected_checksum

Didier Raboud Apr 19, 2023

@retr0id when in a hurry, 4-5 first digits and (importantly) 4-5 last digits.

Marc Kleine-Budde Apr 19, 2023

@retr0id Fixed that for you: 😀

Andy Herd Apr 19, 2023

@retr0id always “git diff --no-index” for this exact reason

Ryan Castellucci (they/them) Apr 19, 2023

@retr0id Is that 80 bits colliding?

David Buchanan Apr 19, 2023

@ryanc yup! I will hopefully have a demo of 96 bits colliding by this time tomorrow, depending on how lucky I get

Ryan Castellucci (they/them) Apr 20, 2023

@retr0id We should talk... this reminds me of an old project I tried to do almost a decade ago.

Siguza Apr 19, 2023

@retr0id ever since I've had to do this in bulk, the only way for me now is to line them up one below the other. And if there's more than two, I run them through sort | uniq -c | sort
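
The same bulk check can be sketched in Python with a counter: identical hashes collapse into one entry, and anything with an unexpected count stands out. The function name and the sample values are made up, and lowercasing the input is an addition that the shell pipeline does not do.

```python
from collections import Counter

def tally(hashes):
    """Count identical hashes, least common first, like sort | uniq -c | sort."""
    counts = Counter(h.strip().lower() for h in hashes if h.strip())
    return sorted(counts.items(), key=lambda item: item[1])

# Placeholder values standing in for real digests.
sample = ["aaaa1111", "aaaa1111", "aaaa1111", "aaaa1112"]

for value, n in tally(sample):
    print(f"{n:7d} {value}")
```

If every copy is supposed to be identical, any output with more than one line means at least one hash differs somewhere, even by a single digit.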

barsteward Apr 20, 2023

@retr0id So after checking n bytes at the beginning and end, we need to do the same with a hash of each hash. Let’s see how long that takes to calculate a collision!