When you visually compare two hashes, how many digits do you check?

The strings hashed below are "retr0id_18b1f814a8e2d9c4fb9c" and "retr0id_1253dea672ebfa240e94", if you want to check for yourself.
This partial collision was computed in 10 minutes on a mid-range desktop GPU. There is no weakness in sha256 being exploited, just an application of the Birthday Paradox.
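
For anyone who does want to check, a minimal sketch along these lines should do it. It assumes the two strings above are hashed as plain ASCII with no trailing newline; the helper names are made up for illustration.

```python
import hashlib

def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def common_suffix_len(a: str, b: str) -> int:
    """Number of trailing characters that a and b share."""
    return common_prefix_len(a[::-1], b[::-1])

# The two strings quoted in the post (assumed: ASCII, no trailing newline).
inputs = ["retr0id_18b1f814a8e2d9c4fb9c", "retr0id_1253dea672ebfa240e94"]
digests = [hashlib.sha256(s.encode("ascii")).hexdigest() for s in inputs]

for d in digests:
    print(d)
print("matching leading hex digits: ", common_prefix_len(*digests))
print("matching trailing hex digits:", common_suffix_len(*digests))
```

Each matching hex digit is 4 bits, so the two counts added together and multiplied by 4 give the number of colliding bits at the ends.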

I used the methodology from this paper https://www.cs.csi.cuny.edu/~zhangx/papers/P_2018_LISAT_Weber_Zhang.pdf "Parallel Hash Collision Search by Rho Method with Distinguished Points" - but with a single GPU instead of a university HPC cluster!
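
To give a feel for the distinguished-points idea, here is a toy single-threaded sketch. It is not the GPU implementation used for the post: it only collides the first 24 bits of sha256, the `demo_` prefix and all function names are hypothetical, and "distinguished" is arbitrarily defined as "low 8 bits are zero".

```python
import hashlib

N_BYTES = 3      # toy target: collide the first 24 bits of the digest
DP_MASK = 0xFF   # a value is "distinguished" when its low 8 bits are zero

def message(x: int) -> bytes:
    """Hypothetical input format: fixed prefix plus 6 hex digits."""
    return b"demo_" + format(x, "06x").encode()

def f(x: int) -> int:
    """Map a 24-bit value to the first 24 bits of sha256(message(x))."""
    return int.from_bytes(hashlib.sha256(message(x)).digest()[:N_BYTES], "big")

def trail(start: int, max_len: int = 1 << 14):
    """Iterate f from start until a distinguished point; return (point, steps)."""
    x = start
    for steps in range(1, max_len + 1):
        x = f(x)
        if x & DP_MASK == 0:
            return x, steps
    return None  # gave up, e.g. stuck in a cycle with no distinguished point

def locate(s1: int, n1: int, s2: int, n2: int):
    """Two trails end at the same point: walk them to where they merge."""
    while n1 > n2:          # bring both trails to the same distance from the end
        s1, n1 = f(s1), n1 - 1
    while n2 > n1:
        s2, n2 = f(s2), n2 - 1
    if s1 == s2:
        return None         # one start lay on the other trail: no collision here
    for _ in range(n1):
        y1, y2 = f(s1), f(s2)
        if y1 == y2:
            return s1, s2   # different inputs, same truncated hash
        s1, s2 = y1, y2
    return None

def find_collision():
    seen = {}               # distinguished point -> (trail start, trail length)
    start = 0
    while True:
        start += 1
        result = trail(start)
        if result is None:
            continue
        point, steps = result
        if point in seen:
            pair = locate(*seen[point], start, steps)
            if pair is not None:
                return pair
        else:
            seen[point] = (start, steps)

a, b = find_collision()
print(message(a), hashlib.sha256(message(a)).hexdigest())
print(message(b), hashlib.sha256(message(b)).hexdigest())
```

Only the distinguished points are stored, which is what keeps memory small, and independent trails can be handed to separate workers, which is the part that parallelises well.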

By the way, the attack complexity would be the same no matter how long the prefix is. Furthermore, there could be two different prefixes (i.e. "good" vs "evil"), although the attack would get about 50% more expensive on average.
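
One way to see why the prefix length does not matter (my illustration, not something taken from the paper): sha256 consumes its input in blocks and carries a fixed-size internal state, so a long prefix can be absorbed once and the resulting state reused for every candidate. A rough sketch with `hashlib`, where `make_hasher` is a hypothetical helper name:

```python
import hashlib

def make_hasher(prefix: bytes):
    """Absorb the prefix once; return a function that hashes prefix + suffix."""
    base = hashlib.sha256(prefix)   # the only place the prefix length costs anything

    def hash_with_suffix(suffix: bytes) -> str:
        state = base.copy()         # cheap copy of the fixed-size internal state
        state.update(suffix)
        return state.hexdigest()

    return hash_with_suffix

h = make_hasher(b"A" * 1_000_000)   # stand-in for an arbitrarily long prefix
print(h(b"candidate-1"))
print(h(b"candidate-2"))
```

After the one-off cost of hashing the prefix, each candidate costs the same whether the prefix was 8 bytes or many gigabytes.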
@retr0id I didn't know that was possible. I usually compare the first and last 6 positions, which wouldn't have helped me with these.

@retr0id
> complexity would be the same no matter how long the prefix is

Even if the prefix is 256 bits long?

@wolf480pl even if the prefix is 256GB long
@retr0id oh you mean prefix of input not prefix of output?
@retr0id oh you mean the prefix of the data being hashed, not the colliding prefix of the hash?

@retr0id I used to just check the first 3 and last 3 hex digits but I've started to suspect that's a scheme that somebody's going to start to attack soon if not already

(Edit: oh, yeah, I see the image now)

@retr0id Usually I check, say, five at the start and five at the end. Sometimes I'll go for the whole thing. I realize that birthday-paradox matches should be equally tricky for any N digits, but I figure people rarely aim for only the middle being different.
@retr0id First 24b, last 24b, rough shape of the middle, then look closer if they're shaped different. This passed the head and tail, but too many letters in one vs. the other.
@retr0id would probably catch me out if I was tired but the "shape" of the characters in the middle is the biggest giveaway here
@retr0id it'd be cool to see collisions that rely upon common character mutations (d/b, 3/8, 5/6, a/e/c) as a way to make it harder to spot that the middle part is wrong.
@gsuberland Yeah that's what I'm looking into next, I think I've come up with an efficient way of doing it
@retr0id this would totally have gotten me. good point
@[email protected] I caught it from the front, my mind merged the whole 22590 cluster into one from the first one, and it ended up being 2258 on the other side

I think you could try to generate hashes such that the first few "blocks" that the mind segments the hash into match, though you'd need at least a heuristic for how those blocks would be segmented.
@retr0id @fincham first 5-ish, last 5-ish, then pick 5-ish from the middle of one and confirm it appears in the middle of the other
@retr0id Not enough, evidently.

@retr0id uhm yeah.

The first and last couple, but I also put them under each other so I’d notice a visual disturbance if there’s a mismatch in the middle. Sometimes I even use uniq

@retr0id By arranging the checksums directly below each other, you can spot the differences much easier:
@blackstream If you're going to the effort of doing that, why not *actually* compare them?
@retr0id that would be the best variant indeed.
@retr0id @blackstream To be fair the original post explicitly asked about visual comparison. But yes this is a good wake-up call that proper comparison is becoming essential.
@davebiff @blackstream Right, for me at least, I only do visual comparison when there's no convenient alternative - e.g. the hashes are on two different devices, or maybe one is a VM without clipboard integration.
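
For the cases where the file and the expected value do end up on the same machine, a minimal sketch of "actually comparing them" could look like this. The function names are made up, and the expected digest is assumed to be a hex string that may carry stray whitespace or uppercase letters.

```python
import hashlib

def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches(path: str, expected_hex: str) -> bool:
    """Compare every digit of the digest, ignoring case and surrounding whitespace."""
    return sha256_file(path) == expected_hex.strip().lower()
```

Something like `matches("download.iso", expected)` then checks all 64 digits instead of the handful a human would look at; the file name here is only a placeholder.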
@retr0id This is why you can hash the hash.
@saagar if I know what second-hash you're using I can target that instead :P

@retr0id So, I was primed by the toot, but after checking the first and last ~6 that I can read in one go, I rely on the sort of shape of the rest. At the 2258 vs 2259 spot, the shapes differ enough that I checked closer.

But I thought the whole point of sha256sum was to use it computationally, like with sha256 -c or something; it's not meant to be easily human comparable.

@retr0id If the hash is unimportant, I just look at 6-8 digits at the beginning or end, whichever is more convenient. If it's important and passes the sniff test, I check every digit. I know it's extremely unlikely to be a near-collision, and it's tedious to do, but the exercise feels important. And if I keep doing it, I'll get better at it, or find better ways of doing it.
@retr0id It is easier to check every digit when the hashes are vertically aligned. (command_that_generates_checksum); echo expected_checksum
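
The vertical-alignment trick can be taken one step further by marking the columns that differ. A small sketch, with hypothetical helper names:

```python
def mismatch_markers(a: str, b: str) -> str:
    """Return a line with '^' under every column where a and b differ."""
    width = max(len(a), len(b))
    a, b = a.ljust(width), b.ljust(width)   # a length difference also shows up as '^'
    return "".join(" " if x == y else "^" for x, y in zip(a, b))

def show_aligned(a: str, b: str) -> str:
    """Stack two hashes vertically with a marker line underneath."""
    a, b = a.strip().lower(), b.strip().lower()
    return "\n".join([a, b, mismatch_markers(a, b)])

print(show_aligned("deadbeef2258cafe", "deadbeef2259cafe"))
```

The two strings in the last line are invented for the example, not real digests.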
@retr0id when in a hurry, 4-5 first digits and (importantly) 4-5 last digits.
@retr0id always “git diff --no-index” for this exact reason
@retr0id Is that 80 bits colliding?
@ryanc yup! I will hopefully have a demo of 96 bits colliding by this time tomorrow, depending on how lucky I get
@retr0id We should talk... this reminds me of an old project I tried to do almost a decade ago.
@retr0id ever since I've had to do this in bulk, the only way for me now is to line them up one below the other. And if there's more than two, I run them through sort | uniq -c | sort
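
For bulk checks, a rough Python counterpart to the `sort | uniq -c | sort` pipeline might look like the sketch below. The name `tally` is made up, and unlike the shell pipeline it lists the most frequent value first.

```python
from collections import Counter

def tally(lines):
    """Count identical hashes, ignoring case, surrounding whitespace and blank lines."""
    counts = Counter(line.strip().lower() for line in lines if line.strip())
    return counts.most_common()   # most frequent value first

for value, count in tally(["aa11", "AA11 ", "bb22", ""]):
    print(count, value)
```

If every copy is supposed to be identical, anything other than a single output line means at least one hash differs.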
@retr0id So after checking n bytes at the beginning and end, we need to do the same with a hash of each hash. Let’s see how long that takes to calculate a collision!