I accidentally found a security issue while benchmarking postgres changes.

If you run debian testing, unstable or some other more "bleeding edge" distribution, I strongly recommend upgrading ASAP.

https://www.openwall.com/lists/oss-security/2024/03/29/4

oss-security - backdoor in upstream xz/liblzma leading to ssh server compromise

I was doing some micro-benchmarking at the time, needed to quiesce the system to reduce noise. Saw sshd processes were using a surprising amount of CPU, despite immediately failing because of wrong usernames etc. Profiled sshd, showing lots of cpu time in liblzma, with perf unable to attribute it to a symbol. Got suspicious. Recalled that I had seen an odd valgrind complaint in automated testing of postgres, a few weeks earlier, after package updates.

Really required a lot of coincidences.

@AndresFreundTec OMG, I read that report earlier today and completely missed it's from you.

Something tells me this is not the only backdoor that person injected into seemingly boring packages ... fun fun fun.

Christoph Petrausch (@hikhvar@norden.social)

@xahteiwi@mastodon.social Somewhere in a secret service: "Who is this Andres, and why the fuck does he have to do microbenchmarks right now?"

@hikhvar @AndresFreundTec is anyone in touch with the original maintainer?

Given they were put under pressure before already I don't want to imagine their state of mind after the flurry of news in the past few hours.

@mainec
https://tukaani.org/xz-backdoor/
They have published this website.
They have also listed contact info here:
https://tukaani.org/contact.html
Hope they are doing fine ngl.

@AndresFreundTec Sure glad you did. This would have been unimaginably worse if it'd gone undetected for another six months.
@AndresFreundTec I feel both confident and also kind of queasy when saying this: it seems extremely likely that this is not the first time something like this has happened, it's just the first time we have been lucky enough to notice.
@glyph @AndresFreundTec That is true.

Binary artifacts have no business existing in Free Software (or near-binary considering how auditable pre-generated config scripts end-up being). The way it was compromised in this case is almost certain to have happened before and reminds me of the SourceForge malware debacle (so arguably that's another famous example of it happening before).

I'm not sure if many other projects do like Guix and record the checksum of the whole repository so as to ensure reproducibility purely from source.
@lispi314 @AndresFreundTec In general this is reasonable, but there are some clear exceptions for test vectors in cryptographic libraries and compression libraries (which this was).
@glyph @AndresFreundTec In this case the actual malicious vector was the near-binary injected code in the practical binary of the unaudited autotools vomit (always autoreconf) which was then bundled in the actual binary artifact that was the compromised tarballs.

None should have ever been part of the project.

As for the test files, I still think that having a hex dump with comments explaining what flaws particular parts test would be desirable in a lot of cases.

@lispi314
#guix still takes tarballs with `configure` scripts "precompiled":
https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/compression.scm?id=4b23fd7adbddc1bc18b209912c0f3ef369da2f24#n499

Same for #nix, they take the distribution tarball with autoreconf-generated files.

Not trusting distribution tarballs is apparently not as easy as pointing to the git repo (which is now down) and running autoreconf, because autoreconf itself would then have to be built before xz during bootstrapping.

@AndresFreundTec @glyph

@link2xt @glyph @AndresFreundTec What you're pointing at is a maintainer screw-up in not using best practices.

Why aren't best practices made the only possible practice? Because it would make it impossible to support packages whose source-code is never distributed in any other form. So unfortunately for that the best that can be done is improve maintainer knowledge and documentation.

> I'm not sure if many other projects do like Guix and record the checksum of the whole repository so as to ensure reproducibility purely from source.

If the packager chooses to use the official tarball as "the source", validating the checksum would not have helped. :-( Also whether it's always possible to run autoreconf depends on the content of the tarball.

Which brings me to the (preliminary) conclusion that we'd better use repos as source of trust
@lispi314 @AndresFreundTec @glyph

@kirschwipfel @glyph @AndresFreundTec > If the packager chooses to use the official tarball as "the source", validating the checksum would not have helped. :-(

Unfortunately, yeah.

> Also whether it's always possible to run autoreconf depends on the content of the tarball.

Of course if it isn't a C project then it probably isn't. If it is such a project, then one should have such tooling installed.

> Which brings me to the (preliminary) conclusion that we'd better use repos as source of trust

That is more sensible generally, as the history of an object and its belonging to a project is a reified (and verifiable) relationship under code versioning systems, unlike arbitrary buckets of files.
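
(Editor's sketch, not from any poster: one way to make the tarball-versus-repository gap discussed above visible is to list files that a release tarball ships but a checkout of the repository does not contain, or contains with different bytes. The function name, paths, and the assumption that the tarball has a single top-level directory are all hypothetical; pre-generated files such as `configure` will legitimately show up, so the output is a list to review, not a verdict.)

```python
import hashlib
import tarfile
from pathlib import Path


def sha256_bytes(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def compare_tarball_to_checkout(tarball: str, checkout: str):
    """Return (only_in_tarball, differing) as sorted lists of relative paths.

    Assumption: the tarball wraps everything in one top-level directory
    (e.g. "pkg-1.0/"), which is stripped before comparing with the checkout.
    """
    root = Path(checkout)
    only_in_tarball, differing = [], []
    with tarfile.open(tarball) as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, symlinks, devices
            parts = Path(member.name).parts[1:]  # drop "pkg-1.0/"
            if not parts:
                continue
            rel = Path(*parts)
            local = root / rel
            if not local.is_file():
                # Shipped in the release, absent from version control.
                only_in_tarball.append(rel.as_posix())
                continue
            data = tar.extractfile(member).read()
            if sha256_bytes(data) != sha256_bytes(local.read_bytes()):
                differing.append(rel.as_posix())
    return sorted(only_in_tarball), sorted(differing)
```

Calling `compare_tarball_to_checkout("pkg-1.0.tar.gz", "path/to/checkout")` with a checkout of the matching tag gives two lists to read through by hand; it only reads files and never extracts the tarball to disk.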
@lispi314 @AndresFreundTec @glyph we have been using docker images for what? 10+ years now, and "everyone" seems OK with the fact that the vast majority of them are not reproducible because the corresponding dockerfile is not generally available.
@mem @glyph @AndresFreundTec That is incidentally one of the *many* reasons I don't use Docker or similar container infrastructure.

It is... unfortunately necessary to accept that some things aren't reproducible because the tooling fails to do it (Common Lisp implementations tend not to dump reproducible images, there are patches pending for a few of them but...). That being said, the reasonable answer in that scenario is to instead ensure that all of the inputs are identical and rebuild everything from source, as Guix does (or for Common Lisp to just run everything from source, as one should do anyway, compilation & caching, if any, should be transparently handled by the implementation).

And indeed, you *can* use this procedure with Guix to generate Docker images.

So for those images where the Dockerfile necessary to rebuild it from scratch isn't available? They're garbage and unusable.
@AndresFreundTec I think we are all glad you decided to investigate this. Thank you!

@AndresFreundTec I truly admire your skill, willingness to trust your gut and appreciate your doggedness and tenacity chasing this down.

The internet is a little safer because of you.

Thanks,

@AndresFreundTec outstanding work 🙏🏻 and a great example of the value of following up on "that's weird"s
@AndresFreundTec Coincidence, but also curiosity and expertise! Very nicely done. Thank you!
@AndresFreundTec Thanks a whole bunch, excellent work and just in the nick of time to avert the worst consequences. <3
@AndresFreundTec jesus, I hope you like beer cuz we owe you a free lifetime supply.

@TTimo @AndresFreundTec

According to the test script fixed in testing as of today. Great. 👍

(As in maybe "right now", i.e. somewhere between early morning update and now - as we're approaching midnight. Phew!)

Beer, peanuts... 🥳

@AndresFreundTec Moral of the story: if you write malware, write performant malware!

@AndresFreundTec Way to go man.

Thank you for digging deeper until you found it.

@AndresFreundTec
Awesome work!

Shows that stubbornly debugging weird issues (that many would probably just ignore as "oh, it's slower now, whatever, it still works well enough..") can pay out big time! :)

(Well, metaphorically at least, though I guess this also increases your worth on the job market)

@AndresFreundTec It's not coincidence - it's attention and perseverance. The Internet thanks you.

@AndresFreundTec Oh, I didn't know you're on the fediverse! I embedded your post in my article @ https://boehs.org/node/everything-i-know-about-the-xz-backdoor

Please let me know if there's anything you'd like me to add to it or clarify!


@AndresFreundTec it's kind of sad reading the mailing list for the old maintainer of xz-utils / liblzma. Dude was single-handedly maintaining it after creating it as a hobby project, he got driven into the ground, and he passed the project off to the culprit after being (rudely) urged to give it up. This should be a wake-up call tbh; this shouldn't happen to projects that have one maintainer and that literally hold the entire tech industry and linux ecosystem on their shoulders.
@sweet @AndresFreundTec https://www.mail-archive.com/xz-devel@tukaani.org/msg00566.html this thread, right? IDK if the people here were other sockpuppets or just happened to steer in the direction that Jia Tan desired; still, read in retrospect it really feels like a team-up to force Lasse to give up control of the project.

@sweet @AndresFreundTec searching a bit, Jigar Kumar is an extremely common name, and in the mailing list seemed to post only to bash on the slow development/not merging of features https://www.mail-archive.com/search?l=xz-devel@tukaani.org&q=from:%22Jigar+Kumar%22 so it seems like an obvious sockpuppet; Dennis Ens has some more non-aggressive activity https://www.mail-archive.com/search?l=xz-devel@tukaani.org&q=from:%22Dennis+Ens%22 , but ultimately in that thread it felt really like a bad cop/good cop dynamic, with Lasse badly pressured in the middle.

@AndresFreundTec And a lot of persistence! Reminds me of one of the classics of the industry, Cliff Stoll's Cuckoo's Egg - "Stoll traced the error to an unauthorized user who had apparently used nine seconds of computer time and not paid for it" leading to a German hacker selling content to the KGB - 38 years ago. It is impressive (but uncommon) to see someone paying that level of attention to anomalies these days, with how thick tech stacks have gotten...
@eichin @AndresFreundTec the cuckoo’s egg is an excellent deep cut
@glyph a few years ago, I tracked down a copy of a for-kids version of the story ("Internet Spy") written from the point of view of a fictional courier working for Karl Koch, which I happened upon once, long before I knew there was a real book it was rhyming with.
@glyph And, amusingly, my copy of "The Cuckoo's Egg" is signed by Cliff Stoll by an act of complete serendipity. I bought it from my alma mater's library during a book sale and didn't notice the dedication until years after the fact 

TIL that there was a movie adaptation of this novella in 2007 called MARCH THE SECOND

https://www.imdb.com/title/tt1379226/

Can't find much info about it and where I might acquire a copy, though.

@AndresFreundTec Absolutely stellar job tracking this down! Many thanks!
@AndresFreundTec well done mate, thanks for flipping the switches what needed flipped
@AndresFreundTec JiaT75: "And I Would Have Gotten Away With It Too, If It Weren't For You Meddling Andres Freund"
@AndresFreundTec you sir, are a legend among men.
@AndresFreundTec we were very, very lucky to have you on the case!
@AndresFreundTec curious how many appliances this is also hidden in; I guess we are going to see a lot of updates in the next weeks.
@AndresFreundTec This is incredible work, thank you so much.
@CyrilBrulebois @AndresFreundTec I concur — we are all in your debt for spotting it so early.
@baloo @Aissen @CyrilBrulebois @AndresFreundTec 1000% agree. This was a really gnarly backdoor to track down and could have lived on for much longer with a much broader impact otherwise. It sucks that it got into Debian Testing and Fedora 40 Beta, but very very fortunate that it didn't get into GA stable releases as it very likely might have otherwise.
@AndresFreundTec
Excellent analysis and lucky it got spotted so early. The implications are rather terrifying though, it looks like a long game to subvert and compromise an upstream project. We can't assume this is the first or only attempt so far.