Mastodawn

gregorum Apr 1, 2024

Thank you open source for the transparency.

∟⊔⊤∦∣≶ Apr 2, 2024

I have heard multiple times from different sources that building from git source instead of using tarballs invalidates this exploit, but I do not understand how. Is anyone able to explain that?

If malicious code is in the source, and therefore in the tarball, what’s the difference?

Because m4/build-to-host.m4, the entry point, is not in the git repo, but was included by the malicious maintainer into the tarballs.

∟⊔⊤∦∣≶ Apr 2, 2024

Tarballs are not built from source?

The tarballs are the official distributions of the source code. The maintainer made the source code ignore the malicious entry point while retaining it inside these distributions.

All of this would be avoided if Debian downloaded from GitHub's distributions of the source code.
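One way to see the difference for yourself is to list what ships in a release tarball but is not tracked in git. The sketch below is only an illustration: the file names, the `pkg-1.0/` prefix, and both helper functions are made up, and the tracked set stands in for the output of `git ls-files`. Autotools tarballs legitimately contain generated files that are not in the repo, so a difference is something to review, not proof of tampering.

```python
import io
import tarfile


def files_only_in_tarball(tar, tracked_paths):
    """Return tarball members that are not in the set of git-tracked paths.

    Release tarballs usually wrap everything in one top-level directory
    (for example "pkg-1.0/"), so that first path component is stripped.
    """
    extra = set()
    for member in tar.getmembers():
        if not member.isfile():
            continue
        # Drop the leading "pkg-1.0/" style directory, if present.
        _, _, relative = member.name.partition("/")
        path = relative or member.name
        if path not in tracked_paths:
            extra.add(path)
    return sorted(extra)


def build_demo_tarball(files):
    """Build an in-memory tarball from a {path: bytes} mapping (demo only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return tarfile.open(fileobj=buffer, mode="r:gz")


# Hypothetical data: what `git ls-files` might print, and a release tarball.
tracked = {"src/main.c", "configure.ac"}
demo = build_demo_tarball({
    "pkg-1.0/src/main.c": b"int main(void) { return 0; }\n",
    "pkg-1.0/configure.ac": b"AC_INIT\n",
    "pkg-1.0/m4/build-to-host.m4": b"# only in the tarball\n",
})
print(files_only_in_tarball(demo, tracked))
```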

Corngood Apr 2, 2024

All of this would be avoided if Debian downloaded from GitHub’s distributions of the source code, albeit unsigned.

In that case they would have just put it in the repo, and I’m not convinced anyone would have caught it. They may have obfuscated it slightly more.

It’s totally reasonable to trust a tarball signed by the maintainer, but there probably needs to be more scrutiny when a package changes hands like this one did.

barsoap Apr 2, 2024

Downloading from github is how NixOS avoided getting hit. On unstable, that is, on stable a tarball gets downloaded (EDIT: fixed links).

Another reason it didn’t get hit is that the exploit is debian/redhat-specific, checking for files and env variables that just aren’t present when nix builds it. That doesn’t mean that nix couldn’t be targeted, though. Also it’s a bit iffy that replacing the package on unstable took on the order of 10 days, which is 99.99% build time because it’s a full rebuild. Much better on stable but it’s not like unstable doesn’t get regular use by people, especially as you can mix+match when running NixOS.

It’s probably a good idea to make a habit of pulling directly from github (generally, VCS). Nix checks hashes all the time so upstream doing a sneak change would break the build, it’s more about the version you’re using being the one that has its version history published. Also: Why not?
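For anyone unfamiliar with the hash pinning mentioned above, here is a minimal sketch of the idea. It is not how Nix implements it; the function name and the sample bytes are made up for illustration. Note what a pin does and does not give you: it detects a change made after the hash was recorded, but it says nothing about whether the pinned content was clean in the first place.

```python
import hashlib


def verify_pinned_sha256(data, expected_hex):
    """Raise ValueError if the SHA-256 of `data` differs from the pinned value."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_hex:
        raise ValueError(f"hash mismatch: expected {expected_hex}, got {actual}")
    return actual


# Hypothetical source archive contents, pinned when the package was reviewed.
original = b"source archive as first reviewed"
pinned = hashlib.sha256(original).hexdigest()

verify_pinned_sha256(original, pinned)  # passes silently

try:
    verify_pinned_sha256(b"source archive after a sneaky change", pinned)
except ValueError:
    print("build would fail: upstream changed the archive")
```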

Overall, who knows what else is hidden in that code, though. I’ve heard that Debian wants to roll back a whole two years and that’s probably a good idea and in general we should be much more careful about the TCB. Actually have a proper TCB in the first place, which means making it small and simple. Compilers are always going to be an issue as small is not an option there but the likes of http clients, decompressors and the like? Why can they make coffee?

nixpkgs/pkgs/tools/compression/xz/default.nix at nixos-unstable · NixOS/nixpkgs

Nix Packages collection & NixOS. Contribute to NixOS/nixpkgs development by creating an account on GitHub.

GitHub

chameleon Apr 2, 2024

You're looking at the wrong line. NixOS pulled the compromised source tarball just like nearly every other distro, and the build ends up running the backdoor injection script.

It's just that much like Arch, Gentoo and a lot of other distros, it doesn't meet the gigantic list of preconditions for it to inject the sshd compromising backdoor. But if it went undetected for longer, it would have met the conditions for the "stage3"/"extension mechanism".

nixpkgs/pkgs/tools/compression/xz/default.nix at d8fe5e6c92d0d190646fb9f1056741a229980089 · NixOS/nixpkgs

barsoap Apr 2, 2024

You’re looking at the wrong line.

Never mind the lines I linked to; I just copied the links from search.nixos.org, and those always link to the description field’s line for some reason. I did link to unstable twice, though; this is the correct one. As you can see, it goes to tukaani.org, not github.com. Correct me if I’m wrong, but while you can attach additional stuff (such as pre-built binaries) to GitHub releases, the source tarballs will be generated from the repository and a tag, so they will match the repository. Maybe you can do some shenanigans with rebase, which should be fixed.

nixpkgs/pkgs/tools/compression/xz/default.nix at nixos-23.11 · NixOS/nixpkgs

chameleon Apr 2, 2024

For any given tag, GitHub will always have an autogenerated "archive/" link, but the "release/" link is a set of maintainer-uploaded blobs. In this situation, those are the compromised ones. Any distro pulling from an "archive/" link would be unaffected, but I don't know of any doing that.

The problem with the "archive/" links is that GitHub reserves the right to change them. They're promising to give notice, but it's just not a good situation. The "release/" links are only going to change if the maintainer tries something funny, so the distro's usual mechanisms to check the hashes normally suffice.
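To make the two kinds of link concrete, here is a rough sketch that sorts a download URL into "autogenerated from the tag" or "uploaded by the maintainer". The path patterns (`/OWNER/REPO/archive/...` and `/OWNER/REPO/releases/download/...`) are my assumption about GitHub's URL layout, and the owner, repo, and file names in the examples are placeholders.

```python
from urllib.parse import urlparse


def classify_github_source_url(url):
    """Classify a GitHub download URL by who produced the file.

    Returns "autogenerated-archive", "maintainer-upload", or "unknown".
    """
    parsed = urlparse(url)
    if parsed.netloc != "github.com":
        return "unknown"
    # Assumed path layout: /OWNER/REPO/<kind>/...
    parts = parsed.path.strip("/").split("/")
    if len(parts) < 3:
        return "unknown"
    if parts[2] == "archive":
        return "autogenerated-archive"
    if parts[2] == "releases" and len(parts) > 3 and parts[3] == "download":
        return "maintainer-upload"
    return "unknown"


# Placeholder URLs, not real projects.
examples = [
    "https://github.com/example-owner/example-repo/archive/refs/tags/v1.0.tar.gz",
    "https://github.com/example-owner/example-repo/releases/download/v1.0/example-1.0.tar.gz",
    "https://example.org/example-1.0.tar.gz",
]
for url in examples:
    print(classify_github_source_url(url), url)
```

A packaging linter could use a check like this to flag maintainer-uploaded sources for extra review, though as noted above the autogenerated archives come with their own stability caveats.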

NixOS 23.11 is indeed not affected.

Update on the future stability of source code archives and hashes

A look at what happened on January 30, what measures we’re putting in place to prevent surprises, and how we’ll handle future changes.

The GitHub Blog

barsoap Apr 2, 2024

They’re promising to give notice, but it’s just not a good situation.

cache.nixos.org keeps all sources, so once hydra has ingested something it’s not going away unless nixos maintainers want it to. The policy for decades was simply “keep all derivations”, but in the interest of space savings it has recently been decided to do a gc run, meaning that 22-year-old derivations will still be available, but you’re going to have to build them from the cached source; the pre-built artifacts will be gone.

Upcoming Garbage Collection for cache.nixos.org

Dear NixOS Community, we write to share important news regarding an upcoming garbage collection process scheduled for end of February on cache.nixos.org. This initiative is driven by our commitment to optimizing the repository and improving overall performance, while also addressing the substantial costs associated with the storage of our build artifacts. As part of our ongoing efforts, we aim to reduce these costs by implementing strategic measures such as garbage collection. In this case, a...

NixOS Discourse

harsh3466 Apr 2, 2024

I don’t understand the actual mechanics of it, but my understanding is that it’s essentially like what happened with Volkswagen and their diesel emissions testing scheme, where it had a way to know it was being emissions tested and so it adapted to that.

The malicious actor had a mechanism that exempted the malicious code when built from source, presumably because it would be more likely to be noticed when building/examining the source.

arthur Apr 2, 2024

The malicious code is not in the source itself; it’s in tests and other files. The build process hijacks the code and inserts the malicious content, while the code itself is clean, so the co-maintainer was able to keep it hidden in plain sight.

sincle354 Apr 2, 2024

So it's not that the Volkswagen cheated on the emissions test. It's that running the emissions test (as part of the building process) MODIFIED the car ITSELF to guzzle gas after the fact. We're talking Transformers level of self modification. Manchurian Candidate sleeper agent levels of subterfuge.

acockworkorange Apr 2, 2024

50 First Dates levels of creativity.

Corngood Apr 2, 2024

it had a way to know it was being emissions tested and so it adapted to that.

Not sure why you got downvoted. This is a good analogy. It does a lot of checks to try to disable itself in testing environments. For example, setting TERM will turn it off.

WolfLink Apr 2, 2024

The malicious code wasn’t in the source code people typically read (the GitHub repo) but was in the code people typically build for official releases (the tarball). It was also hidden in files that are supposed to be used for testing, which get run as part of the official building process.

I think it is the other way around. If you build from the tarball, then you’re getting pwned.

Subverb Apr 2, 2024

The malicious code was written and debugged at their convenience and saved as an object module linker file that had been stripped of debugger symbols (this is one of its features that made Freund suspicious enough to keep digging when he profiled his backdoored ssh looking for that 500ms delay: there were no symbols to attribute the cpu cycles to).

It was then further obfuscated by being chopped up and placed into a pure binary file that was ostensibly included in the tarballs for the xz library build process to use as a test case file during its build process. The file was supposedly an example of a bad compressed file.

This “test” file was placed in the .gitignore seen in the repo, so the file’s absence there was explained. Being included as a binary test file means that the malicious code isn’t in the code on github. It’s nowhere to be viewed.

The build process then creates some highly obfuscated bash scripts on the fly during compilation which were executed to reassemble the object module, basically replacing the code that you would see in the repo.

That’s a simplified version of why there’s no code to see, and that’s just one aspect of this thing. It’s sneaky.

etchinghillside Apr 2, 2024

Any additional information been found on the user?

Probably Chinese?

Potatos_are_not_friends Apr 2, 2024

Can’t confirm but unlikely.

Wip

jaybone Apr 2, 2024

So this doesn’t really tell us one way or the other who this person is or isn’t.

fluxion Apr 2, 2024

That actually suggests not Chinese due to naming inconsistencies

ForgotAboutDre Apr 2, 2024

Could be Chinese creating reasonable doubt. Making this sort of mistake makes explanations that this wasn’t Chinese sound plausible. Even if evidence other than the name comes out, this rebuttal can be repeated and create confusion amongst the public, reasonable suspicions against accusers and a plausible excuse for other states to not blame China (even if they believe it was China).

Confusion and multiple narratives is a technique carried out often by Soviet, Russian and Chinese government. We are unlikely to be able to answer the question ourselves. It will be up to the intelligence agencies to do that.

If someone wanted to blame China for this, they would take the name of a real Chinese person to do it. There are over a billion real people they could take a name from. It’s unlikely that a person creating a name for this type of espionage would make a mistake like accidentally picking an implausible name.

fluxion Apr 2, 2024

I’m not suggesting one way or another, only that the quoted explanation taken at face value isn’t suggesting China based on name analysis.

There’s also no reason to assume a nation state. This is completely within the realm of a single or small group of hackers. Organized crime another possibility. Errors with naming are plausible just as the initial mistakes with timing analysis and valgrind errors.

Even assuming a nation state, you name Russia as a possibility. Russia has shown themselves to be completely capable of errors, in their hacks (2016 election interference that was traced back to their intelligence base), their wars, their assassination attempts, etc.

And to me it doesn’t seem any more likely that China would point to themselves but sprinkle doubt with inconsistent naming versus just outright pointing to someone else.

It’s all guesses, nothing points one way or another. I think we agree on that.

ForgotAboutDre Apr 2, 2024

A big part of it is also letting other people know you did it. China and Russia are big on this. They create dangerous situations, then say they aren’t responsible, all while sowing confusion. They want plausible deniability, confusion and credit for doing it.

They’re more likely to be based in Eastern Europe based on the times of their commits (during working hours in Eastern European Time) and the fact that while most commits used a UTC+8 time zone, some of them used UTC+2 and UTC+3: …substack.com/…/xz-backdoor-times-damned-times-an…

XZ Backdoor: Times, damned times, and scams

Some timezone observations on the recently discovered backdoor hidden in an xz tarball.

Rhea's Substack
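For a sense of what that kind of timezone tally looks like, here is a minimal sketch that counts commits per UTC offset. The timestamps are made up, not taken from the real repository, and `count_utc_offsets` is a hypothetical helper. Keep in mind that git records whatever offset the committer's machine reports, so this is circumstantial at best.

```python
from collections import Counter
from datetime import datetime


def count_utc_offsets(timestamps):
    """Count how many ISO 8601 timestamps carry each UTC offset, in hours."""
    counts = Counter()
    for stamp in timestamps:
        offset = datetime.fromisoformat(stamp).utcoffset()
        if offset is None:
            raise ValueError(f"timestamp has no UTC offset: {stamp}")
        counts[offset.total_seconds() / 3600] += 1
    return counts


# Made-up commit timestamps, not taken from the real repository.
sample = [
    "2024-01-10T21:15:00+08:00",
    "2024-01-11T22:40:00+08:00",
    "2024-01-12T17:05:00+02:00",
    "2024-01-13T16:30:00+03:00",
]
for hours, count in sorted(count_utc_offsets(sample).items()):
    print(f"UTC{hours:+g}: {count} commit(s)")
```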

The Doctor Apr 2, 2024

Just because somebody picked a vaguely Chinese-sounding handle doesn’t mean much about who or where.

That’s why I put the question mark

It is also hard to be certain, as they could be a night owl or an early riser.

Yeah - The post goes into a lot of detail, and they did take that into account. It’s worth reading.

underisk Apr 2, 2024

as long as you’re up to date on everything here: boehs.org/…/everything-i-know-about-the-xz-backdo…

the only additional thing i’ve seen noted is a possibility that they were using Arch based on investigation of the tarball that they provided to package maintainers

Everything I know about the XZ backdoor

Please note: This is being updated in real-time. The intent is to make sense of lots of simultaneous discoveries

Don't forget all of this was discovered because ssh was running 0.5 seconds slower

Steamymoomilk Apr 2, 2024

It’s toooo much bloat. There must be malware XD linux users at their peak!

rho50 Apr 2, 2024

Tbf 500ms latency on - IIRC - a loopback network connection in a test environment is a lot. It’s not hugely surprising that a curious engineer dug into that.

ryannathans Apr 2, 2024

Especially that it only took 300ms before and 800ms after
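A jump like that is easy to catch if you are already timing things. The sketch below shows the general idea only: the helper names and the 25% tolerance are arbitrary choices of mine, and the only numbers taken from this thread are the 300 ms and 800 ms figures.

```python
import statistics
import time


def median_latency_ms(action, runs=5):
    """Run `action` several times and return the median wall-clock time in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def is_regression(baseline_ms, observed_ms, tolerance=0.25):
    """Flag observations more than `tolerance` (as a fraction) above baseline."""
    return observed_ms > baseline_ms * (1.0 + tolerance)


# Figures quoted in the thread: roughly 300 ms before, 800 ms after.
print(is_regression(300.0, 800.0))  # True
print(is_regression(300.0, 310.0))  # False
```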

Jolteon Apr 2, 2024

Half a second is a really, really long time.

lurch (he/him) Apr 2, 2024

reminds of Data after the Borg Queen incident

Olgratin_Magmatoe Apr 2, 2024

Which ep/movie are you referring to?

gravitas_deficiency Apr 2, 2024

The one where they go back in time but the whales were already nuked

∟⊔⊤∦∣≶ Apr 5, 2024

I… actually can’t tell if you’re taking the piss or if that’s a real episode.

I have so many questions about the whales.

lurch (he/him) Apr 2, 2024

Star Trek: First Contact

Olgratin_Magmatoe Apr 2, 2024

If this exploit was more performant, I wonder how much longer it would have taken to get noticed.

Postgres sort of saved the day

Hupf Apr 2, 2024

RIP Simon Riggs

acockworkorange Apr 2, 2024

postgresql.org/…/remembering-simon-riggs-2830/

Remembering Simon Riggs

The PostgreSQL Core Team is deeply saddened by the loss of our long-time friend and colleague Simon Riggs on the …

PostgreSQL News

oce 🐆 Apr 2, 2024

Is that from the Microsoft engineer or did he start from this observation?

whereisk Apr 2, 2024

From what I read it was this observation that led him to investigate the cause. But this is the first time I read that he’s employed by Microsoft.

The D Quuuuuill Apr 2, 2024

I’ve seen that claim a couple of places and would like a source. It very well may be since Microsoft prefers Debian based systems for WSL and for azure, but its not something I would have assumed by default

itsnotits Apr 2, 2024

but it’s* not something

His LinkedIn, his Twitter, his Mastodon, and the Verge, for starters.

AFAIK he works on the Azure PostgreSQL product.

imsodin Apr 2, 2024

Technically that wasn’t the initial entrypoint, paraphrasing from mastodon.social/…/112180406142695845 :

It started with ssh using unreasonably much cpu which interfered with benchmarks. Then profiling showed that cpu time being spent in lzma, without being attributable to anything. And he remembered earlier valgrind issues. These valgrind issues only came up because he set some build flag he doesn’t even remember anymore why it is set. On top he ran all of this on debian unstable to catch (unrelated) issues early. Any of these factors missing, he wouldn’t have caught it. All of this is so nuts.

refreeze Apr 2, 2024

I have been reading about this since the news broke and still can’t fully wrap my head around how it works. What an impressive level of sophistication.

rockSlayer Apr 2, 2024

And due to open source, it was still caught within a month. Nothing could ever convince me more than that how secure FOSS can be.

Lung Apr 2, 2024

Idk if that’s the right takeaway, more like ‘oh shit there’s probably many of these long con contributors out there, and we just happened to catch this one because it was a little sloppy due to the 0.5s thing’

This shit got merged. Binary blobs and hex digit replacements. Into low level code that many things use. Just imagine how often there’s no oversight at all

rockSlayer Apr 2, 2024

Yes, and from the moment this broke, other project maintainers have been working on finding similar exploits. They read the same news we do and have those same concerns.