So, kids, what's the moral of the XZ story?

If you're going to backdoor something, make sure that your changes don't impact its performance. Nobody cares about security - but if your backdoor makes the thing half a second slower, some nerd is going to dig it up.

@bontchev

I'm face-palming that we didn't dig into the weird valgrind errors more. In hindsight, everything around 5.6.1 should have set off alarm bells.

@mattdm @bontchev You're right, but you're talking with hindsight.
The question is not how you would react with your knowledge right now, the question is did things look weird with the knowledge you had right then and there.

Were the valgrind errors weird? Sure they were.

But there was a responsive upstream, the code was in a "weird" place such as an ifunc resolver, compiler optimizations were seemingly involved, etc.
The maintainer was responsive, on top of things, etc.

The insidious thing is, this is exactly how you need the open source community to react for the whole thing to work.
If we were super suspicious about every contribution, needed four people to sign off on every commit, etc., would we really have the landscape of amazing open source stuff we have?

I think to assume good intentions is important and should not be given up, even with the experience of being betrayed by a bad actor. Imagine how bad the original maintainer of xz must feel in that situation.

Instead of beating us up over not seeing things that were not obvious, we should approach this better:

Were there clear signs that we missed?

Were there things that gave us a hunch of what was going on? Why did we ignore them?

And on the technical side, I guess we could invest some effort in validating that the tarballs we build from actually correspond to the source code in the upstream repository.
Fedora already does a lot of things right, e.g. we're not using the shipped configure but we're running aclocal and all the other stuff ourselves.
Improving on that is probably going to get us further than starting to question contributors or contributions. We might not just catch a few backdoor attempts, we might also catch a few accidental maintainer fuckups. And that would benefit the distro in every case.
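A minimal sketch of that tarball-vs-repo check (names and layout are assumptions, not Fedora's actual tooling): unpack the release tarball, regenerate the same tree from the git tag with `git archive`, and diff the two.

```shell
# Sketch: verify a release tarball matches the git tag it claims to package.
# release_matches_tag TARBALL REPO TAG PREFIX
#   PREFIX is the top-level directory name inside the tarball.
release_matches_tag() {
  tarball=$1 repo=$2 tag=$3 prefix=$4
  work=$(mktemp -d)
  mkdir "$work/tarball" "$work/git"
  tar -xf "$tarball" -C "$work/tarball"
  # Regenerate the tagged tree; any file present only in the tarball
  # (like the tampered build-to-host.m4 in the xz releases) shows up in the diff.
  git -C "$repo" archive --format=tar --prefix="$prefix/" "$tag" \
    | tar -xf - -C "$work/git"
  diff -r "$work/tarball" "$work/git"
}
```

Caveat: real autotools release tarballs legitimately contain generated files (configure, Makefile.in, ...) that aren't in git, so in practice you'd scrub those first or diff against an allowlist.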

@ixs @mattdm @bontchev

Seems upstream maintainers should, even more so, follow the guidance I give to my teams...

If you don’t understand a change, even in a dependency, don’t merge it.

Trouble is, most just assume their lib dependencies are well tested by upstream before release, and this should theoretically be sufficient.

Heuristic testing by measuring the call stack depth and process tree depth might help.
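For what it's worth, a minimal Linux-only sketch of the process-tree-depth half (whether depth is a useful signal, and any threshold, are open assumptions):

```shell
# Sketch: measure how deep a process sits in the process tree by walking
# parent PIDs through /proc (Linux-specific).
tree_depth() {
  pid=$1
  depth=0
  while [ "$pid" -gt 1 ]; do
    stat=$(cat "/proc/$pid/stat") || break
    stat=${stat##*) }   # drop "pid (comm) "; comm may itself contain spaces
    set -- $stat
    pid=$2              # fields after comm: state, then parent PID
    depth=$((depth + 1))
  done
  echo "$depth"
}
```

Anything like this would need a recorded per-host baseline to compare against, or it drowns in false positives.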

@systemalias @ixs

the "upstream maintainer" thing wouldn't have helped here, though....

@mattdm @ixs

The distro layer of testing needs such heuristics in test.

@systemalias @ixs

What would they look like? When would you run them? How would you prevent so many false positives that they're ignored?

@mattdm @systemalias @ixs
Those are good questions. The year is 2024. Much of the world relies on these systems. It would be irresponsible of an industry to only be asking these questions now. Feels similar to when news pundits feign shock to cover for their complicity. If the questions come from genuine ignorance, that's also an indictment of our industry, because I know Cassandras have been warning about the software supply chain for years. The side of the status quo needs to explain themselves to the tech critics, not the other way around.

@toolbear @mattdm @systemalias What and who is forming that industry you are talking about?

If you are talking about the majority of open source developers, they are in fact not an industry.
The exact opposite.

Could you please elaborate who that industry is and who exactly is failing how?
Cause the way I see it (and to reuse a phrase) “the industry” is currently dumpster diving for code and then complains that their supply chain is not doing their job for them.

@ixs @bontchev @mattdm Or really, we could just deprecate the use of tarballs in the majority of circumstances, especially for projects that *do* have repos instead of exclusively releasing tarballs.

Also re-running autoreconf locally after deleting the bundled files (if any), since no one reasonably audits the output of autotools.
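A sketch of that scrubbing step, assuming an autotools project (the file list is illustrative and varies per project):

```shell
# Sketch: remove maintainer-shipped autotools output from an unpacked
# tarball so the build regenerates it from the actual checked-in sources.
scrub_generated() {
  dir=$1
  rm -f "$dir/configure" "$dir/Makefile.in" "$dir/aclocal.m4"
  rm -rf "$dir/autom4te.cache"
  # Then regenerate from configure.ac / Makefile.am:
  # (cd "$dir" && autoreconf -fiv)   # requires autoconf/automake installed
}
```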

@bontchev True. Non-functional requirements are so important, though often undervalued. I'm sure the person responsible will have to talk about it in sprint retrospective!

@bontchev @Di4na I was thinking more of “be a bit more patient and have the payload dormant until 2025”

@bontchev "Bingo!" - me, a perf guy

@bontchev "I watched my ssh daemon taking 4 KiB more memory than usual, I knew something was wrong".
I remember back in the days of Linux 2.2 when swapping out inactive pages during memory pressure meant that it was possible for a process to show up in top with a resident size of 0KB. And top mistakenly showed any such process as a kernel thread.

@bontchev

The moral of the story is probably that open source projects need to defend against government-sponsored hacks. Big corporate software is already hacked and filled with backdoors.

Open source is becoming more popular by the day. They need to discredit it or introduce vulnerabilities somehow.

Signed,

Spooky Mulder

@bontchev Also remember to check all your PRs for stray dots in cmake files I guess.

@bontchev Moral of the story: use OpenBSD if at all possible
Now that I have read more about the story I realize that your description is surprisingly accurate.

@bontchev Can I quote you on that?

@bontchev @KrebsOnSecurity_rss

Also, make all your network traffic async and make sure you fail quietly if a resource is offline.

@bontchev Another lesson: Don't accept PRs that include binary blobs.
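One hedged way to automate that rule (names are illustrative; this uses GNU grep's heuristic binary detection, and `git diff --numstat`, which prints "-" for binary files, is another route):

```shell
# Sketch: flag binary blobs among a PR's changed files.
is_binary() {
  # -I treats binary files as non-matching; the empty pattern matches any
  # line of a text file. Caveat: empty files get reported as binary.
  ! grep -qI '' "$1"
}
check_paths() {
  for f in "$@"; do
    if is_binary "$f"; then
      echo "binary blob: $f"
    fi
  done
}
```

In a CI hook you'd feed this the output of `git diff --name-only`. Worth noting that the xz payload hid inside files presented as compression test fixtures, so even "expected" binary test files deserve scrutiny.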

@bontchev all DRM makes everything slow. Conclusion: DRM is malware and has to be removed.

@geert @bontchev What if your backdoor improves performance? Well... that happened around the Pentium Pro era, and we still don't know what to do with it. It's called "caches" and "speculation", and it leads to very hard-to-fix information leaks, aka Spectre and Meltdown (and others yet to be discovered) :-(. Too bad it probably was not an intentional backdoor.

@bontchev So the ideal backdoor would be a patch that actually IMPROVES performance after the update. So the best time to sneak a couple of "sleep" calls into components is NOW.