Reflections on the xz backdoor

The actual bug I planted in the compiler would match code in the UNIX “login” command. The replacement code would miscompile the login command so that it would accept either the intended encrypted password or a particular known password. Thus if this code were installed in binary and the binary were used to compile the login command, I could log into that system as any user.
- Ken Thompson, Reflections on Trusting Trust (1984)

The xz backdoor is the hottest CVE of the decade. It involved so many social engineering and technical tricks that explaining them all is frankly above my level. The attacker was clearly skilled, as their work shows: creeping updates that gradually modified the test system and build scripts, a payload concealed within binary data for seemingly innocuous test cases, and an activation system robust enough to future-proof the backdoor. But the attacker was still not careful enough to prevent a performance regression from affecting sshd, and Andres Freund was curious enough to chase down 500 milliseconds of latency.

We were lucky that Freund dove down this rabbit hole and so quickly emerged with the stunning discovery of a major vulnerability. But the discovery of the xz backdoor was by no means assured. Open source allows everybody to view code, but it doesn’t guarantee that anybody actually does. In this specific instance, the xz backdoor wasn’t even present in the code itself. It took a very clever inference to connect an sshd slowdown with some seemingly unrelated Valgrind complaints and realize that something sinister was going on in xz. Being technically shrewd enough to understand this, caring enough to investigate, and being willing to share the findings with others is a rare combination, limited to a tiny portion of the open source community. Yes, open source triumphed this time, but what about the next time?

Open source development is currently a high-trust environment, and I fear that suspicion about this case is going to color all sorts of future interactions.
- Benny Siegert, The XZ Backdoor (2024)

Siegert mentions that NetBSD requires a new contributor to have their PGP key signed by an existing NetBSD developer at an in-person meetup. The challenge, however, is that supply chain attacks of this kind are highly patient and sophisticated. In this particular attack, Jia Tan and the other sock-puppet accounts lacked any online history or record. But there is widespread suspicion that these accounts were pseudonyms used by a group of malicious actors backed by a nation-state. If that is true, a well-prepared attacker in the future will likely find ways to adopt, fabricate, or compromise a credible identity. Requiring developers to use real identities would deter some attackers, but not the most determined.

Moreover, the average open source project probably has only a single contributor. Only a few projects, such as the Linux kernel, enjoy the security that comes from having a large number of competent hackers scrutinize code before it gets merged. If the original author sabotages their own code, we are left with few solutions.

There is no straightforward solution to these issues because they stem from a lack of trust rather than a technical flaw. Microsoft or Google could someday develop an AI product that scans open source GitHub repositories for malicious code, but such tools will ultimately offer mere warning signs, not permanent remedies.

Whether in software development, banking, or the law, trust is an unsolved problem. The xz backdoor is a reminder of that.