How a Problem About Pigeons Powers Complexity Theory

Introduction
They say a bird in the hand is worth two in the bush, but for computer scientists, two birds in a hole are better still. That’s because those cohabiting birds are the protagonists of a deceptively simple mathematical theorem called the pigeonhole principle. It’s easy to sum up in one short sentence: If six pigeons nestle into five pigeonholes, at least two of them must share a hole. That’s it — that’s the whole thing.
“The pigeonhole principle is a theorem that elicits a smile,” said Christos Papadimitriou, a theoretical computer scientist at Columbia University. “It’s a fantastic conversation piece.”
But the pigeonhole principle isn’t just for the birds. Even though it sounds painfully straightforward, it’s become a powerful tool for researchers engaged in the central project of theoretical computer science: mapping the hidden connections between different problems.
The pigeonhole principle applies to any situation where items are assigned to categories and the items outnumber the categories. For example, it implies that in a packed football stadium with 30,000 seats, at least two attendees must have the same four-digit personal identification number, or PIN, for their bank cards. Here the pigeons are football fans, and the holes are the 10,000 distinct possible PINs, 0000 through 9999. There are fewer possible PINs than people, so some people must share the same digits.
This proof is notable not just for its simplicity, but also for what it leaves out. Many mathematical methods for proving that something exists are “constructive,” meaning they also show you how to find it. “Nonconstructive” proofs, like ones based on the pigeonhole principle, don’t have this property. In the football stadium example, knowing that some people must have the same PIN won’t tell you which people they are. You could always go through the stands, asking each person in turn until you hit a repeat. But is there a more efficient way?
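The scan through the stands can be sketched in a few lines of Python. This is a minimal illustration with randomly generated stand-in PINs (not real data): because 30,000 people hold PINs drawn from only 10,000 possibilities, the pigeonhole principle guarantees the single pass below always finds a matching pair.

```python
import random

def find_shared_pin(pins):
    """Return the indices of two people who share a PIN.

    When the people outnumber the possible PINs, the pigeonhole
    principle guarantees such a pair exists, so the loop always
    returns before falling through.
    """
    seen = {}  # PIN -> index of the first person seen with it
    for i, pin in enumerate(pins):
        if pin in seen:
            return seen[pin], i
        seen[pin] = i
    return None  # unreachable when len(pins) exceeds the PIN space

# 30,000 stand-in stadium attendees with random four-digit PINs
fans = [random.randrange(10_000) for _ in range(30_000)]
a, b = find_shared_pin(fans)
assert a != b and fans[a] == fans[b]
```

Note that the search is constructive even though the existence proof is not: the proof promises a pair, but only the scan tells you who they are.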
Questions like this one, about the most efficient way to solve problems, are at the heart of the branch of computer science known as computational complexity theory. Complexity theorists study such questions by lumping problems into classes based on certain shared properties. Sometimes the first step toward a breakthrough is simply defining a new class to unite problems that researchers hadn’t previously studied together.
That’s what happened in the 1990s, when Papadimitriou and other complexity theorists began to study new classes of problems, in which the goal is to find something that must exist because of the pigeonhole principle or another nonconstructive proof. That line of work has led to important progress in disparate fields of computer science, from cryptography to algorithmic game theory.

Christos Papadimitriou (inset) and Oliver Korten showed that the empty-pigeonhole principle connects to important problems in math and computer science.
By January 2020, Papadimitriou had been thinking about the pigeonhole principle for 30 years. So he was surprised when a playful conversation with a frequent collaborator led them to a simple twist on the principle that they’d never considered: What if there are fewer pigeons than holes? In that case, any arrangement of pigeons must leave some empty holes. Again, it seems obvious. But does inverting the pigeonhole principle have any interesting mathematical consequences?
It may sound as though this “empty-pigeonhole” principle is just the original one by another name. But it’s not, and its subtly different character has made it a new and fruitful tool for classifying computational problems.
To understand the empty-pigeonhole principle, let’s go back to the bank-card example, transposed from a football stadium to a concert hall with 3,000 seats — a smaller number than the total possible four-digit PINs. The empty-pigeonhole principle dictates that some possible PINs aren’t represented at all. If you want to find one of these missing PINs, though, there doesn’t seem to be any better way than simply asking each person their PIN. So far, the empty-pigeonhole principle is just like its more famous counterpart.
The difference lies in the difficulty of checking solutions. Imagine that someone says they’ve found two specific people in the football stadium who have the same PIN. In this case, corresponding to the original pigeonhole scenario, there’s a simple way to verify that claim: Just check with the two people in question. But in the concert hall case, imagine that someone asserts that no person has a PIN of 5926. Here, it’s impossible to verify without asking everyone in the audience what their PIN is. That makes the empty-pigeonhole principle much more vexing for complexity theorists.
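The asymmetry between the two scenarios can be made concrete in a short Python sketch, again with random stand-in PINs. Verifying a claimed collision takes two lookups; verifying a claimed missing PIN has no obvious shortcut, since every one of the 3,000 concertgoers must be surveyed.

```python
import random

def verify_collision(pins, i, j):
    """Check a claimed shared PIN: constant work, just two lookups."""
    return i != j and pins[i] == pins[j]

def verify_missing_pin(pins, claimed_missing):
    """Check a claimed missing PIN: must scan the entire audience."""
    return all(pin != claimed_missing for pin in pins)

def find_missing_pin(pins):
    """Find a PIN that nobody in the audience holds.

    With 3,000 people and 10,000 possible PINs, the empty-pigeonhole
    principle guarantees one exists -- but finding it still means
    asking everyone their PIN first.
    """
    held = set(pins)
    return next(p for p in range(10_000) if p not in held)

# 3,000 stand-in concertgoers with random four-digit PINs
audience = [random.randrange(10_000) for _ in range(3_000)]
missing = find_missing_pin(audience)
assert verify_missing_pin(audience, missing)
```

Both verifiers are correct, but their costs differ sharply: `verify_collision` does constant work regardless of crowd size, while `verify_missing_pin` scales with the whole audience. That gap in checkability is exactly what distinguishes the empty-pigeonhole problems from their classical counterparts.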
Two months after Papadimitriou began thinking about the empty-pigeonhole principle, he brought it up in a conversation with a prospective graduate student. He remembers it vividly, because it turned out to be his last in-person conversation with anyone before the Covid-19 lockdowns. Cooped up at home over the following months, he wrestled with the problem’s implications for complexity theory. Eventually he and his colleagues published a paper about search problems that are guaranteed to have solutions because of the empty-pigeonhole principle. They were especially interested in problems where pigeonholes are abundant — that is, where they far outnumber pigeons. In keeping with a tradition of unwieldy acronyms in complexity theory, they dubbed this class of problems APEPP, for “abundant polynomial empty-pigeonhole principle.”
One of the problems in this class was inspired by a famous 70-year-old proof by the pioneering computer scientist Claude Shannon. Shannon proved that most computational problems must be inherently hard to solve, using an argument that relied on the empty-pigeonhole principle (though he didn’t call it that). Yet for decades, computer scientists have tried and failed to prove that specific problems are truly hard. Like missing bank-card PINs, hard problems must be out there, even if we can’t identify them.
Historically, researchers haven’t thought about the process of looking for hard problems as a search problem that could itself be analyzed mathematically. Papadimitriou’s approach, which grouped that process with other search problems connected to the empty-pigeonhole principle, had a self-referential flavor characteristic of much recent work in complexity theory — it offered a new way to reason about the difficulty of proving computational difficulty.
“You’re analyzing the task of complexity theory using complexity theory,” said Oliver Korten, a researcher at Columbia.
Korten was the prospective student with whom Papadimitriou had discussed the empty-pigeonhole principle right before the pandemic. He came to Columbia to work with Papadimitriou, and in his first year of grad school he proved that the search for hard computational problems was intimately linked to all other problems in APEPP. In a specific mathematical sense, any progress on this one problem will automatically translate into progress on a host of others that computer scientists and mathematicians have long studied, such as the search for networks that lack simple substructure.
Korten’s paper immediately attracted attention from other researchers. “I was quite surprised when I saw it,” said Rahul Santhanam, a complexity theorist at the University of Oxford. “It’s incredibly exciting.” He and others have since built on Korten’s breakthrough to prove a flurry of new results about connections between computational difficulty and randomness.
“There is amazing richness to this,” Papadimitriou said. “It goes to the bone of important problems in complexity.”