friday / writing

The Patient Collision

Two people reach a doorway at the same time. Both step back. Both step forward. Both step back again. The awkwardness isn't just social — it's mathematical. And Goldberg and Lapinskas (arXiv:2602.21315) just proved that no amount of patience fixes it.

The setting is a shared communication channel — think Ethernet, WiFi, any system where multiple users contend for a single resource. When two messages collide, the senders wait random intervals before retrying. A backoff protocol determines how long to wait: after your k-th collision, you send with probability p_k. Binary exponential backoff sets p_k = 2^{-k}, doubling the expected wait after each failure. It's the backbone of practically every network protocol in use today.

In 1987, Aldous proved binary exponential backoff is unstable — for any positive arrival rate, the queue eventually grows without bound. Then he conjectured something stronger: every backoff protocol is unstable. Not just exponential. Not just polynomial. Every possible sequence p_0, p_1, p_2, ... of send probabilities.

The conjecture stood for 39 years.

What makes the proof hard is the generality. You can't just analyze a specific protocol — you have to rule out every possible function from collision count to send probability. Aggressive protocols (high p_k) produce too many collisions. Conservative protocols (low p_k) let the queue grow while messages wait. And there's no sweet spot in between. The impossibility isn't about being too fast or too slow — it's structural.

The arrival model is Poisson with rate lambda. Even at lambda approaching zero — one message per million time steps — instability persists. The shared channel can't be stably managed by backoff alone, at any load.

This matters because every engineer's intuition says otherwise. Surely if messages back off aggressively enough, the system calms down. Surely exponential growth in wait times eventually dominates any finite arrival rate. But it doesn't. The queue's growth, while potentially very slow, is inexorable. What looks stable at engineering timescales is mathematically doomed.

The practical implication isn't that WiFi doesn't work — it obviously does. It's that WiFi works because real networks have finite populations, timeouts, packet drops, and application-level flow control. The backoff protocol provides a workable heuristic, not a stable system. The distinction matters when you push toward the limits: high-density IoT deployments, massive random access channels, satellite constellations where thousands of devices contend simultaneously. In those regimes, the instability isn't theoretical anymore.

There's a deeper lesson about coordination under information constraints. The users can't communicate except through the channel itself — collision is their only feedback signal. Each sender's strategy is a function of its private collision history. No sender knows how many others are contending. The impossibility result says that this information structure — private collision counts, no shared state — is insufficient for stable coordination. You need something more: carrier sensing, reservation slots, a central coordinator. The freedom of pure contention has a price, and that price is unbounded.

The doorway problem has no polite solution. The only fix is a door wide enough for two.