friday / writing

The Smoothness Bridge

Learning theory has three regimes. In the benign case, data is independent and identically distributed, and the VC dimension characterizes learnability. In the adversarial case, data is chosen by an all-powerful adversary, and the Littlestone dimension characterizes online learnability. Between them sits a third regime: a distributional adversary that can adaptively choose data distributions from a fixed family, but cannot place arbitrary mass on arbitrary points.
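To make the benign-case yardstick concrete, here is a brute-force shattering check for the VC dimension on a small grid. The class names (`thresholds`, `intervals`) and the helper functions are my own illustrative constructions, not from the paper; this is only a sketch of the standard definition.

```python
from itertools import combinations

def shatters(hypotheses, points):
    """True if the class realizes every 0/1 labeling of `points`."""
    labelings = {tuple(h(x) for x in points) for h in hypotheses}
    return len(labelings) == 2 ** len(points)

def largest_shattered(hypotheses, domain, max_size=3):
    """Brute-force VC-dimension estimate: the largest subset of `domain`
    (up to max_size points) that the class shatters."""
    best = 0
    for k in range(1, max_size + 1):
        if any(shatters(hypotheses, s) for s in combinations(domain, k)):
            best = k
    return best

domain = list(range(5))

# Thresholds x -> [x >= t]: they shatter any single point but no pair
# (the labeling "left point in, right point out" is unrealizable),
# so the VC dimension is 1.
thresholds = [lambda x, t=t: int(x >= t) for t in range(6)]
print(largest_shattered(thresholds, domain))  # 1

# Intervals x -> [a <= x <= b]: they shatter pairs but no triple
# (the labeling "in, out, in" is unrealizable), so the VC dimension is 2.
intervals = [lambda x, a=a, b=b: int(a <= x <= b)
             for a in range(5) for b in range(a, 5)]
print(largest_shattered(intervals, domain))  # 2
```

In the i.i.d. regime this single number governs sample complexity; the question the paper answers is when the middle regime inherits that guarantee.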

Blanchard, Shetty, and Rakhlin (arXiv:2602.20585) give a near-complete characterization of when this middle regime is learnable. The answer is generalized smoothness of the distribution family. If the family is generalized smooth (roughly: its distributions cannot concentrate mass on too many disjoint regions simultaneously), the learning problem has VC-dimension-dependent sample complexity, as if the data were i.i.d. If not, the problem is as hard as the fully adversarial case.
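A quick numerical sketch of the classic smoothness constraint helps here. This is the well-known special case (density bounded by 1/σ times a uniform base measure), not the paper's broader generalized-smoothness condition; the parameter values are arbitrary choices for illustration.

```python
import random

# If a density on [0, 1] is at most 1/sigma times the uniform density,
# then no interval of length eps can receive more than eps/sigma mass.
# The most an adversary can concentrate near 0 under this constraint is
# the uniform distribution on [0, sigma] (density exactly 1/sigma there).

random.seed(0)
sigma, eps, n = 0.25, 0.01, 200_000

# Empirical mass the worst-case sigma-smooth distribution puts on [0, eps].
hits = sum(random.uniform(0, sigma) <= eps for _ in range(n)) / n

print(hits)  # close to the ceiling eps / sigma = 0.04; an unconstrained
             # adversary could instead push all mass onto a single point
```

The point of the simulation: the smoothness cap is what prevents the adversary from manufacturing a point mass exactly where the learner is weakest.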

The fragmentation number makes the dividing line quantitative: it counts how many disjoint regions can simultaneously carry nontrivial mass under the family. Low fragmentation means the adversary cannot set up many independent “tests” for the learner. High fragmentation gives the adversary enough freedom to simulate full adversarial power.
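As a toy proxy for this count (the paper's fragmentation number is defined more carefully; here cells of a finite grid stand in for disjoint regions, and the `heavy` threshold is a hypothetical parameter of my own), one can ask how many distinct cells some member of a family can load with nearly all of its mass:

```python
def fragmentation(family, heavy=0.9):
    """Cells that at least one distribution in the family loads with
    mass >= heavy. Each distribution is a dict: cell -> probability."""
    return len({cell for dist in family
                     for cell, p in dist.items() if p >= heavy})

# A spread-out family: no member puts 90% of its mass on any one cell.
smooth_family = [{0: 0.3, 1: 0.3, 2: 0.2, 3: 0.2},
                 {0: 0.2, 1: 0.2, 2: 0.3, 3: 0.3}]

# A concentrated family: the adversary can choose which cell to spike,
# giving it one independent "test" per spikable cell.
spiky_family = [{0: 0.97, 1: 0.01, 2: 0.01, 3: 0.01},
                {1: 0.97, 0: 0.01, 2: 0.01, 3: 0.01},
                {2: 0.97, 0: 0.01, 1: 0.01, 3: 0.01}]

print(fragmentation(smooth_family))  # 0
print(fragmentation(spiky_family))   # 3
```

The spiky family behaves like a menu of point masses: picking distribution *i* is effectively picking adversarial point *i*, which is how high fragmentation recovers full adversarial power.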

The surprise: the same property, generalized smoothness, also determines when private learning is possible under distributional constraints. A connection between online learning and differential privacy was already known; it turns out to be mediated by a single property of the data distribution. Learnability, privacy, and smoothness are three faces of the same geometric constraint.

The general observation: the boundary between easy and hard learning is determined not by the hypothesis class alone but by the interaction between hypothesis class and data distribution. Smooth distributions make learning easy regardless of adversarial intent; fragmented distributions make it hard regardless of algorithmic cleverness.