The Inefficient Number

Numeral systems face a tradeoff: a small lexicon (few base words like “one,” “two,” “ten”) requires longer expressions for large numbers (morphosyntactic complexity), while a large lexicon allows shorter expressions but demands more memorization. Recent research has argued that natural languages optimize this tradeoff — that exact recursive numeral systems are communicatively efficient.
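To make the tradeoff concrete, here is a minimal sketch under a deliberately simplified model that is my own assumption, not the measure used in the research discussed here. A purely regular base-b system has one word for each digit from 1 to b-1 and one word for each power of b, and a number is expressed as digit-word plus power-word pairs. The function names are hypothetical.

```python
def lexicon_size(base: int, limit: int) -> int:
    """Words needed to count up to `limit`: digit words plus power words."""
    powers = 0
    p = base
    while p <= limit:
        powers += 1
        p *= base
    return (base - 1) + powers


def expression_length(n: int, base: int) -> int:
    """Morphemes needed to express n: one per nonzero digit,
    plus one power word for every nonzero digit above the units place."""
    count = 0
    position = 0
    while n > 0:
        digit = n % base
        if digit != 0:
            count += 1 if position == 0 else 2
        n //= base
        position += 1
    return count


def average_length(base: int, limit: int) -> float:
    """Mean expression length over the numbers 1..limit."""
    total = sum(expression_length(n, base) for n in range(1, limit + 1))
    return total / limit


if __name__ == "__main__":
    for base in (2, 10, 100):
        print(base, lexicon_size(base, 99), round(average_length(base, 99), 2))
```

For the numbers 1 to 99, base 2 needs only 7 words but long expressions, base 100 needs 99 words and every number is a single word, and base 10 sits in between. Real numeral systems are not this regular, which is the point of what follows.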

Beguš, Piantadosi, and colleagues (arXiv:2602.20372) show that they are not. Using data from 52 genetically diverse languages, carefully annotated for predictable versus unpredictable allomorphy, they find that many languages are decisively less efficient than the tradeoff predicts. The systems work, but they carry excess complexity (irregular forms, unpredictable formal variation, redundant structure) that no optimality argument can explain.

Previous studies underestimated complexity by treating allomorphic variation as uniform. Once you distinguish predictable variation (regular inflection) from unpredictable variation (suppletive forms, irregular stems), the measured complexity comes out higher than the uniform treatment suggested. Languages don't sit on the efficiency frontier. They sit above it, carrying historical baggage that a designed system would eliminate.
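A small sketch of why the distinction changes the count. The annotation below is hypothetical and only loosely English-like, and the counting rule (each unpredictable form costs one extra memorized item) is my simplifying assumption, not the paper's actual metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allomorph:
    morpheme: str      # the underlying numeral morpheme
    form: str          # one surface form it takes
    predictable: bool  # True if a regular rule derives this form


# Hypothetical annotation for illustration only.
ANNOTATED = [
    Allomorph("three", "three", True),
    Allomorph("three", "thir", False),  # as in "thirteen", "thirty"
    Allomorph("five", "five", True),
    Allomorph("five", "fif", False),    # as in "fifteen", "fifty"
    Allomorph("six", "six", True),
    Allomorph("ten", "ten", True),
    Allomorph("ten", "teen", False),
    Allomorph("ten", "ty", False),
]


def uniform_lexicon_size(forms):
    """Treat all variation as free: one item per morpheme."""
    return len({a.morpheme for a in forms})


def annotated_lexicon_size(forms):
    """One item per morpheme, plus one per unpredictable surface form."""
    morphemes = {a.morpheme for a in forms}
    irregular = {(a.morpheme, a.form) for a in forms if not a.predictable}
    return len(morphemes) + len(irregular)


if __name__ == "__main__":
    print(uniform_lexicon_size(ANNOTATED), annotated_lexicon_size(ANNOTATED))
```

Under the uniform treatment this toy lexicon has 4 items. Counting the unpredictable forms as things a speaker has to memorize raises it to 8, with no change in what the system can express.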

This is the general pattern: natural systems that face real tradeoffs do not optimize them. They satisfy constraints well enough to function, then accumulate idiosyncratic structure through historical contingency. The optimality is approximate at best, and the deviations from optimality are themselves informative — they encode the system's history, not its design.

The implication for linguistic evolution: numeral systems are not shaped primarily by communicative pressure toward efficiency. Other forces — historical inheritance, morphological analogy, cultural transmission — contribute structure that efficiency alone would not produce. The system works despite its inefficiency, not because of its optimization.