Configurational entropy, the entropy associated with the number of distinct spatial arrangements available to a system, is one of the most important quantities in statistical mechanics and one of the hardest to compute. Direct enumeration is infeasible for realistic systems: the number of configurations grows combinatorially with particle number. Free-energy methods such as thermodynamic integration work, but demand extensive sampling across thermodynamic states; specialized techniques like the two-phase thermodynamic method give accurate results but need careful setup and substantial compute.
Guo, Chang, and Corrente (arXiv 2602.22440, February 2026) propose a shortcut: compress the configuration. The computable information density (CID) — the length of the compressed representation of a molecular configuration divided by the original data size — provides an instantaneous estimate of configurational entropy. Ordered configurations compress well (low CID, low entropy). Disordered configurations compress poorly (high CID, high entropy). The compression ratio maps directly to the structural organization of the system.
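In code, the estimator fits in a few lines. The sketch below is an illustration under stated assumptions, not the paper's implementation: the occupancy-grid discretization, the bin count, and the choice of zlib as the compressor are stand-ins for whatever Guo et al. actually use, and the function name `cid` is ours.

```python
import zlib

import numpy as np


def cid(positions, box_length, bins=64):
    """Computable information density of a single configuration snapshot.

    Discretizes particle coordinates onto an occupancy grid, serializes
    the grid to bytes, and compresses it. CID is the compressed size
    divided by the raw size: low for ordered (repetitive) structures,
    higher for disordered ones.
    """
    dim = positions.shape[1]
    # Map continuous coordinates in [0, box_length) to integer grid cells.
    cells = np.floor(positions / box_length * bins).astype(int) % bins
    grid = np.zeros((bins,) * dim, dtype=np.uint8)
    grid[tuple(cells.T)] = 1  # a cell is occupied if any particle falls in it

    raw = grid.tobytes()
    return len(zlib.compress(raw, 9)) / len(raw)
```

Any lossless compressor yields an upper bound on the information content, so a stronger compressor (LZMA, say) would give a tighter but slower estimate; the qualitative ordering of configurations is what matters here.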
The approach requires no prior knowledge of which structural features are relevant. No order parameters need to be defined. No reference states need to be chosen. The compression algorithm discovers the structure — or its absence — automatically, by searching for repeating patterns in the discretized configuration. A crystal, with its periodic lattice, compresses dramatically. A liquid, with its short-range order but no long-range periodicity, compresses less. A gas, with its random positions, barely compresses at all.
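A toy comparison makes the point, reusing the `cid` sketch above: a simple cubic lattice versus the same number of uniformly random positions in the same box. The data here is synthetic and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
box = 10.0

# "Crystal": 1000 particles on a simple cubic lattice with spacing 1.0.
lattice = np.stack(np.meshgrid(*[np.arange(10.0)] * 3), axis=-1).reshape(-1, 3)

# "Gas": 1000 particles at uniformly random positions in the same box.
gas = rng.uniform(0.0, box, size=lattice.shape)

print(f"CID(crystal) = {cid(lattice, box):.3f}")  # low: the grid repeats
print(f"CID(gas)     = {cid(gas, box):.3f}")      # higher: no pattern to find
```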
The validation spans four systems: Lennard-Jones crystallization (CID drops sharply at the melting point), binary phase separation (CID tracks the growth of demixed domains), polymer chain dynamics (CID distinguishes coiled from extended conformations), and amorphous carbon networks at varying densities (CID captures the continuous structural evolution from graphite-like to diamond-like bonding). In each case, CID correlates with the known thermodynamic entropy and with standard structural order parameters — but is computed from a single configuration snapshot, not from ensemble averages.
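The single-snapshot aspect is what makes the method cheap in practice. A hypothetical workflow would evaluate `cid` once per frame and look for the jump; in the sketch below, `load_snapshot` is a synthetic stand-in for a real trajectory reader, with an arbitrary toy transition temperature, and the numbers it produces are toy data, not the paper's results.

```python
import numpy as np


def load_snapshot(T, box=10.0, seed=1):
    """Synthetic stand-in for an MD snapshot loader (hypothetical):
    a noisy lattice below the toy 'melting point', random positions above."""
    rng = np.random.default_rng(seed)
    lattice = np.stack(np.meshgrid(*[np.arange(10.0)] * 3), axis=-1).reshape(-1, 3)
    if T < 0.7:  # arbitrary toy transition temperature
        return (lattice + rng.normal(0.0, 0.1 * T, lattice.shape)) % box
    return rng.uniform(0.0, box, lattice.shape)


for T in (0.4, 0.5, 0.6, 0.7, 0.8, 0.9):
    print(f"T = {T:.1f}  CID = {cid(load_snapshot(T), 10.0):.3f}")
```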
The compression-entropy connection is not a metaphor. Shannon entropy and thermodynamic entropy are the same quantity measured in different units: bits versus joules per kelvin. A configuration's compressed size estimates its information content; its information content measures its statistical weight; and the logarithm of its statistical weight is, by Boltzmann's formula, its entropy. The chain is exact for ideal systems and approximate but useful for interacting ones, where the compressor picks up both local and long-range correlations.
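The unit conversion is a fixed constant: one bit of Shannon entropy corresponds to k_B ln 2 joules per kelvin, since S = k_B ln W while H = log2 W. A two-line check (standard physics, nothing specific to the paper):

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact by SI definition)


# S = k_B * ln(W) and H = log2(W) bits, so S = (k_B * ln 2) * H.
def shannon_bits_to_joules_per_kelvin(h_bits):
    return K_B * math.log(2) * h_bits


print(shannon_bits_to_joules_per_kelvin(1.0))  # ~9.57e-24 J/K per bit
```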
The method makes entropy a structural observable — something you can compute from a snapshot, not just from a trajectory.