The Transported Crystal

2026-02-26

Materials discovery by analogy, meaning the search for new materials similar to known good ones, is one of the oldest strategies in chemistry. The difficulty is defining "similar." Two crystals can share the same structure but differ in composition, or share the same composition but differ in structure. A fluorite with one set of atoms behaves differently from a fluorite with another. Any useful notion of similarity must balance structural geometry against chemical identity, and there is no canonical way to set the balance.

Walker et al. propose using the Fused Gromov-Wasserstein (FGW) metric, a tool from optimal transport theory. The idea: represent each crystal as a graph (atoms as nodes, bonds as edges), and define the distance between two crystals as the cost of optimally transporting one graph into the other. The FGW metric solves this simultaneously for the graph structure (via Gromov-Wasserstein, which compares internal distance matrices) and the node attributes (via Wasserstein, which compares atomic properties). A single hyperparameter controls the balance between structural and compositional similarity.
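For readers who want the formula behind that description, here is the FGW objective in its usual textbook form with a squared loss. Whether Walker et al. use exactly this loss, this normalization, or this direction for the balance parameter is an assumption; the symbols are mine. C_1 and C_2 are the pairwise distance matrices inside each graph, a_i and b_j are node attributes, p and q are the node weights, and alpha is the balance parameter.

```latex
\mathrm{FGW}_{\alpha}(G_1, G_2)
  = \min_{\pi \in \Pi(p, q)}
    \sum_{i,j,k,l}
    \Big[ (1-\alpha)\, d(a_i, b_j)^2
        + \alpha\, \big| C_1(i,k) - C_2(j,l) \big|^2 \Big]\,
    \pi_{ij}\, \pi_{kl},
\qquad
\Pi(p, q) = \big\{ \pi \ge 0 : \pi \mathbf{1} = p,\ \pi^{\top} \mathbf{1} = q \big\}
```

In this convention alpha = 0 compares node attributes only (plain Wasserstein) and alpha = 1 compares structure only (plain Gromov-Wasserstein). Everything in between is a mix of the two.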

The method performs comparably to graph neural network embeddings trained on over a million materials, but without a comparable training set. Optimal transport supplies the distance function directly; nothing is learned beyond the choice of balance parameter.

Applied to photovoltaic discovery, the approach searches the Materials Project database for crystals similar to known high-efficiency absorbers. Seven previously unexplored candidates emerge, with the most promising being Cs₅Sb₈: a thermodynamically stable compound with a predicted spectroscopic limited maximum efficiency (SLME) exceeding 30%. On paper, this places it in the same performance range as the best perovskite solar cells, using relatively abundant elements.
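The search step itself is simple once a distance exists. The sketch below ranks candidates by distance to their nearest reference material. It is not the authors' pipeline: the function name, the labels, and the distance values are all made up for illustration, and a real search would presumably add filters (stability, band gap) that are not shown here.

```python
def rank_candidates(distances, references, top_k=3):
    """Rank candidates by the distance to their nearest reference material.

    `distances[candidate][reference]` is a precomputed crystal-to-crystal
    distance (for example an FGW value); smaller means more similar.
    """
    ranked = []
    for candidate, row in distances.items():
        nearest = min(references, key=lambda ref: row[ref])
        ranked.append((row[nearest], candidate, nearest))
    ranked.sort()  # ascending: most similar first
    return ranked[:top_k]


# Made-up distances, purely illustrative (not results from the paper).
distances = {
    "cand_a": {"ref_1": 0.42, "ref_2": 0.10},
    "cand_b": {"ref_1": 0.05, "ref_2": 0.61},
    "cand_c": {"ref_1": 0.90, "ref_2": 0.75},
}
shortlist = rank_candidates(distances, ["ref_1", "ref_2"], top_k=2)
for dist, candidate, nearest in shortlist:
    print(f"{candidate}: {dist:.2f} from {nearest}")
```

Keeping the nearest reference alongside each hit matters: it says which known absorber a candidate resembles, which is the thing a chemist would want to check first.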

The mathematical framework makes the search reproducible and interpretable. Unlike a neural network embedding, where the notion of similarity is implicit in the weights, the optimal transport distance has a clear meaning: it is the cost of the cheapest plan for matching the atoms of one crystal to those of another, where cost counts both how much the matching distorts the distances between atoms and how much chemical substitution it requires. When the metric says two crystals are close, you can read off in what sense they are close: the transport plan shows which atoms correspond to which, and where the structure has to deform to make them fit.
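To make the "which atoms correspond to which" point concrete, here is a toy sketch in plain Python. It evaluates an FGW-style cost (squared losses) for a given matching and brute-forces the best one-to-one matching between two three-atom graphs. Several things here are my assumptions rather than the paper's method: distances are hop counts along bonds, each atom carries a single made-up scalar feature, and the search is restricted to one-to-one matchings, which only gives an upper bound on the true FGW value (real solvers optimize over soft couplings).

```python
from itertools import permutations


def fgw_cost(C1, C2, F1, F2, plan, alpha=0.5):
    """FGW-style objective (squared losses) for a given coupling `plan`."""
    n, m = len(C1), len(C2)
    # Feature term: attribute mismatch weighted by transported mass.
    feat = sum(plan[i][j] * (F1[i] - F2[j]) ** 2
               for i in range(n) for j in range(m))
    # Structure term: mismatch between pairwise distances under the plan.
    struct = sum(plan[i][j] * plan[k][l] * (C1[i][k] - C2[j][l]) ** 2
                 for i in range(n) for j in range(m)
                 for k in range(n) for l in range(m))
    return (1 - alpha) * feat + alpha * struct


def best_matching(C1, C2, F1, F2, alpha=0.5):
    """Brute-force the cheapest one-to-one matching (equal sizes only)."""
    n = len(C1)
    best = None
    for perm in permutations(range(n)):
        # Atom i of graph 1 sends all of its mass (1/n) to atom perm[i].
        plan = [[1.0 / n if perm[i] == j else 0.0 for j in range(n)]
                for i in range(n)]
        cost = fgw_cost(C1, C2, F1, F2, plan, alpha)
        if best is None or cost < best[0]:
            best = (cost, perm)
    return best


# Toy chain X-Y-X, listed in two different atom orders (hypothetical data).
C_a = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]   # Y is atom 1
F_a = [1.0, 2.0, 1.0]
C_b = [[0, 1, 1], [1, 0, 2], [1, 2, 0]]   # Y is atom 0
F_b = [2.0, 1.0, 1.0]

cost, perm = best_matching(C_a, C_b, F_a, F_b)
print(cost, perm)

# Same structure, different middle atom: only the feature term is nonzero,
# so the cost depends entirely on where the balance parameter sits.
identity = [[1 / 3 if i == j else 0.0 for j in range(3)] for i in range(3)]
F_sub = [1.0, 3.0, 1.0]
for alpha in (0.0, 0.5, 1.0):
    print(alpha, fgw_cost(C_a, C_a, F_a, F_sub, identity, alpha))
```

The first result recovers the relabeling: the matching with zero cost pairs the two Y atoms with each other. The second loop shows the balance parameter at work, since a pure substitution is invisible when only structure is compared and fully counted when only attributes are.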

Whether Cs₅Sb₈ or any of the other candidates will actually work as photovoltaic absorbers requires synthesis and characterization. SLME is a theoretical upper bound, not a measured efficiency. But the search strategy — principled similarity measure, search near known good, validate computationally before synthesizing — is sound. The optimal transport distance adds mathematical rigor to a process that has historically relied on chemical intuition.