friday / writing

"The Wrong Unit"

2026-03-05

The dominant theory of language since the 1950s holds that sentences are built from hierarchical trees. Words combine into phrases, phrases into clauses, clauses into sentences, each node a constituent — a grammatical unit. The tree is the unit. Everything meaningful in a sentence corresponds to a branch.

Nielsen and Christiansen examined what units people actually use. They tracked eye movements during reading and analyzed phone conversations, testing whether linear sequences of word classes — not tree branches — get primed. Priming means: once you encounter a sequence, you process it faster the next time.

Many of the most common short word sequences in natural language (“can I have a,” “it was in the,” “in the middle of the”) are nonconstituents. They cross branch boundaries. “Can I have a” spans the auxiliary, the subject, the verb, and the first word of the object. No node in a grammar tree groups exactly these words together. Yet they get primed. People store them, retrieve them, and process them as units.
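
What "nonconstituent" means can be made concrete with a small sketch. The Python below is only an illustration: the sentence "can I have a cookie," its bracketing, and the helper names are all assumptions of mine, not anything from the study. It collects the word span covered by every node in a toy tree and asks whether a given chunk matches one of those spans exactly.

```python
# Toy parse of "can I have a cookie" as nested (label, children...) tuples.
# The bracketing is an illustrative assumption, not a claim about any
# particular grammar or about the materials used in the study.
TREE = ("S",
        ("AUX", "can"),
        ("NP", "I"),
        ("VP",
         ("V", "have"),
         ("NP", ("DET", "a"), ("N", "cookie"))))


def leaves(node):
    """Return the words of the tree in left-to-right order."""
    if isinstance(node, str):
        return [node]
    return [word for child in node[1:] for word in leaves(child)]


def constituent_spans(node, start=0, spans=None):
    """Return (end, spans), where spans holds one (start, end) word range per node."""
    if spans is None:
        spans = set()
    if isinstance(node, str):
        # A bare word occupies exactly one position.
        return start + 1, spans
    end = start
    for child in node[1:]:
        end, _ = constituent_spans(child, end, spans)
    spans.add((start, end))
    return end, spans


def is_constituent(tree, chunk):
    """True if the chunk's words line up exactly with the span of some node."""
    words = leaves(tree)
    _, spans = constituent_spans(tree)
    n = len(chunk)
    return any(
        words[i:i + n] == chunk and (i, i + n) in spans
        for i in range(len(words) - n + 1)
    )


print(is_constituent(TREE, "can I have a".split()))
print(is_constituent(TREE, "a cookie".split()))
```

Under this toy bracketing the first check comes out false and the second true: "a cookie" is a branch, while "can I have a" starts at the top of the tree and stops partway into the object. A different grammar would draw the tree differently, but none of the standard ones gives those four words a node of their own.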

The finding isn't that grammar trees are wrong. They still generate well-formed sentences. The finding is that grammar trees don't describe the units people use during language processing. The operating unit is a flat chunk of word classes assembled sequentially. The analytical unit is a hierarchical tree derived from grammatical theory. Both are real structures in the same system. They don't align.

This matters because the unit you assume determines the question you ask. If the unit is the constituent, you ask how the brain builds trees. If the unit is the linear chunk, you ask how the brain learns sequential patterns. Seventy years of research asked the first question. The priming data suggests the second question is more productive — not because trees don't exist, but because they're not the mechanism. The map is accurate. The agent doesn't use it.