Figure: Illustrative derivation trees for (a) a simple English sentence and (b) an RNA secondary structure (after [6]). The latter is a derivation of the sequence 'gacuaagcugaguc' and shows its folded structure. Terminal symbols are encircled. Credit: arXiv:1809.01201 [cond-mat.dis-nn]

Eric DeGiuli, a physicist at École Normale Supérieure, has proposed that a human language grammar can be viewed as if it were a physical object, allowing theories from statistical mechanics to explain how a child learns a language. In his paper published in the journal Physical Review Letters, he describes his ideas and his hope that they might one day be connected with neurological evidence.

Most parents notice that children learn a language in a standard sort of way: they pick up words as labels for things, and then one day they start assembling the words they have learned into sentences. Linguists have noticed that the changeover from speaking words to speaking sentences is usually quite abrupt, leading many in the brain sciences to wonder what actually happens. In this new effort, DeGiuli proposes a model to explain the process and uses physics theories to do it.

DeGiuli starts out by suggesting that a context-free grammar (CFG), a formalism that covers most human languages, can be viewed as a physical object, something concrete enough to exist inside the heads of people who are able to speak a language. He further proposes that a CFG can be modeled as a physical tree (not just a virtual one such as those typically used to describe CFGs), with its surface representing all the sentences that can be built from the words a person knows, whether they make sense or not. He then suggests that as a child hears new words and processes them, they begin to build grammar rules in their brain, some of which lie deeper in the tree than others.
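To make the tree picture concrete, here is a minimal sketch, a toy illustration rather than code from the paper: a tiny CFG is expanded at random, the expansion forms a derivation tree, and reading off the leaves gives the sentence at the tree's surface. The grammar and symbol names are invented for this example.

```python
import random

# Toy CFG: each nonterminal maps to a list of possible expansions.
# Symbols not listed here are terminals (words at the surface).
RULES = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "N":  [["dog"], ["cat"]],
    "V":  [["sees"], ["sleeps"]],
}

def derive(symbol, depth=0):
    """Expand `symbol` recursively, printing the derivation tree;
    the returned leaves spell out the surface sentence."""
    print("  " * depth + symbol)
    if symbol not in RULES:                    # terminal: a leaf on the surface
        return [symbol]
    expansion = random.choice(RULES[symbol])   # pick one rule uniformly
    leaves = []
    for child in expansion:
        leaves.extend(derive(child, depth + 1))
    return leaves

print(" ".join(derive("S")))   # e.g. "the dog sees the cat"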

Depth comes into play as the brain assigns weights to different rules; those with more weight are deemed more likely to lead to sentences that make sense. It is at this point that DeGiuli introduces theories from statistical mechanics into his proposal to explain how the weighting process works. He believes it is possible that the brain uses two major factors to decide how to prune branches in the tree: how much a given weighting shapes the deep structure of the tree, and how much it shapes what appears at the surface. In the end, he notes, it is the sparseness of the tree that defines how usable it is for forming sentences. When the sparseness reaches a certain point, the tree suddenly becomes usable and the child begins spouting complete sentences at his or her parents.
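The weighting idea can also be illustrated in miniature. In the sketch below, the weights and the sharpness parameter `beta` are assumptions made for illustration, not DeGiuli's notation or model: with flat weights every rule stays in play, while sharpening the contrast between weights leaves only a sparse set of dominant rules, loosely echoing the abrupt shift from a disordered grammar to a usable one.

```python
import random
from collections import Counter

def sharpen(weights, beta):
    """Turn raw rule weights into probabilities proportional to weight**beta.
    beta is an illustrative 'sharpness' knob: beta = 0 gives uniform,
    disordered choices; a large beta concentrates probability on the
    heaviest rules, effectively pruning the rest."""
    raised = [w ** beta for w in weights]
    total = sum(raised)
    return [r / total for r in raised]

weights = [5.0, 1.0, 0.5, 0.1]   # hypothetical weights for four competing rules

for beta in (0.0, 1.0, 5.0):
    probs = sharpen(weights, beta)
    picks = Counter(random.choices(range(len(weights)), probs, k=10_000))
    in_use = sum(1 for count in picks.values() if count > 100)  # >1% of draws
    print(f"beta={beta}: probabilities={[round(p, 3) for p in probs]}, "
          f"rules effectively in use: {in_use}")
```

Running this shows all four rules in use at beta = 0 and only one at beta = 5: the same rule set, made sparse purely by how the weights are applied.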

More information: E. DeGiuli, "Random Language Model," Physical Review Letters (2019). DOI: 10.1103/PhysRevLett.122.128301. arXiv: arxiv.org/abs/1809.01201
