Qur’an Memorizers Who Don’t Speak Arabic Learn Grammar from Statistics Alone

How do we learn what a word is? How do we come to know the difference between words, and how they fit into the categories of grammar? Some of that knowledge comes from explicit instruction. The people around us point out things as we see them (“Doggie! Kitty!”), or name actions as we perform them (“Walk to Mama!”), but most of our learning isn’t so explicit. One theory suggests that we absorb grammar from the statistical regularities in the speech around us: over time, we learn what goes with what, or what can be swapped out for what, simply by hearing enough speech to deduce the pattern.

Of course, that statistical learning doesn’t happen in a vacuum. We are also living and interacting in the world, where words have meaning. So how much learning is connected to meaning and how much comes from pattern recognition? Researchers have tried to separate the two processes in the lab by creating artificial grammars with made-up words, or even tones, and seeing how much of the pattern people can deduce without explicit instruction or any connection to meaning.

The answer is: quite a bit. People, even as babies, are good at pulling out grammatical structure from patterned data. But the artificial learning experiments are necessarily small and limited, so it’s unclear how much they can tell us about language learning in the real world.

As it turns out, there has been a large-scale natural test of statistical learning out there all along in the practice of Qur’anic memorization. There are Muslims all over the world who do not speak Arabic (in Indonesia, Pakistan, and Turkey, for example), but who as part of religious practice memorize the Qur’an for recitation, often starting as children and continuing memorization training for years. That training is often unaccompanied by any explicit Arabic instruction or direct translation of the memorized text. They get the statistics of the pattern without the meaning.

A recent paper in Cognition by Fathima Manaar Zuhurudeen and Yi Ting Huang takes advantage of this “natural experiment” to test whether simple exposure to the patterned properties of the Classical Arabic in the Qur’an results in implicit grammatical knowledge. They compared four groups: memorizers who also had classroom Arabic lessons, memorizers with no classroom exposure, non-memorizers with classroom exposure, and a group with no Arabic exposure of any kind.

The groups that had classroom experience had explicitly learned things like what the first person pronoun “I” looks like and how it attaches to verbs, or what the second person possessive pronoun “your” looks like and how it attaches to nouns. The group without classroom experience but with memorization training had never had these things explained. Had they absorbed the rules governing these pronouns simply by hearing and repeating them in memorized text?

Yes. The memorizers without classroom Arabic did better than any of the other groups at demonstrating knowledge of the rules. This knowledge was not explicit; they could not explain how pronouns, verbs, and nouns worked, but they could accurately judge whether a sentence they had not heard before was correct.

Surprisingly, the memorizers with no classroom Arabic did better than those who had had lessons, suggesting that a “top-down approach” that explains the rules of language “may negatively impact learners’ sensitivity to the bottom-up statistics of a language.” Does that mean it’s time to ditch language classes altogether and just start memorizing? Not quite. The memorizers with no classroom lessons had a good subconscious grasp of Arabic grammar principles, but could not speak or understand Arabic. Still, the study shows we can absorb sophisticated patterns from language exposure without really knowing what we’re hearing. So go ahead and put on that Spanish radio station or memorize some Chinese poetry. It can’t hurt, and it just might help. In fact, it probably will.