Zipf’s law

What are the hallmarks of humanity? What trait exemplifies the raw creativity and ingenuity of human existence? Language comes to mind as an obvious answer. The ability to produce strings of complex, meaningful information with nothing but our vocal cords is astonishing. Languages across the world showcase humanity’s natural gift; there may be nothing quite as organic as the ebb and flow of speech.

But is it really all that pure? Is language truly free-flowing and unregulated? An obscure mathematical principle from the years after World War II suggests otherwise. Zipf’s Law was formulated in 1949 by the linguist George Zipf. After studying word frequencies in languages across the globe, he realized that a select few terms are used disproportionately more often than all the others.

This hardly seems surprising; we all know that the word “the” is said or written far more often than the word “atrocious.” In fact, I’ve already used “the” 10 times in the scant 170 words of this article so far.

"Moby Dick" is a great example of Zipf's Law of word frequency. Graphic from Search Engine Land

In the English language, “the” is the most uttered and inscribed term of all. That, however, is not the remarkable aspect of Zipf’s Law. When you plot the frequencies of the most commonly used words in almost any language, modern or ancient, you’re faced with a near-perfect power law: a word’s frequency is roughly inversely proportional to its rank. The most used word appears about twice as often as the second most popular term, about three times as often as the third, and so on down the line.
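To make the “twice as often, three times as often” pattern concrete, here is a minimal Python sketch. It counts the words in a text, ranks them, and prints each observed count next to the count an ideal Zipf curve would predict (the top count divided by the rank). The function names `rank_frequencies` and `zipf_prediction`, the simple tokenizing pattern, and the toy sample sentence are all assumptions made for illustration; a real test would use a long text such as a whole novel.

```python
import re
from collections import Counter


def rank_frequencies(text):
    """Return (word, count) pairs sorted from most to least frequent."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common()


def zipf_prediction(top_count, rank):
    """Count an ideal Zipf curve predicts at this rank: top_count / rank."""
    return top_count / rank


# Toy sample, far too short to show the law properly; illustration only.
sample = "the cat and the dog and the bird saw the fox"
ranked = rank_frequencies(sample)
top_count = ranked[0][1]

for rank, (word, count) in enumerate(ranked[:3], start=1):
    predicted = zipf_prediction(top_count, rank)
    print(f"rank {rank}: {word!r} observed {count}, predicted {predicted:.2f}")
```

On a large body of text, the observed and predicted columns tend to track each other closely for the most common words, which is the pattern described above.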

The implications of Zipf’s Law are immense: humans’ very speech patterns, a trait thought to be so inspired and organic that no cold logic could touch it, are in fact rooted in the heart of mathematical theory. Confusing the situation even more, the cause of this power law is unknown.

No one fully understands why human verbal and written communication since the dawn of Broca’s area has followed the smooth slope of Zipf’s Law, slipping down by halves, thirds, quarters, and fifths until it comes to rest in a long tail of obscure vocabulary.
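That “smooth slope” can be measured. On a log-log plot, a sequence that falls by halves, thirds, and quarters becomes a straight line with a slope of about -1. The sketch below estimates that slope with an ordinary least-squares fit, which is one simple approach and not necessarily the most statistically careful one. The function name `zipf_exponent` and the made-up list of ideal counts are assumptions for illustration.

```python
import math


def zipf_exponent(counts):
    """Estimate the Zipf exponent from a list of word counts.

    Fits a least-squares line to log(count) versus log(rank) and returns
    the slope with its sign flipped, so an ideal Zipf curve gives about 1.
    Assumes at least two counts, all strictly positive.
    """
    counts = sorted(counts, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return -covariance / variance


# Made-up ideal data: the top word appears 1200 times, rank r appears 1200 / r times.
ideal = [1200 / rank for rank in range(1, 51)]
print(round(zipf_exponent(ideal), 3))  # prints 1.0
```

Feeding in real word counts, for instance the counts from a ranked word list, would show how closely a given text hugs that ideal slope.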

In a day and age when there appear to be no more mysteries, and the elusive has become obvious, it’s rare to stumble upon a problem with no answer.

Some of the brightest minds of the past 75 years have been hacking away at enigmas such as Zipf’s Law to no avail; but maybe that is the rightful emblem of humanity: persistence in the face of defeat and the undaunted curiosity of the human mind.