Zipf it! Word frequency and line drawing


Every once in a while, an academic field will see an errant publication leap the great divide onto the mainstream bestseller lists. This happened with Lynne Truss's 2003 smashseller Eats, Shoots & Leaves, which started with a decent joke and left neurotic apostropharians twitching in its wake at the horrors of modern life.
Now it’s the turn of mathematics, with Alex Bellos’s Alex Through the Looking-Glass. One interesting area the book covers is the crossover between language and mathematics in corpus linguistics, in particular Zipf’s Law, whereby a word’s frequency in a text is roughly inversely proportional to its rank. The most frequently used word appears about twice as often as the second most frequent word, three times as often as the third, and so on…
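For anyone who wants to try this on a text of their own, here is a minimal sketch in Python. It assumes the usual idealised form of Zipf’s Law (count roughly equal to the top word’s count divided by rank), and the function names are my own invention rather than anything from the book. Real texts only follow the pattern approximately.

```python
from collections import Counter
import re


def rank_frequencies(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    # Crude tokenisation: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    ranked = Counter(words).most_common()
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ranked, start=1)]


def zipf_prediction(top_count, rank):
    """Idealised Zipf count for a rank: the top word's count divided by rank."""
    return top_count / rank


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    table = rank_frequencies(sample)
    top_count = table[0][2]
    for rank, word, count in table:
        # Compare the observed count with the idealised prediction.
        print(rank, word, count, round(zipf_prediction(top_count, rank), 2))
```

A sentence this short proves nothing, of course. The comparison only becomes interesting with a novel-length text or a proper corpus.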
With this in mind, we should consider the use of lists of high-frequency words such as the Academic Word List. The AWL is undoubtedly a hugely useful resource for teachers, one which I have used many times, but given Zipf’s Law, how useful is it for students?
If a word’s frequency drops that sharply as you move down a list, how quickly are words relegated to obscurity? With, by some counts, over a million words in English, wouldn’t learners, feverishly trying to come up with mnemonic aids to commit new vocabulary to memory, be better off sticking only to the top 1,000?
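The question can be put into rough numbers. The sketch below assumes an idealised Zipf distribution (frequency proportional to 1/rank) and takes the million-word figure at face value. Both are assumptions made for illustration, and the function name is hypothetical.

```python
def zipf_coverage(top_k, vocabulary_size):
    """Share of running text covered by the top_k words.

    Assumes frequency is proportional to 1/rank, so the share is a
    ratio of two harmonic sums.
    """
    def harmonic(n):
        return sum(1.0 / rank for rank in range(1, n + 1))

    return harmonic(top_k) / harmonic(vocabulary_size)


if __name__ == "__main__":
    share = zipf_coverage(1_000, 1_000_000)
    print(f"Top 1,000 words cover about {share:.0%} of running text")
```

Under those assumptions the top 1,000 words account for roughly half of all running text, which leaves the other half to the long tail. Real corpora stray from the ideal curve, so treat the figure as a ballpark.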
One answer is ‘no, probably not’. It has been claimed that the average four-year-old has a working vocabulary of 4,000 words, and most second language learners aspire to communicate at a level higher than a nipper in nursery classes. It is also worth considering word families. ‘Have’ appears in everyday language much more frequently than ‘had’, but aren’t they just morphological variations of the same word? As with all statistics, it depends on how you report them.
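To see how much the reporting matters, here is a small sketch that counts by word family rather than by individual form. The mapping is a tiny hand-made example of my own, not a real word-family list, and a serious attempt would need a proper lemmatiser.

```python
from collections import Counter

# Hypothetical, hand-made mapping from inflected forms to a headword.
FAMILY = {"has": "have", "had": "have", "having": "have"}


def family_counts(words):
    """Count words after folding each known form into its headword."""
    return Counter(FAMILY.get(word.lower(), word.lower()) for word in words)


if __name__ == "__main__":
    tokens = ["Have", "had", "has", "cat"]
    print(family_counts(tokens))
```

Counted separately, ‘have’, ‘had’ and ‘has’ occupy three places on the list. Counted as a family, they occupy one, with a combined total.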
As with all language teaching, the tools depend on the learners’ needs. Word frequency lists are fantastic tools for learners but they need to be contextualised in order for the learning to be meaningful.