
Zipf's Law




What if I told you that, using one simple formula, I can estimate how many times any word appears in this article, in a book, or even across the entire internet?
Zipf’s Law lets you do exactly that, with math that even a second grader can understand.

The law states that “Given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table.”
What this essentially means is that the nth most common word will occur about x times, where

Formula

x = (number of times the most common word is used) / n

This is extremely powerful, mainly because any language, be it English, German, French or even one that we have not been able to decipher yet, seems to follow the same peculiar pattern.

The most used word occurs about twice as often as the second most used word, three times as often as the third, four times as often as the fourth, and so on.

In the case of English, “The” happens to be the most common word, “of” comes in the second place, “and” in the third.


Now, “the” accounts for about 7% of the total words.
Going by the above formula, “of” should account for about 7%/2 = 3.5% of the total words, and this actually holds true!

“And” should account for about 7%/3 = 2.3% of the total words, and the actual figure is close to that, at a little under 3%.

Numerically, “the” occurs 69,971 times, “of” occurs 36,411 times and “and” occurs 28,852 times, out of a total of about a million words.


Go ahead, do the math and verify it for yourself!
(All the above readings are from the Brown Corpus of American English text)
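
If you would rather let a computer do the math, here is a minimal Python sketch. It uses only the three Brown Corpus counts quoted in this article. The function name `zipf_prediction` and the variable names are made up for illustration and do not come from any library.

```python
# Counts quoted above from the Brown Corpus, listed in rank order.
brown_counts = [("the", 69971), ("of", 36411), ("and", 28852)]


def zipf_prediction(top_count, rank):
    """Zipf's estimate for the word at a given rank: top count divided by rank."""
    return top_count / rank


top_count = brown_counts[0][1]

for rank, (word, actual) in enumerate(brown_counts, start=1):
    predicted = zipf_prediction(top_count, rank)
    # Relative gap between the real count and Zipf's estimate, as a percentage.
    gap = 100 * (actual - predicted) / predicted
    print(f"{rank}. {word!r}: actual {actual}, predicted {predicted:.0f}, gap {gap:+.1f}%")
```

With these numbers, the real count for “of” is roughly 4% above the estimate and the one for “and” is roughly 24% above it, so the law is a good rule of thumb rather than an exact rule.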


Now comes the really fun part,

 GRAPHS!


The graph below shows the number of occurrences of the 7 most popular words in the Brown Corpus.

Notice the curve!



Also, for anybody who knows logarithms, it gets even more interesting.

The graph below is a log-log chart.

It shows the same data, but with both axes on a logarithmic scale, where a Zipf-like distribution falls close to a straight line.
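
Why bother with logs? If the count at rank n is (top count) / n, then log(count) = log(top count) - log(n), which is a straight line with slope -1 on a log-log chart. The rough Python sketch below checks this with a plain least-squares fit. The function name `loglog_slope` and the "ideal" list are my own illustrative choices, and three data points are far too few for a serious estimate.

```python
import math


def loglog_slope(counts):
    """Least-squares slope of log(count) against log(rank), ranks starting at 1."""
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# A made-up, perfectly Zipfian list (count = 1200 / rank) gives a slope of -1.
ideal = [1200 / rank for rank in range(1, 7)]
print(round(loglog_slope(ideal), 3))

# The three Brown Corpus counts quoted in this article: "the", "of", "and".
brown = [69971, 36411, 28852]
print(round(loglog_slope(brown), 2))
```

For the three Brown Corpus counts the slope comes out at about -0.8, close to the ideal -1 but not exactly on it.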




Moreover, this law is not limited to languages. It also roughly holds for
  • Page hits of a web page versus its popularity rank
  • City size versus population rank
  • Corporation size versus corporation ranking
  • Net income versus income ranking
  • TRP of a TV channel versus its popularity rank


It is as if this law is everywhere, as though it has been built into human brains and psychology.

The law is among the most mysterious, and there is no agreed-upon reason why this happens. There are a bunch of theories all over the internet, but none does justice to the law, and hence they have not been included in this article.

This law does not stand alone. Just like any other principle or law in math, it is connected to many others, some of which are

1. Benford’s Law
2. Pareto principle

All of those will be covered in the posts that follow. Zipf’s law does not end here…
PS: For the above content:
 “The” comes 48 times out of 527 words (approx. 9%)
 “Of” comes 24 times (exactly half, approx. 4.5%)
 “And” comes 15 times (approx. one third, about 3%)

While the sample size is very small, the law still holds true.
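
To run the same count on any text of your own, a small Python sketch like the one below will do. It is one possible approach, assuming that a word is simply a run of letters and straight apostrophes, compared without regard to case. The function name `top_words` and the sample sentence are made up for illustration.

```python
import re
from collections import Counter


def top_words(text, how_many=3):
    """Return the most common words in text with their counts, ignoring case."""
    # Assumption: a "word" is a run of letters and straight apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(how_many)


sample = "The cat and the dog saw the end of the road and the start of the hill."
for rank, (word, count) in enumerate(top_words(sample), start=1):
    print(rank, word, count)
```

Paste a longer text into `sample` and compare the counts at ranks 2 and 3 with half and a third of the count at rank 1.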
Need I justify this law any further?


Enjoy your high school with - High School Pedia : www.highschoolpedia.com


