
Zipf's Law




What if I told you that with a simple formula, I can estimate the number of times any word appears in this article, in a book, or even across the entire internet?
Zipf’s Law allows you to do exactly that, with math that even a second grader can understand.

The law states that “Given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table.”
What this essentially means is that the nth most common word will occur about x times, where

Formula

x = (number of times the most common word is used) / n

This is extremely powerful, mainly because every language, English, German, French and reportedly even the ones we have not been able to decipher yet, follows the same peculiar pattern.

The most used word appears about twice as often as the second most used word, three times as often as the third, four times as often as the fourth, and so on.

In the case of English, “the” happens to be the most common word, “of” comes in second place, and “and” in third.


Now “the” accounts for about 7% of the total words.
So going by the above formula, “of” should account for about 7/2 = 3.5% of the total words, and this actually holds true!

“And” should account for about 7/3 ≈ 2.3% of the total words. The observed share is a little higher, close to 2.9%, but still in the same ballpark.

Numerically, “the” appears 69,971 times, “of” appears 36,411 times, and “and” appears 28,852 times. These are out of a total of about a million words.


Go ahead, do the math and verify it for yourself!
(All the above readings are from the Brown Corpus of American English text)
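
For anyone who would rather let a computer do that math, here is a minimal Python sketch. It uses only the three Brown Corpus counts quoted above; the function name zipf_estimate is made up for this illustration.

```python
# Brown Corpus counts quoted in the article, in rank order.
observed = [("the", 69971), ("of", 36411), ("and", 28852)]


def zipf_estimate(top_count, rank):
    """Zipf's estimate for the word at a given rank: top_count / rank."""
    return top_count / rank


top_count = observed[0][1]  # count of the most common word

# Pair every word with its observed count and its Zipf estimate.
comparison = [
    (word, count, zipf_estimate(top_count, rank))
    for rank, (word, count) in enumerate(observed, start=1)
]

for word, count, estimate in comparison:
    print(f"{word}: observed {count}, Zipf estimate {estimate:.0f}")
```

The estimate for “of” (roughly 35,000) lands close to the observed 36,411, while the estimate for “and” (roughly 23,300) is noticeably below the observed 28,852. The law is a rule of thumb, not an exact count.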


Now comes the really fun part,

 GRAPHS!


The graph below shows the number of occurrences of the 7 most popular words in the Brown Corpus.

Notice the curve!



Also, for anybody who knows logarithms, it gets even more interesting.

The graph below is a log-log chart.

It shows the same data, but with both axes on a logarithmic scale, where data that follows Zipf’s Law falls close to a straight line.
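
Why does a log-log chart help? Taking the log of x = C / n gives log x = log C - log n, so a perfect Zipf curve becomes a straight line with slope -1. The Python sketch below checks this with a least-squares fit. The function name loglog_slope and the perfectly Zipfian list are made up for this illustration, and three Brown Corpus words are far too few for a serious fit.

```python
import math


def loglog_slope(counts):
    """Least-squares slope of log(count) against log(rank), ranks starting at 1."""
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# A made-up, perfectly Zipfian list: count = 1000 / rank.
ideal_counts = [1000 / rank for rank in range(1, 8)]
ideal_slope = loglog_slope(ideal_counts)

# The three Brown Corpus counts quoted in the article.
brown_slope = loglog_slope([69971, 36411, 28852])

print(f"ideal slope: {ideal_slope:.2f}")
print(f"Brown slope (3 words only): {brown_slope:.2f}")
```

A slope near -1 means the data is close to Zipf’s Law; the further from -1, the worse the fit.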




Moreover, this law is not limited to languages. A similar pattern shows up in:
  • Page hits versus the popularity rank of web pages
  • City sizes versus the population rank of cities
  • Corporation sizes versus corporation ranking
  • Net income versus income ranking
  • TRP of TV channels versus their popularity rank


It is as if this law is everywhere, almost as though it were built into the human brain and psychology.

The law is among the most mysterious, and there is no single agreed-upon reason why it happens. There are a bunch of theories all over the internet, but none of them does justice to the law, so they have not been included in this article.

This law does not stand alone. Like any other principle or law in math, it is connected to many others, and some of them are

1.       Benford’s Law
2.       Pareto principle

Both of these will be covered in the posts that follow. Zipf’s law does not end here…
PS: For the article above:
 “The” appears 48 times out of 527 words (approx. 9%)
 “Of” appears 24 times (exactly half of that, approx. 4.5%)
 “And” appears 15 times (roughly a third, approx. 3%)

Even though the sample size is very small, the law still holds up.
Need I justify this law any further?
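
If you want to run the same kind of count on a text of your own, here is a minimal Python sketch. The function name top_words, the word-splitting rule and the sample sentence are all assumptions made for this illustration; they are not the method used to get the numbers in the PS.

```python
import re
from collections import Counter


def top_words(text, k=3):
    """Return the k most frequent words in text as (word, count) pairs."""
    # Lowercase everything and treat runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(k)


# A tiny made-up sample; paste in any longer text to try it out.
sample = "the cat and the dog and the bird sat by the side of the road"
ranking = top_words(sample)

for rank, (word, count) in enumerate(ranking, start=1):
    print(rank, word, count)
```

Different splitting rules (for example, how hyphens and apostrophes are handled) will give slightly different totals, so small mismatches with hand counts are to be expected.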


Enjoy your high school with - High School Pedia : www.highschoolpedia.com


