Statistics

Hey, guys! Today we are gonna talk about a topic that many of us find way too simple but that is actually really complicated. Today we will be discussing the concepts of Statistics.

Introduction
Data means information or a set of given facts. Data is usually collected through a census or through surveys. A survey is the process of collecting information from a selected group of persons. Data collected by this process is called raw data. Raw data is classified into two types: primary data and secondary data. Primary data, collected first-hand by the investigator, is generally reliable; secondary data, taken from existing sources, may or may not be reliable. Classification of data is essential for analysing it. Statistics is defined as the collection, presentation, analysis and interpretation of numerical data.

Variable (or variate)
Each item in a data set has certain characteristics. Characteristics like intelligence, beauty, etc. are not measurable but vary from person to person or from item to item. We call such characteristics qualitative variables. Characteristics like height, weight, marks, etc. are measurable, and these variables are called quantitative variables. We also make a distinction between the possible values and the observed values of a variable. For example, if we have a variable defined by the sex of a person, the possible values of the variable are male and female. But if we are counting the number of males in the age group 20-25 in a village, we are dealing with the observed values of that variable.

• A variable (or variate) which is not capable of assuming all values in a given range is called a discrete variable.
• A variable which is capable of assuming all the numerical values in a given range is called a continuous variable.

Frequency Distribution
Let the data regarding the weights (in kg) of 20 students of a class be given as

 48 50 49 54 61 60 54 55 48 49 55 60 50 48 57 62 49 50 52 54
This is called raw data. It is also called an individual series. We note that some of the weights (values of the quantitative variable) are repeated. Since 3 students have a weight of 50 kg, we say the frequency of 50 is 3. In general, the number of times a value is repeated is called the frequency of that value. The table containing the weights and the corresponding frequencies is given as

 Weight (in kg)   Tally bars   No. of students (frequency)
 50               |||          3
 48               |||          3
 54               |||          3
 49               |||          3
 60               ||           2
 61               |            1
 55               ||           2
 57               |            1
 62               |            1
 52               |            1

Tally bars are used to count the number of times each value of the variable occurs. The table containing the values and their frequencies is called a frequency distribution. The variable is denoted by x and the frequency by f. Arranged in order of magnitude, the frequency distribution is written as follows:

 Weight (in kg) x   No. of students f
 48                 3
 49                 3
 50                 3
 52                 1
 54                 3
 55                 2
 57                 1
 60                 2
 61                 1
 62                 1
 Total              20

We denote the total number of students, that is, the total frequency, by n, i.e. n = Σf. Also, we denote the different values of the variable x by xᵢ and the corresponding frequencies by fᵢ.
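To make the counting concrete, here is a minimal Python sketch that builds the frequency distribution from the raw weights and checks that n = Σf. The names weights, freq and n are just illustrative choices, and collections.Counter is one convenient way to do the tallying, not the only one.

```python
from collections import Counter

# Raw data: weights (in kg) of the 20 students
weights = [48, 50, 49, 54, 61, 60, 54, 55, 48, 49,
           55, 60, 50, 48, 57, 62, 49, 50, 52, 54]

# Frequency f of each distinct value x, arranged in order of magnitude
freq = dict(sorted(Counter(weights).items()))

# Total frequency n = Σf
n = sum(freq.values())

# Print each value with its tally bars and frequency
for x, f in freq.items():
    print(f"{x:>3}  {'|' * f:<4} {f}")
print("Total", n)
```

Each printed row shows a weight, its tally bars and its frequency, and the final line shows the total frequency.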

The data can also be classified according to different classes of values of the variable. This is an important tool for condensing a large data set. In the above example, the classes may be defined as 45 and under 50, 50 and under 55, 55 and under 60, etc.
We denote these classes by 45-49, 50-54, 55-59, etc. Usually, all the classes are given the same length. With a class length of 5, the above frequency distribution can be displayed as

 Weight (in kg), class x   No. of students f
 45-49                     6
 50-54                     7
 55-59                     3
 60-64                     4
 Total                     Σf = 20 = n

In the above frequency table, 45-49, 50-54, etc. are called class intervals. In the class interval 45-49, 45 is the lower class limit and 49 is the upper class limit.
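The grouping into classes of length 5 can be sketched in Python as follows. This is only one way to do it, assuming whole-number weights and a first class starting at 45; the names class_length, start, counts and grouped are illustrative.

```python
# Raw data: weights (in kg) of the 20 students
weights = [48, 50, 49, 54, 61, 60, 54, 55, 48, 49,
           55, 60, 50, 48, 57, 62, 49, 50, 52, 54]

class_length = 5   # length of each class
start = 45         # lower limit of the first class

# Count the weights falling in each class, keyed by the lower class limit
counts = {}
for w in weights:
    lower = start + ((w - start) // class_length) * class_length
    counts[lower] = counts.get(lower, 0) + 1

# Label each class as "lower-upper", e.g. "45-49", in increasing order
grouped = {f"{lo}-{lo + class_length - 1}": counts[lo] for lo in sorted(counts)}

for label, f in grouped.items():
    print(label, f)
print("Total", sum(grouped.values()))
```

Changing class_length or start gives a different grouping of the same raw data, which is why the choice of classes should be stated alongside the table.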

The classes are written in two forms:
1. Inclusive form: In this case, the lower limit of a class is not equal to the upper limit of the previous class. For example, 45-49, 50-54, 55-59, 60-64 are in inclusive form. However, in the class 45-49, all items with values greater than or equal to 44.5 but less than 49.5 are to be taken. Thus the actual limits are 44.5-49.5, 49.5-54.5, 54.5-59.5, 59.5-64.5.
2. Exclusive form: In this case, the lower limit of a class is equal to the upper limit of the previous class. For example, we may have classes of the form 45-50, 50-55, 55-60, 60-65, etc. The value 50 is counted in the class 50 and under 55, and not in 45 and under 50.
In both forms, the length of a class (actual upper limit - actual lower limit) is the same: 5 in this example.
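As a rough sketch of the two forms, the Python snippet below converts the inclusive classes to their actual limits and shows where a boundary value such as 50 lands in the exclusive form. The helper exclusive_class and the other names are hypothetical, introduced only for this illustration.

```python
# Inclusive classes as (lower limit, upper limit) pairs
inclusive = [(45, 49), (50, 54), (55, 59), (60, 64)]

# Gap between the upper limit of one class and the lower limit of the next
gap = inclusive[1][0] - inclusive[0][1]   # 50 - 49 = 1

# Actual limits: move each stated limit outward by half the gap
actual = [(lo - gap / 2, hi + gap / 2) for lo, hi in inclusive]

# Class lengths computed from the actual limits
lengths = [hi - lo for lo, hi in actual]

# Exclusive classes: lower limit included, upper limit excluded
exclusive = [(45, 50), (50, 55), (55, 60), (60, 65)]

def exclusive_class(value):
    """Return the exclusive class (lo, hi) with lo <= value < hi, or None."""
    for lo, hi in exclusive:
        if lo <= value < hi:
            return (lo, hi)
    return None

print(actual)
print(lengths)
print(exclusive_class(50))
```

The half-gap adjustment assumes the classes are evenly spaced, as they are in this example.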