Correlation

Correlation is the relationship between two variables in which a change in one variable is accompanied by a positive or negative change in the other.

Types of correlation

    1. Perfect correlation: If the two variables vary in such a manner that their ratio is always constant, the correlation is said to be perfect.
    2. Positive or direct correlation: If an increase (or decrease) in one variable corresponds to an increase (or decrease) in the other, the correlation is said to be positive.
    3. Negative or indirect correlation: If an increase (or decrease) in one variable corresponds to a decrease (or increase) in the other, the correlation is said to be negative.

Karl Pearson’s coefficient of correlation

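The correlation coefficient r(x, y) between two variables x and y is given by
\(r(x, y) = \frac{\mathrm{Cov}(x, y)}{\sqrt{\mathrm{Var}(x)}\,\sqrt{\mathrm{Var}(y)}}\)
or, in terms of n pairs of observations,
\(r(x, y) = \frac{n\sum xy - \sum x\sum y}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}\)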

Modified formula

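\(r = \frac{\sum xy - \frac{1}{n}\sum x\sum y}{\sqrt{\left[\sum x^2 - \frac{1}{n}(\sum x)^2\right]\left[\sum y^2 - \frac{1}{n}(\sum y)^2\right]}}\)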

Step deviation method

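Let A and B be the assumed means of xi and yi respectively, and let ui = xi – A and vi = yi – B. Then r(x, y) = r(u, v), where
\(r(u, v) = \frac{n\sum uv - \sum u\sum v}{\sqrt{n\sum u^2 - (\sum u)^2}\,\sqrt{n\sum v^2 - (\sum v)^2}}\)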
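As a rough numerical check of this section, the sketch below computes Karl Pearson's coefficient from raw sums and then repeats the calculation after subtracting assumed means. The function name `pearson_r`, the sample data and the assumed means A = 3 and B = 4 are illustrative choices, not taken from the text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Karl Pearson's r from raw sums (the computational form of the formula)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt((sum_xx - sum_x ** 2 / n) * (sum_yy - sum_y ** 2 / n))
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator


# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r_raw = pearson_r(xs, ys)

# Step deviation: subtract assumed means A = 3 and B = 4 before computing r.
us = [x - 3 for x in xs]
vs = [y - 4 for y in ys]
r_shifted = pearson_r(us, vs)

print(round(r_raw, 4), round(r_shifted, 4))
```

Both calls give the same value, which is the point of the step deviation method: shifting the origin makes the sums smaller without changing r.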

Rank correlation

Let us suppose that a group of n individuals is arranged in order of merit or proficiency in possession of two characteristics A and B.
The ranks in the two characteristics will, in general, be different.
For example, if we consider the relation between intelligence and beauty, a beautiful individual need not also be intelligent.
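The rank correlation coefficient is given by
\(r = 1 - \frac{6\sum d^2}{n(n^2 - 1)}\)
where d is the difference between the ranks of an individual in the two characteristics.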
If all d’s are zero, then r = 1, which is the maximum value of r and shows that there is perfect rank correlation between the variables. If, however, some values of xi are equal, then the coefficient of rank correlation is given by
\(r = 1 - \frac{6\left[\sum d^2 + \frac{1}{12}\left(m^3 - m\right)\right]}{n(n^2 - 1)}\)
where m is the number of times a particular xi is repeated. One such correction term is added for each group of repeated values.
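A minimal sketch of the rank correlation calculation, assuming the usual form r = 1 – 6Σd²/(n(n² – 1)) with a correction of (m³ – m)/12 for each group of m tied values. The function name, the `tie_sizes` parameter and the ranks used are hypothetical.

```python
def rank_correlation(rank_a, rank_b, tie_sizes=()):
    """Rank correlation from two lists of ranks.

    tie_sizes lists m for every group of repeated values; each group
    adds (m**3 - m) / 12 to the sum of squared rank differences.
    """
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("need two rank lists of equal length with at least 2 items")
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    correction = sum((m ** 3 - m) / 12 for m in tie_sizes)
    return 1 - 6 * (sum_d2 + correction) / (n * (n * n - 1))


# Hypothetical ranks of five individuals in characteristics A and B.
ranks_a = [1, 2, 3, 4, 5]
ranks_b = [2, 1, 4, 3, 5]
r = rank_correlation(ranks_a, ranks_b)           # sum of d^2 is 4
r_same = rank_correlation(ranks_a, ranks_a)      # identical ranks
r_reversed = rank_correlation(ranks_a, ranks_a[::-1])
```

Identical rankings give the largest possible value and fully reversed rankings the smallest, which matches the interpretation of r in this section.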

Positive and Negative rank correlation coefficients

Let r be the rank correlation coefficient. If r > 0, then a high rank in one characteristic goes with a high rank in the other, and a low rank in one goes with a low rank in the other.

  • r = 1 means that there is perfect correlation between the two characteristics, i.e., every individual gets the same rank in both characteristics.
  • r < 0 means that if the rank in one characteristic is high, then that in the other is low, and if the rank in one characteristic is low, then that in the other is high.
  • r = –1 means that there is perfect negative correlation between the two characteristics, i.e., the individual with the highest rank in one characteristic has the lowest rank in the other. Here the ranks of the n individuals in the two characteristics are of the type (1, n), (2, n – 1), ….., (n, 1).
  • r = 0 means that no relation can be established between the two characteristics.
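The case r = –1 can be verified directly. The short derivation below is a sketch that assumes the usual rank correlation formula r = 1 – 6Σd²/(n(n² – 1)) and takes the ranks of the i-th individual to be (i, n + 1 – i).

```latex
\[
d_i = i - (n + 1 - i) = 2i - (n + 1), \qquad i = 1, \dots, n
\]
\[
\sum_{i=1}^{n} d_i^2
  = 4\sum_{i=1}^{n} i^2 - 4(n+1)\sum_{i=1}^{n} i + n(n+1)^2
  = \frac{2n(n+1)(2n+1)}{3} - n(n+1)^2
  = \frac{n(n^2 - 1)}{3}
\]
\[
r = 1 - \frac{6 \cdot \frac{n(n^2 - 1)}{3}}{n(n^2 - 1)} = 1 - 2 = -1
\]
```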

Standard error and probable error

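(1) Standard error of prediction:
The deviation of the predicted value from the observed value is known as the standard error of prediction and is defined as
\(S_y = \sqrt{\frac{\sum (y - y_p)^2}{n}}\)
where y is the actual value and yp is the predicted value.
In terms of the coefficient of correlation, it is given by
\(S_y = \sigma_y\sqrt{1 - r^2}\)
(2) Probable error:
If r is the correlation coefficient in a sample of n pairs of observations, its standard error is \(S.E.(r) = \frac{1 - r^2}{\sqrt n}\) and its probable error is \(P.E.(r) = 0.6745 \times \frac{1 - r^2}{\sqrt n}\). The probable error is used to interpret r as follows: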

  1. If r < P.E.(r), there is no evidence of correlation.
  2. If r > 6 P.E.(r), the existence of correlation is certain.
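The two rules can be sketched in code, assuming the standard definition P.E.(r) = 0.6745 (1 – r²)/√n and the usual reading that r below P.E.(r) shows no evidence of correlation while r above six times P.E.(r) makes it practically certain. The function names, the use of |r| to cover negative coefficients, and the sample values are assumptions made for this illustration.

```python
from math import sqrt


def probable_error(r, n):
    """Probable error of r for a sample of n pairs: 0.6745 * (1 - r^2) / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 0.6745 * (1 - r * r) / sqrt(n)


def interpret(r, n):
    """Apply the two rules; |r| is used so negative r is handled (an assumption)."""
    pe = probable_error(r, n)
    if abs(r) < pe:
        return "no evidence of correlation"
    if abs(r) > 6 * pe:
        return "existence of correlation is certain"
    return "inconclusive"


# Hypothetical values of r and n.
strong = interpret(0.8, 25)    # P.E. is about 0.0486, and 0.8 exceeds 6 * P.E.
weak = interpret(0.05, 16)     # P.E. is about 0.1682, and 0.05 is below it
```

Values of r between P.E.(r) and 6 P.E.(r) are not covered by either rule, so the sketch reports them as inconclusive.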

The square of the coefficient of correlation for a bivariate distribution is known as the “Coefficient of determination”.

What is a Grouped Frequency Distribution Table

There are three methods for calculating the mean of a frequency distribution:

  1. Direct Method
  2. Assumed mean deviation method
  3. Step deviation method.

1. Direct Method for Calculation of Mean
If the values x1, x2, ….., xn occur with frequencies f1, f2, ….., fn respectively, then according to the direct method
Mean (\(\bar x\)) = \(\frac{{\sum\limits_{i = 1}^n {{f_i}{x_i}} }}{{\sum\limits_{i = 1}^n {{f_i}} }}\)
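The direct method is a weighted average, which the following sketch makes explicit. The function name `direct_mean` is an illustrative choice; the data are the mid-values and frequencies of Example 1 below.

```python
def direct_mean(values, freqs):
    """Direct method: mean = sum(f_i * x_i) / sum(f_i)."""
    total_f = sum(freqs)
    if len(values) != len(freqs) or total_f == 0:
        raise ValueError("values and freqs must match and frequencies must not sum to 0")
    return sum(f * x for x, f in zip(values, freqs)) / total_f


# Data of Example 1: mid-values and their frequencies.
mid_values = [2, 3, 4, 5, 6]
frequencies = [49, 43, 57, 38, 13]
mean = direct_mean(mid_values, frequencies)  # 723 / 200
```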

2. Assumed Mean Method
Arithmetic mean = \(a + \frac{{\sum\limits_{i = 1}^n {{f_i}{d_i}} }}{{\sum\limits_{i = 1}^n {{f_i}} }}\), where a is the assumed mean and di = xi – a is the deviation of xi from it.
Note : The assumed mean is chosen so that

  1. It is one of the central values.
  2. The deviations are small.
  3. One deviation is zero.

Working Rule :
Step 1 :      Choose a number ‘a’ from the central values of x in the first column; this is the assumed mean.
Step 2 :      Obtain the deviations di by subtracting ‘a’ from xi. Write these deviations against the corresponding frequencies in the third column.
Step 3 :      Multiply the frequencies in the second column by the corresponding deviations di in the third column to prepare a fourth column of fidi.
Step 4 :      Find the sum of all the entries in the fourth column to obtain ∑fidi, and find the sum of all the frequencies in the second column to obtain ∑fi.
Step 5 :      Substitute a, ∑fidi and ∑fi into the formula above to get the required mean.
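The working rule can be sketched as a small function. The name `assumed_mean_method` is hypothetical; the data are the class mid-values and frequencies of Example 10 below, with a = 75.

```python
def assumed_mean_method(values, freqs, a):
    """Assumed mean method: mean = a + sum(f_i * d_i) / sum(f_i), with d_i = x_i - a."""
    total_f = sum(freqs)
    if len(values) != len(freqs) or total_f == 0:
        raise ValueError("values and freqs must match and frequencies must not sum to 0")
    deviations = [x - a for x in values]                        # Step 2
    sum_fd = sum(f * d for f, d in zip(freqs, deviations))      # Steps 3 and 4
    return a + sum_fd / total_f


# Data of Example 10: mid-values of the classes 50-60, ..., 90-100.
mid_values = [55, 65, 75, 85, 95]
frequencies = [8, 6, 12, 11, 13]
mean = assumed_mean_method(mid_values, frequencies, a=75)
same_mean = assumed_mean_method(mid_values, frequencies, a=65)
```

Any choice of a gives the same mean; a central value only keeps the deviations, and therefore the arithmetic, small.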

3. Step Deviation Method
The assumed mean method can be simplified further by dividing each deviation by the width h of the class interval. This reduces the arithmetic to a great extent.
Mean (\(\bar x\)) = a + \(\frac{{\Sigma {f_i}{u_i}}}{{\Sigma {f_i}}} \times h\)
Working Rule :
Step-1 :     Choose a number ‘a’ from the central values of x (mid-values).
Step-2 :    Obtain ui = \(\frac{{{x_i} - a}}{h}\)
Step-3 :    Multiply each frequency fi by the corresponding ui to get fiui.
Step-4 :    Find the sum of all fiui, i.e., ∑fiui.
Step-5 :     Use the formula \(\bar x\) = a + \(\frac{{\Sigma {f_i}{u_i}}}{{\Sigma {f_i}}} \times h\) to get the required mean.
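A minimal sketch of these five steps follows. The function name `step_deviation_mean` is an illustrative choice; the data are the class mid-values and frequencies of Example 9 below, with a = 175 and class width h = 50.

```python
def step_deviation_mean(mid_values, freqs, a, h):
    """Step deviation method: mean = a + (sum(f_i * u_i) / sum(f_i)) * h."""
    total_f = sum(freqs)
    if len(mid_values) != len(freqs) or total_f == 0 or h == 0:
        raise ValueError("inputs must match, and sum of frequencies and h must be non-zero")
    us = [(x - a) / h for x in mid_values]                 # Step-2
    sum_fu = sum(f * u for f, u in zip(freqs, us))         # Step-3 and Step-4
    return a + sum_fu / total_f * h                        # Step-5


# Data of Example 9: mid-values of the classes 0-50, ..., 250-300.
mid_values = [25, 75, 125, 175, 225, 275]
frequencies = [17, 35, 43, 40, 21, 24]
mean = step_deviation_mean(mid_values, frequencies, a=175, h=50)
```

Here Σfiui = –95 and Σfi = 180, so the result agrees with the value 148.61 obtained in Example 9 by the assumed mean method.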

Grouped Frequency Distribution Table Example Problems with Solutions

Example 1:    

Mid-values     2     3     4     5     6
Frequencies    49    43    57    38    13

Find the mean by direct method.

Solution:

Mid values (xi)    Frequencies (fi)    fixi
2                  49                  98
3                  43                  129
4                  57                  228
5                  38                  190
6                  13                  78
Total              N = Σfi = 200       Σfixi = 723

Mean = \(\frac{{\sum {f_i}{x_i}}}{{\sum {f_i}}}\) = \(\frac{{723}}{{200}}\) = 3.615

Example 2:    Find the mean of the following frequency distribution :

Class Interval    Frequency
10-30             90
30-50             20
50-70             30
70-90             20
90-110            40

Solution:

Class Interval    f            Mid value (x)    f × x
10-30             90           20               1800
30-50             20           40               800
50-70             30           60               1800
70-90             20           80               1600
90-110            40           100              4000
Total             Σf = 200                      Σfx = 10000

Mean = \(\frac{{\sum {f}{x}}}{{\sum {f}}}\) = \(\frac{{10000}}{{200}}\) = 50
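For grouped data the only extra step is replacing each class by its mid-value. The sketch below does this for the data of Example 2; the function name `grouped_mean` and the representation of classes as (lower, upper) pairs are assumptions made for the illustration.

```python
def grouped_mean(intervals, freqs):
    """Mean of a grouped distribution, using class mid-values (direct method)."""
    total_f = sum(freqs)
    if len(intervals) != len(freqs) or total_f == 0:
        raise ValueError("intervals and freqs must match and frequencies must not sum to 0")
    mid_values = [(lower + upper) / 2 for lower, upper in intervals]
    return sum(f * x for f, x in zip(freqs, mid_values)) / total_f


# Data of Example 2.
intervals = [(10, 30), (30, 50), (50, 70), (70, 90), (90, 110)]
frequencies = [90, 20, 30, 20, 40]
mean = grouped_mean(intervals, frequencies)  # 10000 / 200
```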

Example 3:    A survey was conducted by a group of students as a part of their environment awareness programme, in which they collected the following data regarding the number of plants in 20 houses in a locality. Find  the mean number of plants per house.

Number of plants    0 – 2    2 – 4    4 – 6    6 – 8    8 – 10    10 – 12    12 – 14
No. of houses       1        2        1        5        6         2          3

Which method did you use for finding the mean and why ?
Solution:

Number of plants    Number of houses (f)    Mid value (x)    f × x
0-2                 1                       1                1
2-4                 2                       3                6
4-6                 1                       5                5
6-8                 5                       7                35
8-10                6                       9                54
10-12               2                       11               22
12-14               3                       13               39
Total               Σf = 20                                  Σfx = 162

Mean = \(\frac{{\sum {f}{x}}}{{\sum {f}}}\) = \(\frac{{162}}{{20}}\) = 8.1
The direct method was used here because the values of x and f are small.

Example 4:    Calculate the mean for the following distribution:

Variable     5    6    7     8     9
Frequency    4    8    14    11    3

Solution:
Variable (x)    Frequency (f)    f × x
5               4                20
6               8                48
7               14               98
8               11               88
9               3                27
Total           Σf = 40          Σfx = 281
∴ Mean = \(\frac{{\sum f\,x}}{{\sum f}} = \frac{{281}}{{40}}\)  = 7.025

Example 5:    Find the mean of the following frequency distribution :
[Table: frequency distribution for this example; not reproduced here]
Solution:
[Table: working columns f, x and f × x, giving Σf = 150 and Σfx = 4930]
Mean = \(\frac{{\sum f\,x}}{{\sum f}} = \frac{{4930}}{{150}} = 32.8\overline 6\) or 32.87 (approx.)

Example 6:    Find the mean of the following distribution by direct method.

Class interval    0 – 10    11 – 20    21 – 30    31 – 40    41 – 50
Frequency         3         4          2          5          6

Solution:
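Class interval    Frequency (f)    Mid value (x)    f × x
0 – 10            3                5                15
11 – 20           4                15.5             62
21 – 30           2                25.5             51
31 – 40           5                35.5             177.5
41 – 50           6                45.5             273
Total             Σf = 20                           Σfx = 578.5
Mean = \(\frac{{\sum f\,x}}{{\sum f}} = \frac{{578.5}}{{20}}\) ≈ 28.9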

Example 7:    For the following distribution, calculate mean using all the suitable methods.

Size of Item    1 – 4    4 – 9    9 – 16    16 – 27
Frequency       6        12       26        20

Solution:
Size of item    Frequency (f)    Mid value (x)    f × x
1 – 4           6                2.5              15
4 – 9           12               6.5              78
9 – 16          26               12.5             325
16 – 27         20               21.5             430
Total           Σf = 64                           Σfx = 848
Mean = \(\frac{{\sum f\,x}}{{\sum f}} = \frac{{848}}{{64}}\)  = 13.25

Example 8:    The following table gives the distribution of total household expenditure (in rupees) of manual workers in a city. Find the mean expenditure per household.

Expenditure (in rupees)    100-150    150-200    200-250    250-300    300-350    350-400    400-450    450-500
Frequency                  24         40         33         28         30         22         16         7

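Solution:    Let the assumed mean be a = 275.
Expenditure    fi            Mid value (xi)    di = xi – 275    fidi
100-150        24            125               –150             –3600
150-200        40            175               –100             –4000
200-250        33            225               –50              –1650
250-300        28            275               0                0
300-350        30            325               50               1500
350-400        22            375               100              2200
400-450        16            425               150              2400
450-500        7             475               200              1400
Total          Σfi = 200                                        Σfidi = –1750
\(\bar x = a + \frac{{\Sigma {f_i}{d_i}}}{{\Sigma {f_i}}}\) = 275 + \(\frac{{-1750}}{{200}}\) = 275 – 8.75 = Rs 266.25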

Example 9:    Calculate the arithmetic mean of the following distribution :

Class Interval    Frequency
0 – 50            17
50 – 100          35
100 – 150         43
150 – 200         40
200 – 250         21
250 – 300         24

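Solution:    Let the assumed mean be 175, i.e., a = 175.
Class interval    fi            Mid value (xi)    di = xi – 175    fidi
0 – 50            17            25                –150             –2550
50 – 100          35            75                –100             –3500
100 – 150         43            125               –50              –2150
150 – 200         40            175               0                0
200 – 250         21            225               50               1050
250 – 300         24            275               100              2400
Total             Σfi = 180                                        Σfidi = –4750
Now, a = 175, Σfidi = –4750 and Σfi = 180.
\(\bar x = a + \frac{{\Sigma {f_i}{d_i}}}{{\Sigma {f_i}}}\) = 175 + \(\frac{{-4750}}{{180}}\)
= 175 – 26.39 = 148.61 approx.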

Example 10:    Calculate the arithmetic mean of the following frequency distribution :

Class interval    50 – 60    60 – 70    70 – 80    80 – 90    90 – 100
Frequency         8          6          12         11         13

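Solution:    Let the assumed mean be 75, i.e., a = 75.
Class interval    fi           Mid value (xi)    di = xi – 75    fidi
50 – 60           8            55                –20             –160
60 – 70           6            65                –10             –60
70 – 80           12           75                0               0
80 – 90           11           85                10              110
90 – 100          13           95                20              260
Total             Σfi = 50                                       Σfidi = 150
a = 75, Σfidi = 150, Σfi = 50
Mean \(\bar x = a + \frac{{\Sigma {f_i}{d_i}}}{{\Sigma {f_i}}}\) = 75 + \(\frac{{150}}{{50}}\) = 78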

Example 11:    Thirty women were examined in a hospital by a doctor and the number of heart beats per minute were recorded and summarised as follows. Find the mean heart beats per minute for these women, choosing a suitable method.

Number of heart beats per minute    Frequency
65 – 68                             2
68 – 71                             4
71 – 74                             3
74 – 77                             8
77 – 80                             7
80 – 83                             4
83 – 86                             2

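Solution:    Let the assumed mean be a = 75.5.
Heart beats per minute    f          Mid value (x)    d = x – 75.5    fd
65 – 68                   2          66.5             –9              –18
68 – 71                   4          69.5             –6              –24
71 – 74                   3          72.5             –3              –9
74 – 77                   8          75.5             0               0
77 – 80                   7          78.5             3               21
80 – 83                   4          81.5             6               24
83 – 86                   2          84.5             9               18
Total                     Σf = 30                                     Σfd = 12
Mean = \(a + \frac{{\Sigma fd}}{{\Sigma f}} = 75.5 + \frac{{12}}{{30}}\) = 75.5 + 0.4 = 75.9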

Example 12:    To find the concentration of SO2 in the air (in parts per million, i.e., ppm), data were collected for 30 localities in a certain city and are presented below :
[Table: concentration of SO2 (in ppm) against frequency; not reproduced here]
Find the mean concentration of SO2 in the air.
Solution:    Let the assumed mean a = 0.10.
[Table: working columns for the step deviation method with h = 0.04, giving Σfi = 30 and Σfiui = –1]
By step deviation method
Mean = a + \(\frac{{\Sigma {f_i}{u_i}}}{{\Sigma {f_i}}}\) × h
= 0.10 + \(\frac{{-1}}{{30}} \times 0.04\)
= 0.10 – 0.0013
= 0.0987
≈ 0.099 ppm

Example 13:    The weekly observations on the cost of living index in a certain city for the year 2004–2005 are given below. Compute the mean weekly cost of living index.
[Table: cost of living index classes against number of weeks; not reproduced here]
Solution:    Let the assumed mean be 1750, i.e., a = 1750.
[Table: working columns for the step deviation method with h = 100, giving Σfi = 52 and Σfiui = –45]
By step deviation method
Mean (\(\bar x\)) = a + \(\frac{{\Sigma {f_i}{u_i}}}{{\Sigma {f_i}}}\) × h
= 1750 + \(\frac{{-45}}{{52}} \times 100\)
= 1750 – 86.54
= 1663.46
Hence, the mean weekly cost of living index = 1663.46

Example 14:    Find the mean marks from the following data by step deviation method
[Table: marks against number of students; not reproduced here]
Solution:    Let the assumed mean be 55, i.e., a = 55.
[Table: working columns for the step deviation method, giving Σfi = 85 and Σfiui = –56]
Here, a = 55, h = 10,
Σfi = 85,  Σfiui = –56
Mean (\(\bar x\)) = a + \(\frac{{\Sigma {f_i}{u_i}}}{{\Sigma {f_i}}}\) × h
= 55 + \(\frac{{-56}}{{85}} \times 10\)
= 55 – 6.59 = 48.41
Hence, mean mark = 48.41.

Example 15:    Find the mean age of 100 residents of a colony from the following data :
[Table: age classes against number of residents; not reproduced here]
Solution:    Let the assumed mean be a = 35.
[Table: working columns for the step deviation method, giving Σfi = 100 and Σfiui = –40]
Here, a = 35,  h = 10
\(\bar x\) = a + \(\frac{{\Sigma {f_i}{u_i}}}{{\Sigma {f_i}}}\) × h
⇒  \(\bar x\)  = 35 + \(\frac{{-40}}{{100}} \times 10\) = 35 – 4 = 31
Hence, the mean age = 31 years

Example 16:    The following distribution shows the daily pocket allowance of children of a locality. The mean pocket allowance is Rs. 18.00. Find the missing frequency f.
[Table: daily pocket allowance (in Rs) against number of children, with one frequency f missing; not reproduced here]
Solution:    We have
[Table: working columns f, x and f × x, giving Σf = 44 + f and Σfx = 752 + 20f]
Mean  \(\bar x\) = \(\frac{{\Sigma fx}}{{\Sigma f}}\)  ⇒   18 = \(\frac{{752 + 20f}}{{44 + f}}\)
⇒   18 (44 + f) = 752 + 20f
⇒   792 + 18f = 752 + 20f
⇒   2f = 40
⇒   f = 20
Hence, the missing frequency is 20.
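The algebra for one missing frequency can be written once and reused. The sketch below solves mean = (S_fx + x·f)/(S_f + f) for f, where S_f and S_fx are the totals over the known rows and x is the mid-value of the class whose frequency is missing. The function and parameter names are hypothetical; the first call uses the totals of Example 16 and the second those of Example 17.

```python
from fractions import Fraction


def missing_frequency(mean, known_sum_f, known_sum_fx, x_missing):
    """Solve mean = (known_sum_fx + x_missing * f) / (known_sum_f + f) for f."""
    mean = Fraction(mean)
    if x_missing == mean:
        raise ValueError("f cannot be determined when x_missing equals the mean")
    # Rearranged: f * (x_missing - mean) = mean * known_sum_f - known_sum_fx
    return (mean * known_sum_f - known_sum_fx) / (x_missing - mean)


f_example_16 = missing_frequency(18, known_sum_f=44, known_sum_fx=752, x_missing=20)
p_example_17 = missing_frequency(50, known_sum_f=92, known_sum_fx=5160, x_missing=30)
```

Exact fractions are used so that a non-integer result, which would signal inconsistent data, is not hidden by rounding.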

Example 17:    The arithmetic mean of the following frequency distribution is 50. Find the value of p.
[Table: class intervals against frequencies, with one frequency p missing; not reproduced here]
Solution:    
[Table: working columns f, x and f × x, giving Σf = 92 + p and Σfx = 5160 + 30p]
Mean \(\bar x\) = \(\frac{{\Sigma fx}}{{\Sigma f}}\)  ⇒  50 = \(\frac{{5160 + 30p}}{{92 + p}}\)
⇒   50 (92 + p) = 5160 + 30p
⇒   4600 + 50p = 5160 + 30p
⇒   20p = 560
⇒   p = 28

Example 18:    The mean of the following frequency distribution is 62.8 and the sum of all frequencies is 50. Compute the missing frequencies f1 and f2 :
[Table: class intervals against frequencies, with two frequencies f1 and f2 missing; not reproduced here]
Solution:    
[Table: working columns f, x and f × x, giving Σf = 30 + f1 + f2 and Σfx = 2060 + 30f1 + 70f2]
30 + f1 + f2 = 50 ⇒ f1 + f2 = 20    ….(1)
Mean  = \(\frac{{\Sigma fx}}{{\Sigma f}}\) ⇒  62.8 = \(\frac{{2060 + 30{f_1} + 70{f_2}}}{{50}}\)
⇒  62.8 = \(\frac{{206 + 3{f_1} + 7{f_2}}}{5}\)
⇒  206 + 3f1 + 7f2 = 314
⇒   3f1 + 7f2 = 108                     ….(2)
Multiplying (1) by 3, we get
3f1 + 3f2 = 60                            ….(3)
On subtracting (3) from (2), we get
4f2 = 48  ⇒  f2 = 12
Putting f2 = 12 in (1), we get
f1 = 8
Hence, f1 = 8 and f2 = 12.
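The two conditions of Example 18 form a pair of linear equations in f1 and f2, so they can also be solved mechanically. The sketch below uses Cramer's rule rather than the elimination shown above; the helper name `solve_2x2` is an illustrative choice.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the equations do not have a unique solution")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y


# Equation (1): f1 + f2 = 20; equation (2): 3*f1 + 7*f2 = 108.
f1, f2 = solve_2x2(1, 1, 20, 3, 7, 108)

# Check against the stated mean of 62.8 with the sum of frequencies equal to 50.
mean_check = (2060 + 30 * f1 + 70 * f2) / 50
```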