Statistics: Types, Formula & Applications

Arpita Srivastava

Content Writer

Statistics is a field of mathematics concerned with the study of data collection, analysis, interpretation, presentation, and organization. It is mostly used to acquire a better understanding of data and to focus on specific applications.

  • Statistics is the process of gathering, assessing, and summarising data in a mathematical form.
  • The term was originally associated with state science: statistics were used to gather and analyze facts and data about a country's economy and population.
  • Statistical analysis is based on a statistical population and a statistical model of the data.
  • Statistics specifies ways in which data can be used to solve complex problems.
  • It is closely connected to fields such as linear algebra, differential equations, and probability theory.
  • Weather forecasting, health insurance and sales tracking are examples of applications of statistics.

Key Terms: Statistics, Population, Mean, Median, Mode, Variance, Standard Deviation, Descriptive Statistics, Inferential Statistics, Central tendency, Dispersion, Skewness, Bar Graph, Line Chart, Pie Graph


What is Statistics?

Statistics is a field of mathematics used for analyzing and manipulating data. It works on every aspect, including planning, collecting, and representing data.

  • Statistics provides a clear picture of the work you do regularly.
  • Data is divided into quantitative and qualitative information.
  • According to Sir Arthur Lyon Bowley, statistics are numerical statements of facts in any department of inquiry, placed in relation to each other.
  • The process is used to measure central tendency and dispersion.

Example of Statistics

Example: Statistics is used to analyse the traffic situation of a city. Engineers regularly monitor the total traffic across the entire city to decide whether roads should be added or removed. This is done to maintain a smooth flow of traffic.


Types of Statistics

In mathematical statistics, there are two types of statistics for analysing data that are widely used:

Descriptive Statistics

Descriptive statistics uses measures of central tendency and dispersion to describe and summarise the collected data and its attributes.

Inferential Statistics

This statistical strategy is used to generate conclusions from data. Inferential statistics rely on statistical tests on samples to make inferences. It does so by discovering differences between the two groups.

  • The p-value is calculated and compared to the significance level \(\alpha\) = 0.05.
  • If the p-value is less than or equal to \(\alpha\), the result is considered statistically significant.
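The comparison of a p-value with the 0.05 threshold can be sketched in code. The example below is only an illustration: the two groups hold made-up scores, and the exact permutation test is one simple way to obtain a p-value for a difference between two group means, chosen here because it needs no extra libraries. It is not the only test used in inferential statistics.

```python
from itertools import combinations
from statistics import mean

# Hypothetical scores for two groups (illustrative values only).
group_a = [12, 14, 15, 16]
group_b = [8, 9, 10, 11]

ALPHA = 0.05  # significance level

# Observed absolute difference between the two group means.
observed = abs(mean(group_a) - mean(group_b))
pooled = group_a + group_b

# Exact permutation test: try every way of splitting the pooled
# values into two groups of the original sizes and count how often
# the difference is at least as large as the observed one.
extreme = 0
total = 0
for idx in combinations(range(len(pooled)), len(group_a)):
    first = [pooled[i] for i in idx]
    second = [pooled[i] for i in range(len(pooled)) if i not in idx]
    total += 1
    if abs(mean(first) - mean(second)) >= observed:
        extreme += 1

p_value = extreme / total
print(f"p-value = {p_value:.4f}")
print("significant" if p_value <= ALPHA else "not significant")
```

With these values only 2 of the 70 possible splits are as extreme as the observed one, so the p-value falls below 0.05.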

What is Data in Statistics?

A data set is a collection of observations and information. These facts and observations can be expressed as numbers, measures, or statements.

  • Qualitative and quantitative data are the two types of data available. 
  • When the data is descriptive or categorical, it is called qualitative data.
  • When the data is numerical, it is called quantitative data. 

After we know the data gathering procedures, we want to depict the obtained data in several forms of graphs such as a bar graph, line graph, pie chart, stem and leaf plots, scatter plot, and so on. 

  • Outliers caused by variability in the data or by measurement errors are usually identified and handled before the data is analysed.

Representation of Data

Let's have a look at the many types of data representation used in statistics.

Bar Graph

A bar graph is a collection of data depicted by rectangular bars with lengths proportional to the values. The bars can be plotted either in vertical or horizontal orientation. 

Pie Chart 

The pie chart is a graph in which a circle is divided into sectors, each representing a percentage of the total.

Line Graph 

The data is represented in a line graph as a series of points connected by straight line segments. The points are referred to as markers.

Pictograph

A pictograph is data presented in the form of visuals. Pictorial symbols for words, objects, or phrases are used to represent different numbers.

Histogram

The histogram is a graph made of rectangles in which the area of each rectangle is proportional to the frequency of a variable, and the width is equal to the class interval.

Frequency Distribution

In statistics, the frequency distribution table shows the data in ascending order with their corresponding frequencies. The letter f frequently symbolises the frequency of data.
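As a small sketch of how such a table can be built, the snippet below counts how often each value occurs and lists the values in ascending order. The six marks are illustrative values, and `Counter` from Python's standard library is just one convenient way to do the counting.

```python
from collections import Counter

# Illustrative data: marks obtained by six students.
marks = [73, 89, 73, 79, 73, 60]

# Count occurrences of each value, then sort by value (ascending).
frequency = Counter(marks)
table = sorted(frequency.items())

print("Marks | f")
for value, f in table:
    print(f"{value:>5} | {f}")
```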


Measures of Central Tendency

The basis of descriptive statistics is the Measures of Central Tendency and the Measures of Dispersion. The measure of central tendency tells us where data points are centered.

  • It is called the representative value for the given data.
  • The measure of dispersion then shows how the data is spread around this central value.
  • To find the measures of central tendency, we employ the mean, median, and mode.

Example of Central Tendency

Example: We see the average height of students, the average wealth, the average exam result, or the average player height in our daily lives.

The following are the many measures of central tendency for the data:

Mean

Mean is defined as the arithmetic average of a data set. It is calculated by adding all of the numbers in the set and dividing by the number of observations in the data set.

Example of Mean

Example: Consider the following data set which represents the marks obtained by different students in a subject. Calculate the mean.

Name | Anu | Kusum | Gaurav | Ashwin | Geeta | Shakti
Marks Obtained (out of 100) | 73 | 89 | 73 | 79 | 73 | 60

Ans. The mean of the marks obtained by students:

(73 + 89 + 73 + 79 + 73 + 60 )/6 = 74.5

Median

The median is the middle number in the data set once it has been arranged in ascending or descending order. If the number of observations is even, the median is the average of the two middle numbers.

Example of Median

Example: Consider the following data set which represents the marks obtained by different students in a subject. Calculate the median.

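Name | Anu | Kusum | Gaurav | Ashwin | Geeta | Shakti
Marks Obtained (out of 100) | 73 | 89 | 73 | 79 | 73 | 60

Ans. Arranged in ascending order, the marks are 60, 73, 73, 73, 79, 89. As the number of terms n = 6 is even, the median is the average of the (n/2)th and (n/2 + 1)th terms, that is, the 3rd and 4th terms.

The median of the marks obtained by students is (73 + 73)/2 = 73.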

Mode

The mode is the number that appears the most often in a data set.

Example of Mode

Example: Consider the following data set which represents the marks obtained by different students in a subject. Calculate the mode.

Name | Anu | Kusum | Gaurav | Ashwin | Geeta | Shakti
Marks Obtained (out of 100) | 73 | 89 | 73 | 79 | 73 | 60

Ans. The mode of the marks obtained by students is 73.
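The three measures for the marks data used in the examples can be checked with a short sketch. It relies on Python's standard `statistics` module; the variable names are chosen here for illustration.

```python
import statistics

# Marks obtained by Anu, Kusum, Gaurav, Ashwin, Geeta and Shakti.
marks = [73, 89, 73, 79, 73, 60]

mean_marks = statistics.mean(marks)      # sum of values / number of values
median_marks = statistics.median(marks)  # average of the two middle values (n is even)
mode_marks = statistics.mode(marks)      # most frequent value

print("Mean:", mean_marks)
print("Median:", median_marks)
print("Mode:", mode_marks)
```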

Formula for Mean Median and Mode

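  • Mean: \(\bar{X} = \frac{\Sigma^n_{i=1} X_i}{n}\)
  • Median: with the data arranged in ascending order, the \(\left(\frac{n+1}{2}\right)\)th observation if n is odd, and the average of the \(\left(\frac{n}{2}\right)\)th and \(\left(\frac{n}{2}+1\right)\)th observations if n is even.
  • Mode: the observation that occurs with the highest frequency.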


Measures of Dispersion

Measures of central tendency are insufficient to convey all of the information about a set of data. As a result, we must use a measure of dispersion to characterize the variability.

  • Dispersion describes how spread out the values of a data set are.
  • It determines the extent to which numerical data varies with respect to the average value.
  • A measure of dispersion is also known as a measure of variability.

The following are the various measures of dispersion:

Range

In statistics, the range is the difference between the maximum and minimum values of the data points.

Range = Maximum Value – Minimum Value

Standard Deviation and Variance

The variance and standard deviation are two more well-known measures of dispersion. When calculating the mean deviation about the mean or the median, absolute values of the deviations are taken so that positive and negative deviations do not cancel out. Another way to remove the signs is to square all the deviations and consider their sum:

\(\Sigma^n_{i=1}(X_i - \bar{X})^2\)

  • When this sum is zero, every observation equals the mean and there is no dispersion at all.
  • The observations are closer to the mean if the sum is small, indicating a lower degree of dispersion.
  • There is a greater degree of dispersion of the observations from the mean when the sum is large.
  • The mean of the squared deviations is termed the variance and is denoted as σ²:

\(\sigma^2 = \frac{\Sigma^n_{i=1}(X_i - \bar{X})^2}{n}\)

  • The standard deviation is the positive square root of the variance and is denoted as σ:

\(\sigma = \sqrt{\frac{\Sigma^n_{i=1}(X_i - \bar{X})^2}{n}}\)
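A minimal sketch of the range, variance and standard deviation, assuming the population form of the formulas (division by n). The data values are illustrative, and the cross-check uses `pvariance` and `pstdev` from Python's `statistics` module.

```python
import math
import statistics

# Illustrative data set.
data = [8, 10, 12, 14, 16]
n = len(data)

mean = sum(data) / n
data_range = max(data) - min(data)

# Variance: mean of the squared deviations from the mean.
variance = sum((x - mean) ** 2 for x in data) / n
# Standard deviation: positive square root of the variance.
std_dev = math.sqrt(variance)

# Cross-check against the standard library's population formulas.
assert math.isclose(variance, statistics.pvariance(data))
assert math.isclose(std_dev, statistics.pstdev(data))

print("Range:", data_range)
print("Variance:", variance)
print("Standard deviation:", round(std_dev, 2))
```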

Quartile Deviation

The quartile deviation is an absolute measure of dispersion. The quartiles separate the ordered data into four parts. First calculate the data points' median.

  • The lower quartile is the median of the data points to the left of (below) this median.
  • The upper quartile is the median of the data points to the right of (above) this median.
  • The interquartile range is defined as the difference between the upper and lower quartiles.
  • The quartile deviation is half of the interquartile range.
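The quartile deviation can be sketched with the "median of each half" approach, taking the lower quartile as the median of the values below the overall median and the upper quartile as the median of the values above it. The function name and data are illustrative, and other quartile conventions (for example those used by statistical software) can give slightly different values.

```python
from statistics import median

def quartile_deviation(values):
    """Return (Q1, Q3, IQR, QD) using the median-of-halves method.

    Assumes at least two observations.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower_half = data[:half]
    # For odd n the middle value itself is left out of both halves.
    upper_half = data[half + 1:] if n % 2 else data[half:]
    q1 = median(lower_half)
    q3 = median(upper_half)
    iqr = q3 - q1
    return q1, q3, iqr, iqr / 2

q1, q3, iqr, qd = quartile_deviation([56, 60, 50, 42, 78, 40, 20])
print("Q1:", q1, "Q3:", q3, "IQR:", iqr, "Quartile deviation:", qd)
```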

Mean Deviation

The mean deviation is a statistical measure used to calculate the average absolute difference between the items in a distribution and the series' mean or median. 

Mean Deviation For ungrouped data

The frequency distributions of data in statistics can be discrete or continuous. For ungrouped data consisting of n observations,

\(X_1, X_2, X_3, \ldots, X_n\)

  • The following formula is used to compute the mean deviation about the mean; for the mean deviation about the median, replace \(\bar{X}\) with the median:
  • Mean Deviation for ungrouped data = Sum of absolute deviations / Number of observations

\(\frac{\Sigma^n_{i=1}|X_i - \bar{X}|}{n}\)
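A short sketch of the mean deviation for ungrouped data, computed about both the mean and the median. It takes the absolute value of each deviation before averaging; the helper name `mean_deviation` and the data are illustrative.

```python
from statistics import mean, median

def mean_deviation(values, centre):
    """Average absolute deviation of the values from a chosen centre."""
    return sum(abs(x - centre) for x in values) / len(values)

# Illustrative data: marks obtained by six students.
marks = [73, 89, 73, 79, 73, 60]

md_mean = mean_deviation(marks, mean(marks))      # about the mean (74.5)
md_median = mean_deviation(marks, median(marks))  # about the median (73)

print("Mean deviation about mean:", round(md_mean, 2))
print("Mean deviation about median:", round(md_median, 2))
```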

Coefficient of Variation

The coefficient of variation is the ratio of the standard deviation to the mean, presented as a percentage. It is used to compare the variability of two or more frequency distributions.

\(CV = \frac{\sigma}{\bar{X}} \times 100\)

The distribution with a higher coefficient of variation has more variability around the central value than the distribution with a lower coefficient of variation value.
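As an illustration of this comparison, the sketch below computes the coefficient of variation for two made-up series that share the same mean but differ in spread. It assumes the population standard deviation (`pstdev`), in line with the formulas above.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean."""
    return pstdev(values) / mean(values) * 100

# Two illustrative series with the same mean (12).
series_a = [8, 10, 12, 14, 16]
series_b = [10, 11, 12, 13, 14]

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)

print(f"CV of series A: {cv_a:.2f}%")
print(f"CV of series B: {cv_b:.2f}%")
print("More variable:", "A" if cv_a > cv_b else "B")
```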


Different Models of Statistics

The different models of statistics are as follows:

Skewness

Skewness is a measure of the asymmetry of a probability distribution about its mean. A symmetric distribution, such as the normal distribution, has zero skewness. The value of skewness can be positive, negative or zero.

  • The curve is shifted to the left or right when it is skewed.
  • When the tail of the curve is extended toward the right, it is known as positively skewed.
  • When the tail of the curve is extended toward the left, it is known as negatively skewed.
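The sign of the skewness can be illustrated with a small sketch. The article gives no formula, so the moment coefficient of skewness (third central moment divided by the 1.5th power of the second) is used here as an assumption. The three data sets are made up.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]       # no tail on either side
right_tailed = [1, 2, 2, 3, 10]   # long tail toward the right
left_tailed = [1, 8, 9, 9, 10]    # long tail toward the left

for name, data in [("symmetric", symmetric),
                   ("right-tailed", right_tailed),
                   ("left-tailed", left_tailed)]:
    print(f"{name}: skewness = {skewness(data):.3f}")
```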

ANOVA Statistics

ANOVA stands for Analysis of Variance. The model is used to test whether the mean values of two or more groups differ, by comparing the variation between the group means with the variation within the groups.

Degree of Freedom

Degrees of freedom is the number of values in a statistical calculation that are free to vary. For example, once the mean of n observations is fixed, only n - 1 of the observations can vary freely.

Regression Analysis

Regression analysis is a type of statistical model that is used to determine the relationship between variables. It gives the relation between a dependent variable and one or more independent variables.
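A minimal sketch of the simplest case, a straight line fitted by least squares to one independent and one dependent variable. The data (hours studied and marks) are hypothetical, and the function name `fit_line` is chosen only for this example.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied (independent) and marks (dependent).
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, marks)
print(f"marks = {intercept:.1f} + {slope:.1f} * hours")
```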


Mean Deviation for Discrete Grouped data

In a discrete frequency distribution, the values of the observations are plainly visible. Let's say the data consists of n distinct values \(x_1, x_2, \ldots, x_n\) occurring with frequencies \(f_1, f_2, \ldots, f_n\) respectively, and let \(N = \Sigma^n_{i=1} f_i\) be the sum of the frequencies.

  • Mean deviation about mean

The sum of the products of the observations \(x_i\) and their respective frequencies \(f_i\), divided by the sum of the frequencies, is the mean:

\(\bar{X} = \frac{\Sigma^n_{i=1} f_i x_i}{N}\)

The mean deviation about the mean is then:

\(MD(\bar{X}) = \frac{\Sigma^n_{i=1} f_i |x_i - \bar{X}|}{N}\)

  • Mean deviation about median

To find the median, arrange the observations in increasing order and calculate the cumulative frequencies. Then find the observation whose cumulative frequency is equal to or just greater than N/2, where N = sum of frequencies. This observation is the desired median M. The mean deviation about the median is the frequency-weighted average of the absolute deviations from the median:

\(MD(M) = \frac{\Sigma^n_{i=1} f_i |x_i - M|}{N}\)
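The procedure for discrete grouped data can be sketched as follows. The sizes and frequencies are illustrative, the observations are assumed to be listed in increasing order, and the median is taken as the first observation whose cumulative frequency reaches N/2.

```python
# Illustrative discrete frequency distribution (sizes in increasing order).
sizes = [2, 4, 6, 8, 10]
freqs = [2, 4, 5, 3, 1]

total = sum(freqs)  # N, the sum of the frequencies

# Mean: frequency-weighted average of the observations.
mean = sum(f * x for x, f in zip(sizes, freqs)) / total

# Median: first observation whose cumulative frequency is >= N/2.
cumulative = 0
for x, f in zip(sizes, freqs):
    cumulative += f
    if cumulative >= total / 2:
        median = x
        break

def mean_deviation(centre):
    """Frequency-weighted average absolute deviation from a centre."""
    return sum(f * abs(x - centre) for x, f in zip(sizes, freqs)) / total

md_mean = mean_deviation(mean)
md_median = mean_deviation(median)

print("Mean:", mean, "Median:", median)
print("Mean deviation about mean:", round(md_mean, 2))
print("Mean deviation about median:", round(md_median, 2))
```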


Continuous Grouped Data Mean Deviation

The data points in this case can take any value within a range and are continuous. Intervals on the real number line can be used to measure and represent them. The data is organised into class intervals without gaps, together with their respective frequencies.

  • Mean deviation about mean

The mean of a continuous frequency distribution is calculated by assuming that the frequency in each class is centred at its midpoint. Then, as in the case of a discrete frequency distribution, the same process is performed.

  • Mean deviation about median

The median is first calculated with the formula below, and the mean deviation about it is then found from the class midpoints as in the discrete case:

\(\text{Median} = l + \frac{\frac{N}{2} - C}{f} \times h\)

  • The median class is the class interval whose cumulative frequency is equal to or just greater than N/2, where N is the sum of frequencies.
  • l, f, and h are the lower limit, frequency, and width of the median class, respectively.
  • C is the cumulative frequency of the class immediately preceding the median class.
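A sketch of the median for continuous grouped data, using the quantities l, f, h and C described above, followed by the mean deviation about that median computed from the class midpoints. The class intervals and frequencies are illustrative values.

```python
# Illustrative continuous frequency distribution.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [13, 22, 25, 20, 10]

total = sum(freqs)  # N, the sum of the frequencies

# Locate the median class: the first class whose cumulative
# frequency is >= N/2, then apply l + ((N/2 - C) / f) * h.
cum_before = 0  # C: cumulative frequency before the current class
for (lower, upper), f in zip(classes, freqs):
    if cum_before + f >= total / 2:
        l, h, C = lower, upper - lower, cum_before
        median = l + (total / 2 - C) / f * h
        break
    cum_before += f

# Mean deviation about the median, using the class midpoints.
midpoints = [(lo + hi) / 2 for lo, hi in classes]
md_median = sum(fi * abs(x - median) for x, fi in zip(midpoints, freqs)) / total

print("Median class lower limit:", l)
print("Median:", median)
print("Mean deviation about median:", md_median)
```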


Things to remember

  • Statistics is the science of data collection and organisation.
  • The analysis employing measures of central tendencies and measures of dispersion is used to interpret the results.
  • Bar graphs, histograms, pie charts, stem and leaf plots, line graphs, and ogives are all used to show the frequency distribution of data.
  • The data collected may be quantitative (numerical, either discrete or continuous) or qualitative (descriptive or categorical).

Sample questions

Ques. Calculate the standard deviation for the numbers 8,10,12,14,16. (3 marks)

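Ans. Mean = (8 + 10 + 12 + 14 + 16)/5 = 60/5 = 12

The deviations from the mean are -4, -2, 0, 2, 4, so the sum of the squared deviations = 16 + 4 + 0 + 4 + 16 = 40.

Variance = 40/5 = 8

Standard deviation = √8 ≈ 2.83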

Ques. Calculate the mean deviation from the mean using the given information. (3 marks)

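Size (x) | 2 | 4 | 6 | 8 | 10
Frequency (f) | 2 | 4 | 5 | 3 | 1

Ans. N = Σf = 15 and Σfx = 4 + 16 + 30 + 24 + 10 = 84, so the mean = 84/15 = 5.6.

The absolute deviations |x - 5.6| are 3.6, 1.6, 0.4, 2.4, 4.4.

Σf|x - 5.6| = 7.2 + 6.4 + 2.0 + 7.2 + 4.4 = 27.2

Mean deviation from the mean = 27.2/15 ≈ 1.81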

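Ques. The variance of five observations is 8.24, and the mean is 4.4. Find the other two observations if three of the observations are 1, 2, and 6. (5 marks)

Ans. Let's call the remaining two observations a and b.

Mean = (1 + 2 + 6 + a + b)/5 = 4.4, so a + b = 22 - 9 = 13.

Variance = (Σx²)/5 - (4.4)² = 8.24, so Σx² = 5 × (8.24 + 19.36) = 138, which gives a² + b² = 138 - (1 + 4 + 36) = 97.

Then 2ab = (a + b)² - (a² + b²) = 169 - 97 = 72, so ab = 36. The two numbers with sum 13 and product 36 are 4 and 9.

4 and 9 are the other two observations.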

Ques. The table given below shows the marks obtained by 90 students in class. What will be the mean marks of the students? Use the assumed mean method. (5 marks)

Class | 0-10 | 10-20 | 20-30 | 30-40 | 40-50
Frequency | 13 | 22 | 25 | 20 | 10

Ans. The given data is:

Class (Ci) | Frequency (fi) | Class mark (xi) | di = xi - a | fidi
0-10 | 13 | 5 | 5 - 25 = -20 | -260
10-20 | 22 | 15 | 15 - 25 = -10 | -220
20-30 | 25 | 25 = a | 25 - 25 = 0 | 0
30-40 | 20 | 35 | 35 - 25 = 10 | 200
40-50 | 10 | 45 | 45 - 25 = 20 | 200
Total | Σfi = 90 | | | Σfidi = -80

  • Let us take Assumed mean = a = 25
  • Using the Assumed Mean Method
  • Mean = a + (Σfidi / Σfi)
  • = 25 + (-80 / 90)
  • = 25 - (8 / 9)
  • = 217 / 9 ≈ 24.1

Thus, the mean marks of the students in the class are approximately 24.1.
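The assumed mean method can be checked with a short sketch. It uses the class intervals and frequencies from the table, takes a = 25 as the assumed mean, and compares the result with the mean computed directly from the class marks. The variable names are chosen for illustration.

```python
# Class intervals and frequencies from the table.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [13, 22, 25, 20, 10]

assumed = 25  # assumed mean a (class mark of the 20-30 class)

# Class marks x_i are the midpoints of the class intervals.
midpoints = [(lo + hi) / 2 for lo, hi in classes]

# Deviations d_i = x_i - a and the weighted sum of f_i * d_i.
deviations = [x - assumed for x in midpoints]
sum_fd = sum(f * d for f, d in zip(freqs, deviations))
total = sum(freqs)

mean = assumed + sum_fd / total

# Direct method for comparison: sum(f_i * x_i) / sum(f_i).
direct_mean = sum(f * x for f, x in zip(freqs, midpoints)) / total

print("Sum of f*d:", sum_fd)
print("Mean (assumed mean method):", round(mean, 2))
print("Mean (direct method):", round(direct_mean, 2))
```

Both methods give the same value, since the choice of the assumed mean only shifts the arithmetic and does not change the result.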

Ques. The average of the five integers is 20. If one of the numbers is left out, the mean is 16. Find the number that isn't included. (5 marks)

Ans. Given, n = 5, \(\bar{X}\) = 20

\(\bar{X}\) = (∑xi)/n

∑xi= 5 × 20 = 100

Thus, the sum of 5 observations = 100

Let "a" be the excluded number 

Total of 4 numbers = 100 - a

Mean of 4 numbers = (100 - a)/4
16 = (100 - a)/4
100 - a = 64
a = 36

⇒ The missing number is 36.

Ques. Consider the following numbers: 56, 60, 50, 42, 78, 40, 20. What is the median? (3 marks)

Ans. In increasing order, the numbers are 20, 40, 42, 50, 56, 60, 78.

n (number of observations) = 7 in this case.

Since n is odd, the median is the ((n + 1)/2)th observation.

As a result, (7 + 1)/2 = 4

4th observation = median

50 is the median.

Ques. Find the variance of the numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. (4 marks)

Ans. Find the mean of the 10 values given above.

Mean = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10) / 10

Mean = 55 / 10

Mean = 5.5

Value \(X_i\) | \(X_i - \bar{X}\) | \((X_i - \bar{X})^2\)
1 | -4.5 | 20.25
2 | -3.5 | 12.25
3 | -2.5 | 6.25
4 | -1.5 | 2.25
5 | -0.5 | 0.25
6 | +0.5 | 0.25
7 | +1.5 | 2.25
8 | +2.5 | 6.25
9 | +3.5 | 12.25
10 | +4.5 | 20.25
Total | 0 | 82.50

Now, to find the population variance:

\(\sigma^2 = \frac{\Sigma^n_{i=1}(X_i - \bar{X})^2}{n}\)

= 82.5/10

= 8.25

Ques. 140 cm, 150 cm, 139 cm, 150 cm, and 123 cm are the heights of five persons, find the mean height. (3 marks)

Ans. Mean height = (140 + 150 + 139 + 150 + 123)/5 = 702/5 = 140.4 cm

As a result, the mean height is 140.4 cm.

Ques. A group of students surveyed 200 homes in a locality on the number of plants they have in their homes. (5 marks)

Number of Plants | 0-2 | 2-4 | 4-6 | 6-8 | 8-10 | 10-12 | 12-14
Number of Houses | 10 | 20 | 10 | 50 | 60 | 20 | 30

Find the mean number of plants per household using the assumed mean method of calculating mean.

Ans. The data is given as:

No. of Plants | No. of Houses (fi) | xi | di = xi - a | fidi
0-2 | 10 | 1 | 1 - 7 = -6 | -60
2-4 | 20 | 3 | 3 - 7 = -4 | -80
4-6 | 10 | 5 | 5 - 7 = -2 | -20
6-8 | 50 | 7 = a | 7 - 7 = 0 | 0
8-10 | 60 | 9 | 9 - 7 = 2 | 120
10-12 | 20 | 11 | 11 - 7 = 4 | 80
12-14 | 30 | 13 | 13 - 7 = 6 | 180
Total | Σfi = 200 | | | Σfidi = 220

  • We have taken 7 as the assumed mean here.
  • Using the assumed mean method,
  • Mean = a + (Σfidi / Σfi)
  • = 7 + (220 / 200)
  • = 7 + 1.1
  • = 8.1

The required answer is 8.1 plants per household.

Ques. Find the median of the data 30, 32, 41, 44, 42, 24, 40, 49, 33. (2 marks)

Ans. Arranging the data in ascending order, we get

24, 30, 32, 33, 40, 41, 42, 44, 49

Here, n = 9, which is odd.

Median = Value of ((9 + 1)/2)th observation = Value of 5th observation = 40.

Ques. The average of the five integers is 30. If one of the numbers is left out, the mean is 15. Find the number that isn't included. (5 marks)

Ans. Given, n = 5, \(\bar{X}\) = 30

\(\bar{X}\) = (∑xi)/n

∑xi= 5 × 30 = 150

Thus, the sum of 5 observations = 150

Let "a" be the excluded number 

Total of 4 numbers = 150 - a

Mean of 4 numbers = (150 - a)/4
15 = (150 - a)/4
150 - a = 60
a = 90

⇒ The missing number is 90.

