Finite Mathematics Lesson 10

Chapters 7.5 - 7.6

The variance and the standard deviation are measures that describe how spread out the collected data is. Most scientific and business calculators can calculate these statistics. Spreadsheets such as Excel have extensive statistical features you may want to explore. I made some interactive tutorials for these sections. Check them out on the Course Content page.

Course Notes

Section 7.5   The Variance and Standard Deviation

The variance is a measure of how spread out data is about its mean. If most of the data points are close to the mean, the variance is small. If most of the data points are far from the mean, the variance is large. Let's compare two different populations that have the same mean. These are the results from the first two tests given in a math course.

Test 1   Test 2
  20       50
  40       55
  60       60
  80       65
 100       70

Mean of Test 1 = (20 + 40 + 60 + 80 + 100) / 5 = 60
Mean of Test 2 = (50 + 55 + 60 + 65 + 70) / 5 = 60

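As you can see from the test scores above, even though the means (averages) are identical, they do not reflect the same experience. So let's look at another parameter of the population called the variance (denoted by the lowercase Greek letter sigma squared, $\sigma^2$). To get it, add up the squared distances $(x_i - \mu)^2$ of each value from the mean $\mu$ and divide by the number of values N.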

The variances for each population are quite different. Remember, a smaller variance indicates that the values are closer to the mean. The standard deviation is used more often than the variance to interpret data. The symbol for the standard deviation is the Greek letter sigma, $\sigma$. It is the square root of the variance, so it is measured in the same units as the data and is easily interpreted. (Chapter 7.5)
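
To see how different the two variances are, here is a minimal Python sketch (not from the book). It assumes the usual population formula, the average of the squared distances from the mean; the function name population_variance is just my own label.

```python
from math import sqrt

def population_variance(values):
    """Average of the squared distances from the mean (divide by N)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

test1 = [20, 40, 60, 80, 100]
test2 = [50, 55, 60, 65, 70]

for name, scores in (("Test 1", test1), ("Test 2", test2)):
    var = population_variance(scores)
    # The standard deviation is the square root of the variance.
    print(name, "variance =", var, "standard deviation =", round(sqrt(var), 2))
```

Test 1 should come out with a variance of 800 (standard deviation about 28.28) and Test 2 with a variance of 50 (standard deviation about 7.07), even though both means are 60.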

The variance of a probability distribution is calculated differently. Instead of dividing the sum of the terms $(x_i - \mu)^2$ by N, we multiply each term $(x_i - \mu)^2$ by its corresponding probability $p_i$.
$\sigma^2 = (x_1 - \mu)^2 p_1 + (x_2 - \mu)^2 p_2 + \cdots + (x_N - \mu)^2 p_N$

Example  Find the variance and standard deviation of ...

Outcome   Probability
  -2         0.3
   0         0.4
   2         0.2
  12         0.1
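
Here is one way to work this example in Python. It is a sketch that assumes the table above is the probability distribution the example refers to: first find the mean by weighting each outcome by its probability, then weight each squared distance from the mean the same way.

```python
from math import sqrt

outcomes = [-2, 0, 2, 12]
probabilities = [0.3, 0.4, 0.2, 0.1]

# Mean: each outcome times its probability, added up.
mean = sum(x * p for x, p in zip(outcomes, probabilities))

# Variance: each squared distance from the mean times its probability.
variance = sum((x - mean) ** 2 * p for x, p in zip(outcomes, probabilities))

std_dev = sqrt(variance)

print("mean =", round(mean, 4))
print("variance =", round(variance, 4))
print("standard deviation =", round(std_dev, 2))
```

By hand, the mean is -0.6 + 0 + 0.4 + 1.2 = 1, the variance is 2.7 + 0.4 + 0.2 + 12.1 = 15.4, and the standard deviation is about 3.92.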

Important Remark
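The formulas for the variance of a population and the variance of a sample are different. Recall the sample is not the entire population.

sample variance: $s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$
population variance: $\sigma^2 = \frac{(x_1 - \mu)^2 + (x_2 - \mu)^2 + \cdots + (x_N - \mu)^2}{N}$

So the standard deviations are different also:

sample standard deviation: $s = \sqrt{s^2}$
population standard deviation: $\sigma = \sqrt{\sigma^2}$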
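
If you want to see the difference on real numbers, Python's standard statistics module has both versions built in. This is only an illustration, reusing the Test 2 scores from Section 7.5 and treating them first as a whole population and then as a sample.

```python
import statistics

scores = [50.0, 55.0, 60.0, 65.0, 70.0]

# Population versions divide by N.
pop_var = statistics.pvariance(scores)
pop_sd = statistics.pstdev(scores)

# Sample versions divide by n - 1.
samp_var = statistics.variance(scores)
samp_sd = statistics.stdev(scores)

print("population:", pop_var, round(pop_sd, 2))
print("sample:    ", samp_var, round(samp_sd, 2))
```

The population variance is 250/5 = 50 and the sample variance is 250/4 = 62.5, so the sample values are always a little larger for the same data.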

Chebychev's Inequality
Chebychev's Inequality helps us determine the likelihood of having extreme values in the data.  Suppose a probability distribution has mean 40 and standard deviation 2.

Chebychev's Inequality: The probability that a randomly chosen outcome lies between $\mu - c$ and $\mu + c$ is at least $1 - \sigma^2/c^2$.

Estimate the probability that the outcome is  between 30 and 50 .
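
Here is a short Python sketch of that estimate. It assumes the form of Chebychev's Inequality that says the probability of landing within c of the mean is at least 1 - σ²/c²; the function name chebyshev_lower_bound is my own.

```python
def chebyshev_lower_bound(sigma, c):
    """Lower bound on the probability of landing within c of the mean."""
    return 1 - sigma ** 2 / c ** 2

mu, sigma = 40, 2
low, high = 30, 50

# The interval 30 to 50 is centered on the mean, so c = 50 - 40 = 10.
c = high - mu
bound = chebyshev_lower_bound(sigma, c)

print("probability is at least", round(bound, 2))
```

With σ = 2 and c = 10 the bound is 1 - 4/100 = 0.96, so at least 96% of the outcomes fall between 30 and 50. Keep in mind this is only a lower bound; the true probability can be higher.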

When you are ready, click here for the assignment for this section.

Section 7.6   The Normal Distribution 

The most important distribution in statistics, a normal (or Gaussian) distribution, has probabilities that follow the familiar bell-shaped curve: 

[Figure: the bell-shaped normal curve]

A normal distribution is completely specified by giving its mean $\mu$ and its standard deviation $\sigma$.

[Figures: a normal curve with large $\sigma$ and a normal curve with small $\sigma$]

Suppose a random variable has a normal distribution:

A value of that quantity is just as likely to lie above the mean as below it (this is why the bell curve is centered on $\mu$).
A value of that quantity is less likely to occur the farther it is from the mean (this is why the bell curve decreases in both directions from $\mu$).
Values to one side of the mean are of the same probability as values at the same distance on the other side of the mean (this is why the bell curve is symmetric about $\mu$).
Values at a distance greater than $3\sigma$ from $\mu$ are possible but very unlikely (this is why the bell curve appears to hit the horizontal axis).

How do we use the standard deviation in the normal curve?
The total area under the curve is 1. So for a normal distribution curve, over 68% of the area lies within 1 standard deviation on either side of the mean, over 95% within 2 standard deviations, and over 99% within 3 standard deviations!

Let's look at our first example again.

From Test 2 above, $\mu = 60$ and $\sigma \approx 7$.
We will find:
$\Pr(\mu - 1\sigma < X < \mu + 1\sigma)$
$\Pr(\mu - 2\sigma < X < \mu + 2\sigma)$
$\Pr(\mu - 3\sigma < X < \mu + 3\sigma)$

[Figure: normal curve with mean 60, shaded within 1 standard deviation of the mean]

$\Pr(\mu - 1\sigma < X < \mu + 1\sigma)$:

$\mu + 1\sigma = 60 + 1(7) = 67$
$\mu - 1\sigma = 60 - 1(7) = 53$

$\Pr(53 < X < 67) = 0.683$

Note: $k = 1$

[Figure: normal curve shaded within 2 standard deviations of the mean]

$\Pr(\mu - 2\sigma < X < \mu + 2\sigma)$:

$\mu + 2\sigma = 60 + 2(7) = 74$
$\mu - 2\sigma = 60 - 2(7) = 46$

$\Pr(46 < X < 74) = 0.954$

Note: $k = 2$

[Figure: normal curve shaded within 3 standard deviations of the mean]

$\Pr(\mu - 3\sigma < X < \mu + 3\sigma)$:

$\mu + 3\sigma = 60 + 3(7) = 81$
$\mu - 3\sigma = 60 - 3(7) = 39$

$\Pr(39 < X < 81) = 0.997$

Note: $k = 3$
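
You can check the three probabilities above without a table. The sketch below uses NormalDist from Python's standard statistics module, with the mean 60 and standard deviation 7 from Test 2; it is just a quick check, not the method the book uses.

```python
from statistics import NormalDist

mu, sigma = 60, 7
scores = NormalDist(mu=mu, sigma=sigma)

for k in (1, 2, 3):
    low = mu - k * sigma
    high = mu + k * sigma
    # Area between low and high = area up to high minus area up to low.
    prob = scores.cdf(high) - scores.cdf(low)
    print(f"k = {k}: Pr({low} < X < {high}) = {prob:.3f}")
```

The three lines should agree with 0.683, 0.954, and 0.997. The same three numbers come out for any mean and standard deviation, because only the number of standard deviations k matters.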

Remark
When talking about testing, saying a person is within 1 standard deviation of the mean implies that the person is with about 68% of their classmates. If you were 2 standard deviations above the mean, you would be in the top 2.5% of the students. If you were 2 standard deviations below the mean, you would be in the bottom 2.5% of the students.

Using the Normal distribution
If you hadn't noticed in the book, the formula for calculating the area under the normal curve is gnarly. Look at Table 1 in Appendix A in the back of your book. There is an abbreviated version of it on page 372. Since the properties of the normal curve are the same regardless of its mean and standard deviation, we can use one table if we let $\mu = 0$ and $\sigma = 1$.

[Figures: A(z), the area under the standard normal curve to the left of z; the shaded region between z = -0.5 and z = 1.5]
A(1.5) - A(-.5) = 0.9332 - 0.3085 = 0.6247

Find the area up to 1.5, then subtract the area below -0.5 to get the area of the shaded region.  Look up A(1.5) and A(-.5) in appendix A.
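
If you don't have the table handy, the same lookup can be sketched in Python. Here A is my own stand-in for the table, assuming A(z) means the area under the standard normal curve (mean 0, standard deviation 1) to the left of z.

```python
from statistics import NormalDist

def A(z):
    """Area under the standard normal curve to the left of z."""
    return NormalDist().cdf(z)

# Area of the shaded region between z = -0.5 and z = 1.5.
area = A(1.5) - A(-0.5)

print(round(A(1.5), 4), round(A(-0.5), 4), round(area, 4))
```

The rounded values match the table: 0.9332, 0.3085, and 0.6247.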

Converting to the Normal Curve

Most of the time the standard deviation $\sigma$ is not 1 or the mean $\mu$ is not 0. Here is an easy way to convert to the standard normal curve.

$z = \frac{x - \mu}{\sigma}$  or written as  $z = (x - \mu)/\sigma$

Example #28, page 380
Suppose that IQ scores are normally distributed with $\mu = 100$ and $\sigma = 10$. What percent of the population has an IQ score of 125 or more?

According to this problem, less than 1% of the population has an IQ of 125 or over.
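
Here is a sketch of the arithmetic behind that answer, assuming we convert 125 to a z-value with z = (x - μ)/σ and then take the area to the right of it under the standard normal curve.

```python
from statistics import NormalDist

mu, sigma = 100, 10
x = 125

# Convert the IQ score to a z-value.
z = (x - mu) / sigma

# Area to the right of z = 1 minus the area to the left of z.
tail = 1 - NormalDist().cdf(z)

print("z =", z)
print("Pr(IQ >= 125) =", round(tail, 4))
```

Here z = 2.5 and the tail area is about 0.0062, which is 0.62% of the population.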

Example: Try computing the probabilities from above.

$\mu = 60$ and $\sigma = 7$ and $z = (x - \mu)/\sigma$
$z = (67 - 60)/7 = 1$
$z = (53 - 60)/7 = -1$

$\Pr(-1 < Z < 1) = A(1) - A(-1) = 0.8413 - 0.1587 = 0.6826$
We arrive at the same value!

Remark
You don't have to remember the formulas on page 352 if you can remember how to convert to the standard normal curve.
Also, I used the < symbol instead of less-than-or-equal signs to cut down on download time. See the note at the bottom of page 348.

$z = \frac{x - \mu}{\sigma}$  or written as  $z = (x - \mu)/\sigma$

When you are ready, click here for the assignment for this section.

Section 7.7   Normal Approximation to the Binomial Distribution

Read this section. This is actually useful. Look at example #2 on page 383 for a relevant math problem. Here the mean is $\mu = np$ and the standard deviation is $\sigma = \sqrt{npq}$, where $q = 1 - p$. So if you know the distribution is binomial, you can use the normal curve to approximate the binomial probabilities. Just do #1 and #3 on page 385 for practice.
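
To get a feel for how good the approximation is, here is a small Python sketch. The numbers are made up for illustration (100 tosses of a fair coin, at most 55 heads) and are not from the book. The extra 0.5 added to 55 is the usual continuity correction, which I am assuming here because the binomial counts whole numbers while the normal curve is continuous.

```python
from math import comb, sqrt
from statistics import NormalDist

# Hypothetical example: 100 tosses of a fair coin.
n, p = 100, 0.5
q = 1 - p

mu = n * p                # mean of the binomial distribution
sigma = sqrt(n * p * q)   # standard deviation of the binomial distribution

# Exact binomial probability of at most 55 heads.
exact = sum(comb(n, k) * p ** k * q ** (n - k) for k in range(0, 56))

# Normal approximation: area to the left of 55.5.
approx = NormalDist(mu, sigma).cdf(55.5)

print("mean =", mu, "standard deviation =", sigma)
print("exact =", round(exact, 4), "normal approximation =", round(approx, 4))
```

Here μ = 50 and σ = 5, and the two probabilities should agree closely. The approximation works best when n is large and p is not too close to 0 or 1.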

When you are ready, click here for the assignment for this section.


Please notify me of any errors on this page   joe@joemath.com