Science Fair Projects Ideas - Variance

Variance

This article is about mathematics. Alternate meaning: variance (land use).

In probability theory and statistics, the variance of a random variable is a measure of its statistical dispersion, indicating how far from the expected value its values typically are. The variance of a real-valued random variable is its second central moment, and also its second cumulant (cumulants differ from central moments only at and above degree 4).

Definition

If μ = E(X) is the expected value (mean) of the random variable X, then the variance is

\operatorname{var}(X)=\operatorname{E}((X-\mu)^2).

That is, it is the expected value of the square of the deviation of X from its own mean. In plain language, it is the average of the squared distance of each data point from the mean, and is therefore also called the mean squared deviation. The variance of a random variable X is typically designated as \operatorname{var}(X), \sigma_X^2, or simply σ².

Note that the above definition can be used for both discrete and continuous random variables.
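
As an illustration of the definition in the discrete case, the following minimal sketch computes the mean and variance of a fair six-sided die. The die and the helper names `expectation` and `variance` are assumptions chosen for the example and are not part of the original article; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def expectation(pmf, f=lambda x: x):
    """E[f(X)] for a discrete random variable given as {value: probability}."""
    return sum(p * f(x) for x, p in pmf.items())

def variance(pmf):
    """var(X) = E[(X - mu)^2], computed directly from the definition."""
    mu = expectation(pmf)
    return expectation(pmf, lambda x: (x - mu) ** 2)

# A fair six-sided die: each face has probability 1/6 (illustrative choice).
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expectation(die))  # 7/2
print(variance(die))     # 35/12
```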

Some distributions, such as the Cauchy distribution, do not have a variance because the relevant integral diverges. In particular, if a distribution does not have an expected value, it does not have a variance either. The converse is not true: there are distributions for which the expected value exists but the variance does not.

Properties

If the variance is defined, we can conclude that it is never negative because the squares are positive or zero. The unit of variance is the square of the unit of observation. For example, the variance of a set of heights measured in centimeters will be given in square centimeters. This fact is inconvenient and has motivated many statisticians to instead use the square root of the variance, known as the standard deviation, as a summary of dispersion.

It can be proven easily from the definition that the variance does not depend on the location of the mean μ. That is, if the variable is "displaced" by an amount b by taking X+b, the variance of the resulting random variable is unchanged. By contrast, if the variable is multiplied by a scaling factor a, the variance is multiplied by a². More formally, if a and b are real constants and X is a random variable whose variance is defined, then

\operatorname{var}(aX+b)=a^2\operatorname{var}(X).
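
The shift-and-scale rule can be checked numerically. The sketch below is only an illustration: the fair die, the constants a = 3 and b = 10, and the helper `affine` are assumptions made for the example.

```python
from fractions import Fraction

def variance(pmf):
    """var(X) for a discrete distribution given as {value: probability}."""
    mu = sum(p * x for x, p in pmf.items())
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())

def affine(pmf, a, b):
    """Distribution of a*X + b (a must be nonzero so values stay distinct)."""
    return {a * x + b: p for x, p in pmf.items()}

die = {face: Fraction(1, 6) for face in range(1, 7)}
a, b = 3, 10  # illustrative constants

lhs = variance(affine(die, a, b))   # var(aX + b)
rhs = a ** 2 * variance(die)        # a^2 var(X)
print(lhs, rhs)  # 105/4 105/4
```

The shift b does not appear on the right-hand side at all, which is the sense in which the variance ignores the location of the distribution.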

Another formula for the variance that follows in a straightforward manner from the above definition is:

\operatorname{var}(X)=\operatorname{E}(X^2) - (\operatorname{E}(X))^2.

This is often used to calculate the variance in practice.
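
A small sketch comparing this formula with the definition, again for an illustrative fair die (an assumption of the example, not something taken from the article):

```python
from fractions import Fraction

die = {face: Fraction(1, 6) for face in range(1, 7)}

mean = sum(p * x for x, p in die.items())              # E(X)
mean_square = sum(p * x ** 2 for x, p in die.items())  # E(X^2)

by_definition = sum(p * (x - mean) ** 2 for x, p in die.items())
by_shortcut = mean_square - mean ** 2

print(mean_square, by_definition, by_shortcut)  # 91/6 35/12 35/12
```

With exact fractions the two routes agree exactly. In floating-point arithmetic the second form subtracts two potentially large, nearly equal numbers, so it can lose precision when the mean is large compared with the spread.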

One reason for the use of the variance in preference to other measures of dispersion is that the variance of the sum (or difference) of independent random variables is the sum of their variances. A weaker condition than independence, called uncorrelatedness, also suffices. In general,

\operatorname{var}(X+Y) =\operatorname{var}(X) + \operatorname{var}(Y)  + 2 \operatorname{cov}(X, Y).

Here \operatorname{cov} is the covariance, which is zero for uncorrelated random variables.
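
The general formula can be verified on a small set of paired values. In the sketch below the data are made up for illustration, each pair is treated as equally likely, and the helper names `mean`, `cov` and `var` are assumptions of the example.

```python
from fractions import Fraction

def mean(values):
    return Fraction(sum(values), len(values))

def cov(xs, ys):
    """Covariance of paired values, each pair being equally likely."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def var(xs):
    """The variance is the covariance of a variable with itself."""
    return cov(xs, xs)

# Illustrative, positively correlated data.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 7]
sums = [x + y for x, y in zip(xs, ys)]

lhs = var(sums)
rhs = var(xs) + var(ys) + 2 * cov(xs, ys)
print(lhs, rhs)  # 276/25 276/25
```

Because these values are correlated, var(xs) + var(ys) alone would understate the variance of the sum; the covariance term makes up the difference.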

Population variance and sample variance

In statistics, the concept of variance can also be used to describe a set of data. When the set of data is a population, it is called the population variance. If the set is a sample, we call it the sample variance.

The population variance of a population y_i, where i = 1, 2, ..., N, is given by

\sigma^2 = \frac{1}{N} \sum_{i=1}^N  \left( y_i - \mu \right) ^ 2,

where μ is the population mean. In practice, when dealing with large populations, it is almost never possible to find the exact value of the population variance, due to time, cost, and other resource constraints.

A common method of estimating the population variance is sampling. When estimating the population variance using n random samples x_i, where i = 1, 2, ..., n, the following formula is an unbiased estimator:

s^2 = \frac{1}{n-1} \sum_{i=1}^n  \left( x_i - \overline{x} \right) ^ 2,

where \overline{x} is the sample mean.

Note that the n-1 in the denominator above contrasts with the N in the equation for the population variance. One common source of confusion is that the term sample variance and the notation s² may refer either to the unbiased estimator of the population variance given above, or to what is strictly speaking the variance of the sample, computed by using n instead of n-1.

Intuitively, computing the variance by dividing by n instead of n-1 gives an underestimate of the population variance. This is because we are using the sample mean \overline{x} as an estimate of the population mean μ, which we do not know, and the sample mean is by construction the value that minimizes the sum of squared deviations of the sample. In practice, for large n, the distinction is often a minor one.
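
The two divisors can be compared with Python's standard `statistics` module, whose `pvariance` function divides by n and whose `variance` function divides by n-1. The data below are made up for illustration.

```python
import statistics

# Illustrative sample; its mean is 5 and the squared deviations sum to 32.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(sample)

divide_by_n = statistics.pvariance(sample)         # 32 / 8
divide_by_n_minus_1 = statistics.variance(sample)  # 32 / 7

print(divide_by_n, divide_by_n_minus_1)

# The two values differ only by the factor n / (n - 1).
print(divide_by_n * n / (n - 1))
```

For n = 8 the factor n/(n-1) is about 1.14; for n = 1000 it is about 1.001, which is why the distinction matters less for large samples.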

Sample variance as an unbiased estimator

We will demonstrate why the sample variance is an unbiased estimator of the population variance. An estimator \hat{\theta} for a parameter θ is unbiased if \operatorname{E}\{ \hat{\theta}\} = \theta. Therefore, to prove that the sample variance is unbiased, we will show that \operatorname{E}\{ s^2\} = \sigma^2. We assume that all of the samples x_i have mean μ and variance σ², and that distinct samples are uncorrelated (for example, because they are drawn independently), so that \operatorname{E}\{ (x_i - \mu)(x_j - \mu) \} is σ² when i = j and zero otherwise.

\operatorname{E} \{ s^2 \}  = \operatorname{E} \left\{ \frac{1}{n-1} \sum_{i=1}^n  \left( x_i - \overline{x} \right) ^ 2 \right\}
= \frac{1}{n-1} \sum_{i=1}^n  \operatorname{E} \left\{ \left( x_i - \overline{x} \right) ^ 2 \right\}
= \frac{1}{n-1} \sum_{i=1}^n  \operatorname{E} \left\{ \left( (x_i - \mu) - (\overline{x} - \mu) \right) ^ 2 \right\}
= \frac{1}{n-1} \sum_{i=1}^n  \left[ \operatorname{E} \left\{ (x_i - \mu)^2 \right\}  - 2 \operatorname{E} \left\{ (x_i - \mu) (\overline{x} - \mu) \right\}   + \operatorname{E} \left\{ (\overline{x} - \mu)  ^ 2 \right\} \right]
= \frac{1}{n-1} \sum_{i=1}^n  \left[ \sigma^2  - 2 \left( \frac{1}{n} \sum_{j=1}^n \operatorname{E} \left\{ (x_i - \mu) (x_j - \mu) \right\} \right)  + \frac{1}{n^2} \sum_{j=1}^n \sum_{k=1}^n \operatorname{E} \left\{ (x_j - \mu) (x_k - \mu) \right\} \right]
= \frac{1}{n-1} \sum_{i=1}^n  \left[ \sigma^2  - \frac{2 \sigma^2}{n}  + \frac{\sigma^2}{n} \right]
= \frac{1}{n-1} \sum_{i=1}^n \frac{(n-1)\sigma^2}{n}
= \frac{(n-1)\sigma^2}{n-1} = \sigma^2.
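
The result can also be checked exactly on a tiny example by enumerating every possible sample. The sketch below is an illustration under stated assumptions: the four-element population, the sample size n = 3, sampling with replacement (so that the draws are independent), and the helper names are all choices made for the example.

```python
from fractions import Fraction
from itertools import product

def pvariance(values):
    """Variance with divisor N (population variance)."""
    mu = Fraction(sum(values), len(values))
    return sum((v - mu) ** 2 for v in values) / len(values)

def sample_variance(values, ddof):
    """Squared deviations from the sample mean, divided by n - ddof."""
    xbar = Fraction(sum(values), len(values))
    return sum((v - xbar) ** 2 for v in values) / (len(values) - ddof)

population = [1, 2, 4, 7]  # illustrative population
n = 3                      # illustrative sample size

# Every ordered sample of size n drawn with replacement is equally likely,
# so a plain average over all of them is the exact expectation.
samples = list(product(population, repeat=n))
expected_unbiased = sum(sample_variance(s, 1) for s in samples) / len(samples)
expected_biased = sum(sample_variance(s, 0) for s in samples) / len(samples)

sigma2 = pvariance(population)
print(sigma2, expected_unbiased, expected_biased)  # 21/4 21/4 7/2
```

Dividing by n-1 recovers σ² = 21/4 on average, while dividing by n gives (n-1)/n times σ², here 7/2.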

See also algorithms for calculating variance.

Generalizations

If X is a vector-valued random variable, with values in R^n, and thought of as a column vector, then the natural generalization of variance is E[(X − μ)(X − μ)^T], where μ = E(X) and X^T is the transpose of X, and so is a row vector. This variance is a nonnegative-definite square matrix, commonly referred to as the covariance matrix.
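
A minimal sketch of this matrix for a random vector in R^2 that takes each of a few points with equal probability. The points and the function name `covariance_matrix` are assumptions made for the example.

```python
from fractions import Fraction

def covariance_matrix(points):
    """E[(X - mu)(X - mu)^T] for equally likely points in R^n."""
    count, dim = len(points), len(points[0])
    mu = [Fraction(sum(p[i] for p in points), count) for i in range(dim)]
    return [
        [sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in points) / count
         for j in range(dim)]
        for i in range(dim)
    ]

# Illustrative points in R^2.
points = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 7)]
sigma = covariance_matrix(points)

for row in sigma:
    print([str(entry) for entry in row])
# ['2', '12/5']
# ['12/5', '106/25']
```

The diagonal entries are the variances of the individual components, the off-diagonal entries are their covariance, and the matrix is symmetric.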

If X is a complex-valued random variable, then its variance is E[(X − μ)(X − μ)*], where X* is the complex conjugate of X. This variance is a nonnegative real number.
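
For a complex-valued variable taking a few equally likely values, the formula can be evaluated with Python's built-in complex numbers. The values below are made up for illustration.

```python
# Illustrative, equally likely complex values.
values = [1 + 2j, 3 - 1j, -2 + 0j, 0 + 3j]

mu = sum(values) / len(values)

# E[(X - mu)(X - mu)*]: each term is a squared modulus, so it is real.
var = sum((z - mu) * (z - mu).conjugate() for z in values) / len(values)

# The same number is the sum of the variances of real and imaginary parts.
re_mean = sum(z.real for z in values) / len(values)
im_mean = sum(z.imag for z in values) / len(values)
var_re = sum((z.real - re_mean) ** 2 for z in values) / len(values)
var_im = sum((z.imag - im_mean) ** 2 for z in values) / len(values)

print(var, var_re + var_im)
```

The imaginary part of `var` is zero, in line with the statement that this variance is a nonnegative real number.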

History

The term variance was first introduced by Ronald Fisher in his 1918 paper The Correlation Between Relatives on the Supposition of Mendelian Inheritance.

Moment of inertia

The variance of a probability distribution is equal to the moment of inertia in classical mechanics of a corresponding linear mass distribution, with respect to rotation about its center of mass. It is because of this analogy that such things as the variance are called moments of probability distributions.
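
The analogy can be made concrete with point masses on a line. In the sketch below the positions and masses are hypothetical; dividing each mass by the total turns the mass distribution into a probability distribution, and the moment of inertia about the center of mass equals the total mass times the variance of that distribution (so the two coincide when the total mass is 1).

```python
from fractions import Fraction

# Hypothetical point masses on a line.
positions = [0, 1, 3, 6]
masses = [2, 1, 4, 1]
total_mass = sum(masses)

# Mechanics: center of mass and moment of inertia about it.
center = Fraction(sum(m * x for m, x in zip(masses, positions)), total_mass)
inertia = sum(m * (x - center) ** 2 for m, x in zip(masses, positions))

# Probability: treat mass fractions as probabilities.
probs = [Fraction(m, total_mass) for m in masses]
mean = sum(p * x for p, x in zip(probs, positions))
variance = sum(p * (x - mean) ** 2 for p, x in zip(probs, positions))

print(center == mean)                    # True
print(inertia == total_mass * variance)  # True
```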

03-10-2013 05:06:04
The contents of this article are licensed from www.wikipedia.org under the GNU Free Documentation License.