# Chi-square distribution

For any positive integer k, the chi-square distribution with k degrees of freedom is the probability distribution of the random variable

$X=Z_1^2 + \cdots + Z_k^2$

where $Z_1, \cdots, Z_k$ are independent standard normal variables (zero expected value and unit variance). This distribution is usually written

$X\sim\chi^2_k$
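The definition lends itself to a quick Monte Carlo check (a minimal sketch in standard-library Python; the helper name `chi_square_sample` is hypothetical): summing k squared standard normal draws should yield samples whose mean is close to k.

```python
import random

def chi_square_sample(k, rng):
    # Sum of k squared independent standard normal draws.
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

rng = random.Random(0)
k = 5
samples = [chi_square_sample(k, rng) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
print(round(sample_mean, 1))  # should be close to k = 5
```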

The chi-square test can be used to test independence as well as goodness of fit.

An example of a test of independence is asking whether sex and political affiliation are associated. You gather a sample, compute the expected counts under the assumption of independence, find the critical value for the appropriate degrees of freedom, and compare: if the chi-square statistic exceeds the critical value, you reject the null hypothesis; otherwise, you fail to reject it. (You never accept the null.)
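The steps above can be sketched in standard-library Python. The 2×2 counts here are hypothetical, and the 5% critical value of 3.841 for one degree of freedom is the standard tabulated value.

```python
# Hypothetical 2x2 contingency table: rows = sex, columns = party.
observed = [[30, 20],
            [20, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: (row total * column total) / grand total.
chi2_stat = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2_stat += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)  # = 1 for a 2x2 table
critical_value = 3.841  # chi-square 5% critical value for 1 degree of freedom

reject_null = chi2_stat > critical_value
print(chi2_stat, reject_null)  # 4.0 True: reject independence at the 5% level
```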

The chi-square probability density function is

$p_k(x) = \frac{(1/2)^{k/2}}{\Gamma(k/2)} x^{k/2 - 1} e^{-x/2} \quad \mbox{ for }x > 0$

and $p_k(x) = 0$ for $x \le 0$. Here $\Gamma$ denotes the gamma function. Tables of this distribution — usually in its cumulative form — are widely available (see the External links below for online versions), and the function is included in many spreadsheets (for example OpenOffice.org Calc or Microsoft Excel) and all statistical packages.
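The density can be evaluated directly from this formula (a stdlib-Python sketch; `chi2_pdf` is a hypothetical helper name), and a crude Riemann sum confirms it integrates to about 1.

```python
import math

def chi2_pdf(x, k):
    # (1/2)^(k/2) / Gamma(k/2) * x^(k/2 - 1) * exp(-x/2) for x > 0, else 0.
    if x <= 0:
        return 0.0
    return (0.5 ** (k / 2)) / math.gamma(k / 2) * x ** (k / 2 - 1) * math.exp(-x / 2)

k = 4
step = 0.01
# Crude Riemann sum over (0, 50]; the tail beyond 50 is negligible for k = 4.
area = sum(chi2_pdf(i * step, k) * step for i in range(1, 5001))
print(round(area, 3))  # close to 1
```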

If p independent linear homogeneous constraints are imposed on these variables, the distribution of X conditional on these constraints is $\chi^2_{k-p}$, justifying the term "degrees of freedom". The characteristic function of the chi-square distribution is

$\varphi(t) = (1 - 2it)^{-k/2}$

The chi-square distribution has numerous applications in inferential statistics, for instance in chi-square tests and in estimating variances. It enters the problem of estimating the mean of a normally distributed population and the problem of estimating the slope of a regression line via its role in Student's t-distribution. It enters all analysis of variance problems via its role in the F-distribution, which is the distribution of the ratio of two independent chi-square random variables.

#### The normal approximation

If $X\sim\chi^2_k$, then as k tends to infinity, the distribution of X tends to normality. However, convergence is slow (the skewness is $\sqrt{8/k}$ and the excess kurtosis is $12/k$), and two transformations are commonly used instead, each of which approaches normality faster than X itself:

Fisher showed that $\sqrt{2X}$ is approximately normally distributed with mean $\sqrt{2k-1}$ and unit variance.

Wilson and Hilferty showed in 1931 that $\sqrt[3]{X/k}$ is approximately normally distributed with mean $1 - \frac{2}{9k}$ and variance $\frac{2}{9k}$.
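Both approximations can be checked by simulation (a stdlib-Python sketch; the degrees of freedom and sample size are chosen arbitrarily): the sample means of the transformed values should sit close to the stated approximate means.

```python
import math
import random

rng = random.Random(1)
k = 10
samples = [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(100_000)]
n = len(samples)

# Fisher: sqrt(2X) is approximately N(sqrt(2k - 1), 1).
mean_fisher = sum(math.sqrt(2 * x) for x in samples) / n

# Wilson-Hilferty: (X/k)^(1/3) is approximately N(1 - 2/(9k), 2/(9k)).
mean_wh = sum((x / k) ** (1 / 3) for x in samples) / n

print(round(mean_fisher, 2), round(math.sqrt(2 * k - 1), 2))
print(round(mean_wh, 3), round(1 - 2 / (9 * k), 3))
```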

The expected value of a random variable having chi-square distribution with k degrees of freedom is k and the variance is 2k. The median is given approximately by

$k-\frac{2}{3}+\frac{4}{27k}-\frac{8}{729k^2}$
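A simulation sketch (stdlib Python, hypothetical variable names) comparing the empirical median to this series approximation:

```python
import random
import statistics

rng = random.Random(2)
k = 7
samples = [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(100_000)]

empirical_median = statistics.median(samples)
# Series approximation for the median: k - 2/3 + 4/(27k) - 8/(729 k^2).
approx_median = k - 2 / 3 + 4 / (27 * k) - 8 / (729 * k ** 2)

print(round(empirical_median, 2), round(approx_median, 2))
```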

Note that with k = 2 degrees of freedom, the chi-square distribution reduces to the exponential distribution with mean 2.

The chi-square distribution is a special case of the gamma distribution: $\chi^2_k$ is the gamma distribution with shape parameter $k/2$ and scale parameter 2.
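This can be verified pointwise (a stdlib-Python sketch with hypothetical helper names): the chi-square density with k degrees of freedom coincides with the gamma density with shape $k/2$ and scale 2.

```python
import math

def chi2_pdf(x, k):
    # Chi-square density for x > 0.
    return (0.5 ** (k / 2)) / math.gamma(k / 2) * x ** (k / 2 - 1) * math.exp(-x / 2)

def gamma_pdf(x, shape, scale):
    # Gamma density: x^(shape-1) exp(-x/scale) / (Gamma(shape) * scale^shape).
    return x ** (shape - 1) * math.exp(-x / scale) / (math.gamma(shape) * scale ** shape)

k = 6
for x in (0.5, 2.0, 5.0, 10.0):
    assert abs(chi2_pdf(x, k) - gamma_pdf(x, k / 2, 2.0)) < 1e-12
print("chi2(k) density matches Gamma(shape=k/2, scale=2) density")
```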

The information entropy is given by:

$H = -\int_0^\infty p(x)\ln(p(x))\,dx = \frac{k}{2} + \ln \left( 2 \Gamma \left( \frac{k}{2} \right) \right) + \left(1 - \frac{k}{2}\right) \psi\left(\frac{k}{2}\right)$

where $\psi(x)$ is the digamma function.
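As a numerical check (stdlib Python; `digamma` here is a hand-rolled asymptotic-series approximation, not a library call), the closed form can be compared against a direct Riemann-sum evaluation of the entropy integral:

```python
import math

def digamma(x):
    # Approximate psi(x): shift x up by the recurrence psi(x+1) = psi(x) + 1/x,
    # then apply the asymptotic series for large arguments.
    result = 0.0
    while x < 6:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    return (result + math.log(x) - inv / 2
            - inv ** 2 / 12 + inv ** 4 / 120 - inv ** 6 / 252)

def chi2_pdf(x, k):
    return (0.5 ** (k / 2)) / math.gamma(k / 2) * x ** (k / 2 - 1) * math.exp(-x / 2)

k = 4
# Closed form: k/2 + ln(2 Gamma(k/2)) + (1 - k/2) psi(k/2).
closed_form = k / 2 + math.log(2 * math.gamma(k / 2)) + (1 - k / 2) * digamma(k / 2)

# Direct Riemann-sum evaluation of H = -integral of p(x) ln p(x) over (0, 60].
step = 0.01
numeric = -sum(chi2_pdf(i * step, k) * math.log(chi2_pdf(i * step, k)) * step
               for i in range(1, 6001))
print(round(closed_form, 3), round(numeric, 3))
```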