chi-squared test

Also called: chi-square test
Related Topics: hypothesis testing

chi-squared test, a hypothesis-testing method in which observed frequencies are compared with expected frequencies for experimental outcomes.

In hypothesis testing, data from a sample are used to draw conclusions about a population parameter or a population probability distribution. First, a tentative assumption is made about the parameter or distribution. This assumption is called the null hypothesis and is denoted by H0. An alternative hypothesis (denoted Ha), which is the opposite of what is stated in the null hypothesis, is then defined. The hypothesis-testing procedure involves using sample data to determine whether H0 can be rejected. If H0 is rejected, the statistical conclusion is that the alternative hypothesis Ha is true.

The chi-squared test is such a hypothesis test. First, one selects a significance level α, the threshold below which the null hypothesis will be rejected; α is often chosen to be 0.05. The sample data are then used to compute a p-value, a measure of how likely results at least as extreme as those observed would be if the null hypothesis were true; the smaller the p-value, the less consistent the sample results are with the null hypothesis. If the p-value is less than α, the null hypothesis can be rejected; otherwise, the null hypothesis cannot be rejected.
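
This decision rule can be sketched in a few lines of Python; the p-value below is a made-up number used only to illustrate the comparison, not a value computed from real data.

alpha = 0.05      # chosen significance level
p_value = 0.03    # hypothetical p-value obtained from sample data

if p_value < alpha:
    print("Reject the null hypothesis H0")
else:
    print("Do not reject the null hypothesis H0")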

One then calculates the chi-squared value. The formula for the chi-squared test is χ2 = Σ(Oi − Ei)2/Ei, where χ2 represents the chi-squared value, Oi represents an observed value, Ei represents the corresponding expected value (that is, the value expected from the null hypothesis), and the symbol Σ represents the summation of values over all categories i. One then looks up in a table the critical chi-squared value that corresponds to the chosen significance level α and the number of degrees of freedom of the data (for a one-variable test, the number of categories of the data minus one). If that value from the table is less than the chi-squared value calculated from the data, one can reject the null hypothesis.
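
The calculation can be sketched as follows in Python. The observed and expected counts are invented for illustration, and SciPy's chi2 distribution is used in place of a printed table; that choice of tooling is an assumption, not part of the article.

from scipy.stats import chi2

observed = [44, 56]    # Oi: hypothetical observed frequencies for two categories
expected = [50, 50]    # Ei: frequencies expected under the null hypothesis

# chi-squared statistic: sum of (Oi - Ei)^2 / Ei over all categories
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1              # degrees of freedom: categories minus one
critical = chi2.ppf(1 - 0.05, df)   # table value for alpha = 0.05

if chi_sq > critical:
    print("Reject the null hypothesis")
else:
    print("Do not reject the null hypothesis")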

The two most common chi-squared tests are the one-variable goodness of fit test and the two-variable test of independence. The one-variable goodness of fit test determines whether the observed values of a single variable are consistent with a hypothesized distribution. For example, suppose a study was being conducted to measure the volume of soda in cans filled at a bottling and distribution centre. A one-variable goodness of fit test might be used to determine the likelihood that a randomly selected can of soda has a volume within a fixed range, the range of all acceptable volumes of soda in cans filled at the centre.
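
One hedged way to set up the soda-can example in code is to bin sampled cans as inside or outside the acceptable volume range and compare the counts with the proportions expected under the null hypothesis. The counts and the assumed 95%/5% split below are purely illustrative, not taken from the article.

from scipy.stats import chisquare

n = 200
observed = [188, 12]             # hypothetical counts: cans inside / outside the acceptable range
expected = [0.95 * n, 0.05 * n]  # counts expected under the assumed null hypothesis

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(stat, p_value)             # reject the null hypothesis if p_value < 0.05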

The two-variable test of independence determines whether two variables could be related. For example, a two-variable test of independence could be used to test whether there is an association between the types of books people choose to read and the season of the year in which they make their choices.
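
A minimal sketch of such a test of independence, using an invented contingency table of book type by season (both the counts and the categories are hypothetical) and SciPy's chi2_contingency function:

from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows are book types, columns are seasons
table = [
    [30, 20, 25, 25],   # fiction:    spring, summer, autumn, winter
    [20, 30, 25, 25],   # nonfiction: spring, summer, autumn, winter
]

stat, p_value, dof, expected = chi2_contingency(table)
print(stat, p_value, dof)   # reject the null hypothesis of independence if p_value < 0.05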

Ken Stewart