February 2011

Statisticians consider two events or characteristics independent when the occurrence of one makes the other neither more nor less probable. The chi-square and Fisher exact tests described below are among the most widely used tests for evaluating the independence of variables.

The classic tool for teaching probability involves placing different colored balls into a container, and then randomly removing the balls. The objective is to determine the probability that a certain color of ball will be removed. If everything were to occur “perfectly”, the balls removed would always show the same color proportions as the starting contents of the container. Of course, this cannot occur because (i) only whole balls (not a percentage of a ball) are removed, one at a time, and (ii) chance alone can cause one color of ball to be removed more often than its share of the container. With small sample sizes, the outcome of a single ball removal dramatically affects the percentage of a particular colored ball that has been pulled.
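The effect of sample size described above can be sketched in a few lines of Python. This is only an illustration: the container contents (30 red and 70 blue balls), the function name `observed_share`, and the fixed random seed are assumptions made for the example, not figures from any real study.

```python
import random

def observed_share(red, blue, n_draws, seed=0):
    """Draw n_draws balls without replacement and return the share that were red."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    urn = ["red"] * red + ["blue"] * blue
    drawn = rng.sample(urn, n_draws)   # sampling without replacement
    return drawn.count("red") / n_draws

# Hypothetical container: 30 red and 70 blue balls (starting share of red = 30%).
for n in (5, 20, 100):
    share = observed_share(30, 70, n)
    print(f"{n} balls drawn: red share {share:.2f}; one ball moves the share by {1 / n:.2f}")
```

With 5 balls drawn, a single ball moves the observed share by 20 percentage points; with all 100 drawn, the observed share necessarily equals the starting share.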

One could calculate the probability of every possible combination of balls being pulled. This is practical only when small sample sizes are involved, since the number of possibilities grows rapidly as the number of available options increases. For this reason, this method of performing the tests is usually limited to a two-by-two table with small numbers in each cell. Such a test is called the Fisher exact test (named after Ronald Fisher, the statistician who devised it). The Fisher exact test calculates the probability of observing a particular table result or group of related results. The chi-square test approximates the Fisher exact test when larger numbers of observations must be addressed.
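As a rough sketch of this enumeration for a two-by-two table, the Python below computes the probability of each possible table that shares the observed row and column totals, then adds up the probabilities of every table that is no more likely than the one observed. The function names and the example counts are hypothetical, and the two-sided rule used here (summing tables at most as probable as the observed one) is one common convention, assumed for illustration.

```python
from math import comb

def table_probability(a, b, c, d):
    """Hypergeometric probability of one 2x2 table [[a, b], [c, d]] with its totals fixed."""
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)

def fisher_exact_two_sided(a, b, c, d):
    """Sum the probabilities of all tables no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    p_observed = table_probability(a, b, c, d)
    lowest = max(0, col1 - row2)       # smallest feasible top-left count
    highest = min(row1, col1)          # largest feasible top-left count
    p_value = 0.0
    for x in range(lowest, highest + 1):
        p = table_probability(x, row1 - x, col1 - x, row2 - (col1 - x))
        if p <= p_observed * (1 + 1e-9):   # small tolerance for floating-point ties
            p_value += p
    return min(p_value, 1.0)

# Hypothetical counts, for illustration only.
print(fisher_exact_two_sided(3, 1, 1, 3))
```

The loop visits only as many tables as the smallest row or column total allows, which is why the approach stays manageable for small tables and becomes burdensome as counts and table dimensions grow.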

Both the Fisher exact test and the chi-square test evaluate the hypothesis that the two groups are unrelated. However, the chi-square test may yield unreliable results with tables that (i) contain fewer than 50 observations or (ii) contain fewer than approximately 5 observations in any cell of the table. In such circumstances, the Fisher exact test usually remains feasible and is reliable.
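The rule of thumb above can be written as a simple check. The sketch below applies the two thresholds exactly as stated in the text (50 observations overall, about 5 per cell); the function name, parameter names, and example tables are hypothetical.

```python
def chi_square_looks_unreliable(table, min_total=50, min_cell=5):
    """Apply the rule of thumb from the text to a table of observed counts."""
    cells = [count for row in table for count in row]
    return sum(cells) < min_total or min(cells) < min_cell

# Hypothetical tables, for illustration only.
small_table = [[3, 9], [7, 2]]        # 21 observations, two cells below 5
large_table = [[40, 60], [55, 45]]    # 200 observations, every cell at least 5
print(chi_square_looks_unreliable(small_table))
print(chi_square_looks_unreliable(large_table))
```

Because the thresholds are approximate, they are left as adjustable parameters rather than fixed values.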

  1. The chi-square test compares the “actual” values (values that actually were observed) with the “statistically expected” values (values calculated from the row and column totals of the actual values, which one would expect if the hypothesis were valid). Based on the differences between the actual values and the statistically expected values, the test either rejects or fails to reject the hypothesis at a chosen confidence level.
  2. The Fisher exact test calculates the probability of getting deviations at least as extreme as the “actual” values (values that actually were observed) under the hypothesis that the two characteristics being compared are not related (for example, that political tendency and largest budget shortfalls are not related). Based on that probability, the test either rejects or fails to reject the hypothesis at a chosen confidence level.
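The chi-square comparison in item 1 can be sketched as follows for a two-by-two table. The counts are hypothetical, the function name is an assumption, and the calculation omits any continuity correction; the p-value formula used here applies only to the two-by-two case (one degree of freedom).

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Return (expected values, chi-square statistic, p-value) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count in each cell if the two characteristics are unrelated.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(statistic / 2))
    return expected, statistic, p_value

# Hypothetical observed counts, for illustration only.
expected, statistic, p_value = chi_square_2x2([[30, 20], [20, 30]])
print(expected, statistic, round(p_value, 4))
```

A small p-value indicates that differences this large between actual and expected values would be unlikely if the two characteristics were unrelated.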

A separate article applies these tests to a litigation-related question: whether employment discrimination is occurring.

Fulcrum Inquiry performs statistical analyses in litigation.