Solving Everyday Puzzles Using Regression Analysis

February 2012

Regression analyses are commonly used in litigation because they can help establish both liability and damages. However, communicating this relatively sophisticated statistical analysis to a layperson (e.g., a juror) can be challenging for some experts. We present two practical applications that demonstrate regression’s power to answer real-world questions and puzzles.

1. Why Do We Still Receive Printed Phone Books?

In an age when most answers are found quickly online, the phone book has become obsolete. Nevertheless, no fewer than 34 states still require telephone companies to provide a copy of the white pages (a personal directory without the advertising opportunity of the Yellow Pages) to each of their customers every year. The remaining 16 states have passed laws eliminating this requirement, though companies in these states are still required to furnish free copies upon customer request. Why have so many states preserved this antiquated practice of distributing white pages by default rather than by customer request?

A 2008 Gallup poll found that only 11 percent of households relied upon the white pages, and this percentage has likely declined since. This low usage stands in contrast to the surprisingly large cost of producing and recycling the white pages. WhitePages, Inc. estimates that $17 million of tax-generated funds are spent yearly on recycling fees. The white pages that are not recycled, 165,000 tons annually, end up in landfills. Additionally, the phone companies must pay for the production of several million copies per year, a cost that is likely passed on to consumers. What makes this continued practice all the more puzzling is that it’s hard to think of any powerful lobby that wishes to preserve the mandatory distribution of the white pages; indeed, even WhitePages, Inc. publicly supports changing this practice.

We attempt to solve this puzzle by comparing the 16 states that abolished the mandatory distribution with the 34 states that did not. By comparing these two “types of states,” we can identify differences that might explain why the policy remains in place. We conduct this comparison systematically using a statistical technique called a probit model. A probit model is a regression specifically designed to examine the relationship between a set of explanatory variables and an outcome variable that takes either a “yes” or a “no” value. In our present case, the outcome variable of interest is whether a state abolished the phone book laws (“yes”) or did not (“no”).
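
To make the idea concrete, the following is a minimal sketch of a one-variable probit fit, not the model we estimated. It assumes Python, a single explanatory variable, and a small made-up data set; the function name fit_probit and every number in it are hypothetical. The model sets the probability of a “yes” equal to the standard normal CDF evaluated at b0 + b1 * x, and chooses b0 and b1 to maximize the likelihood of the observed outcomes (here via Fisher scoring).

```python
from statistics import NormalDist


def fit_probit(xs, ys, iterations=100, tol=1e-10):
    """Fit P(y = 1 | x) = Phi(b0 + b1 * x) by Fisher scoring (illustrative only)."""
    norm = NormalDist()  # standard normal: mean 0, standard deviation 1
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        i00 = i01 = i11 = 0.0    # expected (Fisher) information matrix
        for x, y in zip(xs, ys):
            z = b0 + b1 * x
            # Clamp the probability away from 0 and 1 to avoid division by zero.
            p = min(max(norm.cdf(z), 1e-12), 1 - 1e-12)
            density = norm.pdf(z)
            score = density * (y - p) / (p * (1 - p))
            weight = density * density / (p * (1 - p))
            g0 += score
            g1 += score * x
            i00 += weight
            i01 += weight * x
            i11 += weight * x * x
        # Solve the 2x2 system (information matrix) * step = gradient.
        det = i00 * i11 - i01 * i01
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (i00 * g1 - i01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


# Hypothetical data: x is some state-level measure, y is 1 if the state repealed.
x_values = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
repealed = [0, 0, 1, 0, 0, 1, 1, 1]

b0, b1 = fit_probit(x_values, repealed)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

A positive slope means that larger values of x go with a higher estimated probability of a “yes.” In practice one would use a statistics package, which also reports the standard errors needed to judge statistical significance.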

Three differences across states come to mind as a way of explaining a particular state’s phone book laws.

Older people, who are less technologically literate, might be more likely to prefer the white pages over the internet. This reasoning suggests that states with a greater percentage of residents over the age of 65 are less likely to repeal the phone book laws.
States with a lower percentage of people using the internet might also keep the law on the books.
States with greater rural populations might be more likely to keep the phone book laws since urban directories are more likely to become outdated quickly.

We empirically test these three theories using the probit model. Based on the model’s results, none of the three explanatory variables discussed above (i.e., percent over age 65, percent internet users, and percent urban) is significantly related to whether a particular state chooses to repeal its phone book laws. This result remains even after one considers other explanatory variables, like the state’s median income and political ideology.

In contrast, we find that state population is strongly and positively correlated with a repeal of the phone book laws. Put differently, large states are much more likely to repeal phone book laws. The reason is a simple application of fixed costs. The costs of repealing the law (the use of scarce legislative deliberation time, the research required to assess the repeal’s impact, etc.) are relatively fixed across states. However, the benefit of the repeal, namely not printing, recycling, and disposing of millions of books, is highly variable across states and grows with population. For example, the cost of repealing this law in Wyoming is not much lower than the cost of doing so in California. The number of legislators who must be convinced of the change is about the same in large and small states, and both states face practical limits on what the legislature can address each year. However, the total benefit of not printing phone books varies dramatically between the two states: phone companies will have to print 60 times as many phone books in California as in Wyoming if the law is not repealed. Unsurprisingly, states where the benefit of the repeal is greatest relative to the costs (i.e., the largest states) are the ones that have repealed the obsolete phone book laws.
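
The fixed-cost argument can be written out as simple arithmetic. The sketch below is illustrative only: it assumes the cost of repeal is identical in every state and the benefit is the same per resident, and all dollar figures and populations are hypothetical, chosen only to mirror the 60-to-1 ratio mentioned above.

```python
def net_benefit_of_repeal(population, benefit_per_resident, fixed_cost):
    """Total benefit grows with population; the cost of repeal does not."""
    return population * benefit_per_resident - fixed_cost


# All figures below are hypothetical.
FIXED_COST = 1_000_000        # assumed cost of passing a repeal, same in every state
BENEFIT_PER_RESIDENT = 1.0    # assumed yearly savings per resident

small_state_population = 500_000
large_state_population = 60 * small_state_population  # 60:1 ratio from the text

small_net = net_benefit_of_repeal(small_state_population, BENEFIT_PER_RESIDENT, FIXED_COST)
large_net = net_benefit_of_repeal(large_state_population, BENEFIT_PER_RESIDENT, FIXED_COST)

# Population at which the benefit of repeal just covers its fixed cost.
break_even_population = FIXED_COST / BENEFIT_PER_RESIDENT

print(f"small state net benefit: {small_net:,.0f}")
print(f"large state net benefit: {large_net:,.0f}")
print(f"break-even population:   {break_even_population:,.0f}")
```

Under these assumptions the small state falls below the break-even population and the large state sits far above it, which is the pattern the regression picks up.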

One may also test whether the two “types of states” discussed above are statistically different using a chi-square test and/or Fisher’s exact test (see our regression primer for more information). These alternative statistical tests also do not identify any differences that explain why the policy remains in place.
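
As a rough illustration of how Fisher’s exact test works on a two-by-two table, consider the sketch below. The row totals (16 and 34) come from the discussion above, but the split within each row is made up for the example, and the function name is our own. The test sums the hypergeometric probabilities of every table that is no more likely than the observed one, holding the row and column totals fixed.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    lowest = max(0, col1 - row2)   # smallest feasible top-left count
    highest = min(row1, col1)      # largest feasible top-left count
    observed = prob(a)
    # Small tolerance so that ties with the observed table are included.
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= observed * (1 + 1e-9)
    )


# Hypothetical counts: columns are "above" / "below" the median share of
# residents over 65; rows are states that repealed (16) and did not (34).
p_value = fisher_exact_two_sided(7, 9, 18, 16)
print(f"two-sided p-value: {p_value:.3f}")
```

A large p-value, as in this made-up table, indicates that the two groups of states are not distinguishable on that characteristic.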

2. Do People Drink More Coke as their Country Becomes Richer?

The average person in the world consumes 89 Coca-Cola beverages per year. While this datum might sound impressive, some business analysts have recently argued that this number is rather small, and represents Coke’s tremendous growth potential, particularly within emerging markets. In the United States, an unequivocally well-developed market for Coke, the average person consumes 394 soda beverages. One business analyst argues that, as incomes around the world converge, the “soda gap” will narrow. The logic for this expansion is simple and alluring: as incomes rise, people become more willing to buy luxury goods like soft drinks. It turns out, however, that this causal link is not supported empirically.

In a fairly basic regression model that uses GDP per capita to predict Coke beverage consumption, GDP per capita has no statistically significant effect on consumption after accounting for country-specific factors. (Our primer on basic regression can be found here.) This result isn’t terribly surprising when we consider that societies and cultures vary tremendously in their discretionary spending and dietary habits. For example, consumers in highly developed markets like France drink less than half as many Coke beverages as their U.S. counterparts. Apparently, the French prefer other drinks (perhaps French wine?) in spite of their high discretionary income. Conversely, Mexico, which has a GDP per capita less than one-fourth that of the U.S., consumes the most Coke beverages per capita, at 675.
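
For readers who want to see the mechanics, here is a minimal sketch of a one-variable least-squares regression with a t-statistic for the slope. It is not the model described above: it omits the country-specific factors, and the GDP and consumption figures are invented for illustration. The function name simple_ols is likewise our own.

```python
from math import sqrt


def simple_ols(xs, ys):
    """Return (slope, intercept, t-statistic of the slope) for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, used to estimate the slope's standard error.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_err = sqrt(sse / (n - 2) / sxx)
    # A perfect fit has zero standard error; report an infinite t-statistic.
    t_stat = slope / std_err if std_err > 0 else float("inf")
    return slope, intercept, t_stat


# Hypothetical data: GDP per capita (thousands of dollars) and yearly servings.
gdp_per_capita = [5, 10, 20, 35, 45, 50]
servings = [120, 400, 90, 300, 150, 380]

slope, intercept, t_stat = simple_ols(gdp_per_capita, servings)
print(f"slope = {slope:.2f}, intercept = {intercept:.1f}, t = {t_stat:.2f}")
```

As a rule of thumb, a t-statistic well below 2 in absolute value means the slope cannot be distinguished from zero at conventional significance levels, although with only a handful of observations the exact threshold is higher.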

Statistical methods like regression can be used to summarize a collection of data, to test claims about the attributes of a population, and to draw inferences about the population being studied. The intelligent use of statistics can save considerable money in compliance auditing and in data analysis for litigation.
