Statistical Models: Theory and Practice, by David A. Freedman, Cambridge University Press, New York, New York, 2005. 424 pp., $54.99 (paper). ISBN 978-0521671057.
Five out of five stars
While in many ways this is a book about the mathematics used to construct statistical models, there are some gems at the end. The first chapter is very educational, as it explains three of the best experiments ever conducted, some of which were natural. A natural experiment is one where subjects end up in the treatment or control group through circumstances outside the investigator's control, in a manner that is effectively random. The resulting data is then analyzed in order to better understand it or to fit an explanatory mathematical model.
The first is the Health Insurance Plan (HIP) study regarding the efficacy of breast cancer screening. The second is the famous analysis of the spread of cholera conducted by John Snow in 1855, decades before the emergence of the germ theory of disease. The last is the model of the causes of poverty that G. U. Yule developed from census data in the last year of the nineteenth century. These three examples serve as a primer on how valuable statistical models can be and how they are derived from data.
The titles of chapters 2 through 8 explain the mathematical contents fairly well. They are:
*) The Regression Line
*) Matrix Algebra
*) Multiple Regression
*) Path Models
*) Maximum Likelihood
*) The Bootstrap
*) Simultaneous Equations
The math is all soundly developed so that the reader will understand how it is used to create the models.
However, I found the reprints in the appendix to be by far the most interesting content. There are four of them. The first is a paper by James L. Gibson in which he examines the sources of political repression during the McCarthy era. Gibson investigates whether the primary source of repression was the political elite or the mass public.
The second reprint is a paper by William N. Evans and Robert M. Schwab that examines the relative effectiveness of public and Catholic high schools in getting students to finish high school and start college. The third reprint is a paper by Ronald R. Rindfuss, Larry Bumpass and Craig St. John that examines the relationship between a woman's education and her childbearing. The last reprint is a paper by Mark Schneider, Paul Teske, Melissa Marschall, Michael Mintrom and Christine Roch. It examines whether giving parents the opportunity to select the public schools their children attend leads to their being more involved in school programs such as the PTA.
Reading these papers gives the reader an appreciation for the breadth of problems to which mathematical and statistical models can be applied. In a world where people cannot be assigned to groups or manipulated at will, only statistical modeling can be used to evaluate and explain the consequences of public policy.
This book was made available for free for review purposes.