Why the data in the latest study on smokefree laws and heart attacks support an effect

Vivian Ho and colleagues recently published “A Nationwide Assessment of the Association of Smoking Bans and Cigarette Taxes With Hospitalizations for Acute Myocardial Infarction, Heart Failure, and Pneumonia” that concluded that “Smoking bans were not associated with acute myocardial infarction or heart failure hospitalizations, but lowered pneumonia hospitalization rates for persons ages 60 to 74 years. Higher cigarette taxes were associated with lower heart failure hospitalizations for all ages and fewer pneumonia hospitalizations for adults aged 60 to 74. Previous studies may have overestimated the relation between smoking bans and hospitalizations and underestimated the effects of cigarette taxes.”
This conclusion differs from our meta-analysis of 45 studies of 33 smokefree laws; the Institute of Medicine report, which was a combined assessment of the studies of individual laws and the biology of how and why changes in secondhand smoke exposure rapidly affect heart disease risk; and Vander Weg and colleagues’ well-done national study of the effect of smokefree laws on heart attacks and lung disease.
Ho and colleagues actually analyzed the relationship between smokefree laws and hospitalizations for heart and lung disease three different ways (labeled Models 1, 2, and 3 in their paper).  Their conclusion is based on Models 1 and 2.  In contrast, their Model 3 does show a significant drop in hospital admissions for heart attacks, congestive heart failure, and lung disease (pneumonia).  They dismiss that model and, in fact, use its results to argue that Model 1 is better.
To understand why Model 3 is actually the best model, you need to dig into the details of what they did.
They analyze the effects of smokefree laws using a “difference in differences” approach, in which they compare how much hospital admissions change in counties with smokefree laws to counties without smokefree laws.
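The arithmetic behind a difference in differences estimate can be sketched in a few lines of Python. The admission rates below are made-up numbers chosen for illustration, not values from Ho and colleagues' paper, and the function name is hypothetical.

```python
def difference_in_differences(law_before, law_after, no_law_before, no_law_after):
    """Change in counties with a law minus change in counties without one."""
    change_with_law = law_after - law_before
    change_without_law = no_law_after - no_law_before
    return change_with_law - change_without_law


# Hypothetical admissions per 10,000 people (illustrative only).
effect = difference_in_differences(
    law_before=50, law_after=42,
    no_law_before=50, no_law_after=47,
)
print(effect)  # -5: admissions fell by 5 more per 10,000 where a law passed
```

Subtracting the change in the comparison counties is what removes trends that would have happened with or without a law.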
The difference in differences approach is a relatively insensitive method because the uncertainty in the difference between two uncertain quantities is bigger than the uncertainty in either quantity alone.  This increased uncertainty (what statisticians call a loss of power) makes it harder to detect effects even when they really exist.  When a difference in differences analysis detects an effect, this loss of power doesn’t matter, but it does raise concerns about drawing negative conclusions.
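The loss of sensitivity described above can be illustrated with the standard rule for combining uncertainties. This sketch assumes the estimates being subtracted are independent, which is a simplification (repeated measurements on the same counties are usually correlated), and the standard error of 1.0 is a hypothetical value.

```python
import math


def se_of_difference(se_a, se_b):
    """Standard error of (a - b), assuming a and b are independent estimates."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


se_single = 1.0  # hypothetical standard error of one group's admission rate

# Before-versus-after change within one group of counties.
se_change = se_of_difference(se_single, se_single)

# Difference between the two groups' changes (the difference in differences).
se_did = se_of_difference(se_change, se_change)

print(round(se_change, 2), round(se_did, 2))
```

Under these assumptions the difference in differences estimate is about twice as uncertain as any single rate that goes into it, so a real effect has to be correspondingly larger to be detected.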
In addition, the way that Ho and colleagues entered time into the model, as a categorical variable (which allows the effects of time to bounce around) rather than as a continuous variable (which assumes smooth changes), may have obscured an actual effect.
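The two ways of entering time can be contrasted with a small sketch. The number of periods and the function names here are hypothetical; the point is only that a continuous coding uses one column (one slope), while a categorical coding uses a separate indicator column for every period after the first.

```python
def time_as_continuous(period):
    """One column: the model estimates a single slope for a smooth trend."""
    return [period]


def time_as_categorical(period, n_periods):
    """One indicator column per period after the first (reference) period."""
    return [1 if period == p else 0 for p in range(1, n_periods)]


n_periods = 4  # hypothetical number of time periods, numbered 0 to 3
for period in range(n_periods):
    print(period, time_as_continuous(period), time_as_categorical(period, n_periods))
```

Each indicator column gets its own coefficient, so the estimated time effect is free to jump from period to period, which is the "bouncing around" described above.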
How they quantified smokefree laws
Ho and colleagues considered laws at the county level.  This is important because, as they note, there are many states with no or weak state laws in which there are strong local laws.  (This is much better than an earlier analysis using the same overall statistical approach by Shetty and colleagues that only looked at state laws, which missed the effects of all the local laws.)  To identify counties with and without strong laws, they used the Americans for Nonsmokers’ Rights local ordinance database to classify counties using the following definitions:

  • A county was classified as having a strong law if 75% or more of the people living there were covered by a comprehensive (workplaces, restaurants and bars) smokefree law.
  • A county was classified as having no law if less than 10% of the people living there were covered by a strong law.

Other counties were excluded, as were places (like California) that had already passed strong laws before 2001.  This process left them with 1901 counties to study: 390 counties that enacted laws during the study period (2001 to 2009) and 1511 comparison counties that had no laws.
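The classification rule described above can be written as a small function. This is a sketch of the rule as stated in this post, not code from Ho and colleagues; the function and argument names are hypothetical.

```python
def classify_county(share_covered, strong_law_before_2001=False):
    """Return 'strong law', 'no law', or 'excluded' for one county.

    share_covered is the fraction (0 to 1) of county residents covered by a
    comprehensive (workplaces, restaurants and bars) smokefree law.
    """
    if strong_law_before_2001:
        return "excluded"      # e.g., places like California
    if share_covered >= 0.75:
        return "strong law"
    if share_covered < 0.10:
        return "no law"
    return "excluded"          # partial coverage: neither group


print(classify_county(0.80))  # strong law
print(classify_county(0.05))  # no law
print(classify_county(0.40))  # excluded

# The two study groups add up to the total reported in the post.
assert 390 + 1511 == 1901
```

Dropping the partially covered counties sharpens the contrast between the two groups, at the cost of leaving some of the country out of the analysis.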
Accounting for differences across counties
The whole idea, and power, of an analysis like the one Ho and colleagues did is that, in contrast to the studies of individual laws, it considers the whole country (or, in this case, much of the country), so the conclusions are based on a much larger amount of data.  Doing such a national study is difficult, though, because places differ in the accessibility of health services, how good those services are, underlying smoking rates and other sociodemographic factors, and changes over time that are happening independent of the passage of smokefree laws.
Ho and colleagues try to deal with these issues using standard approaches that are widely accepted in economics:

  • They include variables that seek to capture all the other factors that might be causing changes in hospital admissions (number of active physicians and short-term general hospital beds per capita, the percentage of the overall population in the county enrolled in a health maintenance organization [for ages 18 to 59], the percentage of Medicare beneficiaries enrolled in a Medicare Advantage program [for ages 60 and older], county-level measures of mean household income and of the percentage of the population that was male, in poverty, White, or Black, and state-level measures of the percentage of the population who self-reported their health as good or better, reported physical activity in the past month, were overweight or obese, were told by a doctor that they had high cholesterol, or had high blood pressure).
  • They include the time that observations were made.
  • They allow for different average responses in different counties.
  • They allow for different rates of change in the different counties.

Statistical problems with the way they did it: Multicollinearity
A statistical difficulty with this approach is that the multiple regression model has a huge number of possibly redundant independent variables (what statisticians call multicollinearity): in this case 1 for the smoking law, 1 for tax, 15 for the demographic variables, 15 for different times, 1901 for the different counties, and another 1901 (or possibly more) variables to allow for the fact that underlying time trends differ across different counties.  The redundant information in this huge number of predictors can make the interpretation of the effects of individual variables (in this case the smokefree law and tax) unreliable.  As Wikipedia explains:

In this situation the coefficient estimates of the multiple regression may change erratically in response to small changes in the model or the data. …, a multiple regression model with correlated predictors can indicate how well the entire bundle of predictors predicts the outcome variable, but it may not give valid results about any individual predictor, or about which predictors are redundant with respect to others.

In plain English, the redundant information among the different predictor variables splits the effect into many pieces, making the assessment of the actual effect of any one variable (such as the smokefree law) unreliable, which almost always has the effect of biasing the analysis against finding an effect even when one really exists.
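To get a sense of the scale involved, the predictor counts listed above can simply be tallied. The counts are the ones given in this post (taking the minimum of 1901 for the county time trends); the dictionary labels are my own shorthand.

```python
predictor_counts = {
    "smokefree law": 1,
    "cigarette tax": 1,
    "demographic variables": 15,
    "time indicators": 15,
    "county indicators": 1901,
    "county-specific time trends": 1901,  # "or possibly more"
}

total = sum(predictor_counts.values())
county_related = (
    predictor_counts["county indicators"]
    + predictor_counts["county-specific time trends"]
)

print(total)                                   # 3834
print(county_related)                          # 3802
print(round(100 * county_related / total, 1))  # 99.2
```

By this count, roughly 99% of the predictors are county terms, and the two variables of interest (law and tax) are 2 out of 3834.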
There is a formal assessment of multicollinearity (using something called a variance inflation factor) to detect this problem.  When there is serious multicollinearity, the usual procedure is to drop the redundant variables.  Ho and colleagues, like most economists, did not bother with testing this.
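As a minimal illustration of what a variance inflation factor measures, the sketch below handles only the simplest case of two predictors, where the VIF reduces to 1 / (1 - r²) with r the correlation between them. With thousands of predictors, each one would instead be regressed on all the others, which this sketch does not attempt. The data are made up, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(x, y):
    """Variance inflation factor when the model has exactly two predictors."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r ** 2)


predictor = [1, 2, 3, 4, 5]
nearly_redundant = [1.1, 2.0, 2.9, 4.2, 4.8]  # almost a copy of predictor
unrelated = [2, 1, 0, 1, 2]                   # uncorrelated with predictor

vif_redundant = vif_two_predictors(predictor, nearly_redundant)
vif_unrelated = vif_two_predictors(predictor, unrelated)
print(round(vif_redundant, 1), round(vif_unrelated, 1))
```

A VIF of 1 means a predictor carries no information already contained in the others; a commonly cited rule of thumb treats values above about 10 as a sign of serious multicollinearity, and the nearly redundant pair here is far above that.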
Rather, they took the approach of doing a sensitivity analysis in which they dropped different parts of the model to see what happened, hence the three models in the paper.
Model 1 is the full model with the several thousand independent variables listed above.  That model didn’t find an effect of smokefree laws (or, in many cases, taxes) on disease outcomes.
Model 2 dropped the 15 demographic variables (the first bullet above) but kept the thousands of county variables and found essentially the same results as the full model.   This result suggests that multicollinearity with these variables is not obscuring the effect of the smokefree laws and taxes.
In contrast, in Model 3, when they drop all the county variables from the model while keeping the demographics, they find that the laws (and taxes) are associated with fewer hospital admissions for heart attacks and other outcomes.  This raises the possibility that they are “over-correcting” for county differences.  Indeed, because Models 1 and 2 allow for different changes across counties, my guess is that this is exactly what is happening.
Did Ho and colleagues ignore this issue entirely?
Rather than directly testing for multicollinearity using variance inflation factors, they followed another widely used qualitative approach, which is to look at the relationship between the independent variables (in this case, smokefree laws and taxes) and some outcome that people know is not associated with changes in smoking, in this case hospitalization for hip fractures.  They found a statistically significant association with hip fractures in Model 3 (but not Models 1 and 2), which they interpreted as evidence that the significant associations with laws (and taxes) in Model 3 were spurious correlations.
The problem with this logic is that smoking increases the risk of hip fractures.
Thus, using the standard that Ho and colleagues applied, their sensitivity analysis leads to the conclusion that Model 3 is superior to Model 1, because Model 1 missed the association with hip fractures while Model 3 detected it.
In other words, this paper actually supports the conclusion that smokefree laws and taxes reduce hospitalizations.
The bottom line
With regard to the individual city and state studies, Ho and colleagues are right that one needs to be cautious about interpreting any one study of any one place.  The fact is, however, that a large number of studies using a variety of methodologies in a wide range of places have found remarkably consistent results: smokefree laws reduce hospital admissions and ambulance calls.  The better of these studies do control for underlying time trends, and the fact that each study is of a single place avoids the need to add the thousands of control variables that Ho and colleagues included.  The fact that there is a dose-response, with stronger laws being associated with bigger effects, is also important.
That is why, considering the evidence as a whole, we can still be confident that smokefree laws (and tax increases) are rapidly followed by drops in hospital admissions.