The Statistical Alchemy of Meta-Analyses


A recent post by Dr. Wes reminded me of the remarkable article Alvan Feinstein wrote in 1995, “Meta-Analysis: Statistical Alchemy for the 21st Century.”  In a few clearly written pages, the founding father of clinical epidemiology brilliantly identifies the wishful thinking underlying meta-analysis and exposes its methodological fallacies.

Feinstein begins by reminding the reader of the four necessary requirements for acceptable scientific evidence.  Translated to clinical research, these become 1) that the population under investigation be identified reliably (“in a reproducible manner”); 2) that the relevant characteristics be homogeneous; 3) that comparisons performed between subgroups of the population be unbiased (internal validity); 4) that the evidence obtained be extrapolated to a broader population (external validity).

Because meta-analyses necessarily fail on one or more of these requirements, the wished-for results can never produce better information than the trials upon which they are constructed—hence the analogy with alchemy.

With clear prose and dry humor, Feinstein proceeds to systematically uncover flaw after flaw in the meta-analytic approach.  Among such shortcomings are the inevitable problem of publication bias, the difficulty of reconciling the diverse statistical methods used in the original studies (meta-analyses are usually content just to identify them), and the fact that meta-analyses necessarily obscure the descriptive details of the population under study, dealing a severe blow to external validity.

Incidentally, this last point is almost always a problem with randomized clinical trials to begin with:

The information needed to demarcate a pertinent “clinical resemblance” group, however, has usually been inadvertently ignored or systematically avoided in most randomized trials.

Feinstein suspects this perennial problem in clinical trials and meta-analyses to be a case of cui bono:

The ordinary randomized trial answers questions about the average efficacy of Treatment A vs Treatment B in the group under study.  A meta-analysis answers questions about the average efficacy of those average efficacies or (if the original data are checked) the average efficacy in the pooled large group.

Pharmaceutical companies, regulatory agencies, and public-policy makers may be satisfied with those average results, but practicing clinicians and patients are not.
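The distinction Feinstein draws between an “average of averages” and the average in the pooled group can be made concrete with a small sketch. The trial counts below are invented purely for illustration (they do not come from Feinstein's paper), and the two functions are deliberately naive: real meta-analyses generally use weighted estimators rather than either of these.

```python
# Hypothetical trial data, invented for illustration only.
# Each tuple: (treated_events, treated_n, control_events, control_n)
TRIALS = [
    (5, 50, 15, 50),        # small trial, large apparent benefit
    (90, 1000, 100, 1000),  # large trial, small apparent benefit
]


def risk_difference(treated_events, treated_n, control_events, control_n):
    """Absolute difference in event rates: control rate minus treated rate."""
    return control_events / control_n - treated_events / treated_n


def mean_of_trial_differences(trials):
    """Unweighted 'average of the average efficacies' across trials."""
    diffs = [risk_difference(*trial) for trial in trials]
    return sum(diffs) / len(diffs)


def pooled_difference(trials):
    """'Average efficacy in the pooled large group': sum the raw counts first."""
    treated_events = sum(t[0] for t in trials)
    treated_n = sum(t[1] for t in trials)
    control_events = sum(t[2] for t in trials)
    control_n = sum(t[3] for t in trials)
    return risk_difference(treated_events, treated_n, control_events, control_n)


print(f"mean of per-trial differences: {mean_of_trial_differences(TRIALS):.3f}")
print(f"difference in pooled group:    {pooled_difference(TRIALS):.3f}")
```

With these made-up counts, the two summaries differ by more than a factor of five (about 0.105 versus about 0.019), and neither says anything about which kind of patient benefited in which trial.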

Furthermore, much of what happens in real life, such as mid-course changes or additions to the initial therapy, is usually “censored” by meta-analytical methods.  And meta-analyses are ill-equipped for studying “soft” endpoints such as functional performance and quality of life, which have added relevance to clinical practice.

Feinstein elaborates on the inconsistent “statistical tactics” used in meta-analyses to report results: a laundry list of possible statistical descriptors inviting ambiguity, and disagreement among statisticians as to the most appropriate theoretical and mathematical model to guide the design of meta-analyses.

But the most common issue, in Feinstein’s view, is the ubiquitous practice of expressing effects of treatments as ratios (e.g., relative risk, odds ratio, or proportionate increment) without direct reference to the underlying event rate.  In fairness, this tendency to ignore absolute rates seems to have somewhat diminished in the last 15 years.  At least when it comes to major clinical trials, mention of absolute benefit or number-needed-to-treat is usually found in the body of the report or in the abstract (though rarely in the conclusion or summary).  But absolute effects are still paid little attention in meta-analyses, where hazard ratios and proportionate increments remain the measuring norm.
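To see why a ratio without its underlying event rate can mislead, consider a minimal sketch using the standard definitions of relative risk, absolute risk reduction, and number-needed-to-treat. The event rates are hypothetical, chosen only so that both scenarios share the same relative risk.

```python
def effect_measures(control_rate, treated_rate):
    """Return (relative risk, absolute risk reduction, number needed to treat).

    Assumes the treatment lowers the event rate, so the reduction is positive.
    """
    relative_risk = treated_rate / control_rate
    absolute_risk_reduction = control_rate - treated_rate
    number_needed_to_treat = 1 / absolute_risk_reduction
    return relative_risk, absolute_risk_reduction, number_needed_to_treat


# Hypothetical event rates (control, treated), for illustration only.
SCENARIOS = {
    "high-risk patients": (0.20, 0.10),
    "low-risk patients": (0.002, 0.001),
}

for label, (control_rate, treated_rate) in SCENARIOS.items():
    rr, arr, nnt = effect_measures(control_rate, treated_rate)
    print(f"{label}: RR={rr:.2f}  ARR={arr:.3f}  NNT={nnt:.0f}")
```

Both scenarios report the same relative risk of 0.5, yet roughly 10 high-risk patients must be treated to prevent one event, against roughly 1,000 low-risk patients. A report that gives only the ratio hides that difference.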

Feinstein continues his indictment of meta-analyses by listing common-sense principles that are virtually always ignored:  1) in order to draw any meaningful conclusion, variables must be selected because they reflect biologically relevant notions of homogeneity, not because they can be conveniently measured and classified; 2) meta-analyses cannot claim wider applicability because they study more heterogeneous populations; 3) if a meta-analysis must hint at causality, it must give particular attention to a consistent effect among the individual studies.  Inconsistencies of effect cannot be “buried in the statistical agglomeration.”  Yet rare is the meta-analysis that does not precisely include studies going “both ways” before forcefully concluding that a cause and effect relationship was established.

In the concluding pages of the paper, Feinstein comments upon the application of meta-analytic methods to non-randomized observational studies, introducing the subject starkly:

I admire the courage of the group who undertake this assignment, which resembles the attempt of a quadriplegic person trying to climb Mount Everest unaided.

Yet true to his calling as scientist and educator, Feinstein does not limit the discourse to sarcastic comments.  He references numerous ways to improve the quality of clinical studies (many of which he identified in the course of his career) and adds:

Some epidemiologists have urged that the “use of meta-analysis in observational research be abandoned”.  I cannot disagree with the scientific distress that evokes this suggestion, but I do not like scientific censorship, and I realize that such analyses will continue to be done.  Consequently, I believe the best and most productive approach is to develop criteria for scientific quality of individual studies.  If any excellent individual studies can be found or done, they can then be used for a “best-evidence synthesis” or “methodological analysis” that focuses on the few studies that seem unequivocally good.  This selectivity seems much more attractive than combining fruits, vegetables, and everything else into a standard meta-analysis that lacks even the scientific precautions offered by individual results from randomized trials.

Whether these comments were directed at the likes of the Cochrane Collaborators, we cannot say.  But a few paragraphs later, Feinstein brings up the topic of statistical abuse of power:

The professional apprehension extends into fears of a statistogenic form of censorship for scientific research.  Epidemiologists have worried that meta-analyses may “stifle further studies of controversial topics” [22], and clinical investigators are concerned that Centers for Meta-Analysis may become too officious or even terroristic (“an obstetrical Baader-Meinhof gang” [23]).

As it turned out, Feinstein was quoting prophetic voices in the wilderness.  But fifteen years later it is more than additional research that is at risk of being censored.  Meta-analytic nonsense is now quoted in the mainstream media as factual truth by prominent academics to justify specific forms of health care rationing.  But should we really be surprised if the congressional sausage machine relies on statistical mishmash and academic hodgepodge to plan healthcare into the future?  After all, cui bono?

2 thoughts on “The Statistical Alchemy of Meta-Analyses”

    • I’m sure Feinstein bothered a lot of people and was probably marginalized from the field he helped establish.

      Patanalysis….I like it…

