Endgames Statistical Question

How to read a funnel plot in a meta-analysis

BMJ 2015; 351 doi: https://doi.org/10.1136/bmj.h4718 (Published 16 September 2015) Cite this as: BMJ 2015;351:h4718
  1. Philip Sedgwick, reader in medical statistics and medical education, Institute for Medical and Biomedical Education, St George’s, University of London, London, UK
  2. Louise Marston, senior research statistician, Department of Primary Care and Population Health and Priment Clinical Trials Unit, University College London, London

Correspondence to: P Sedgwick p.sedgwick{at}sgul.ac.uk

Researchers undertook a meta-analysis of the effects of home blood pressure monitoring on blood pressure levels. Randomised controlled trials were included if home or “self” monitoring was compared with standard monitoring in the healthcare system. Participants were patients with essential hypertension, followed for two to 36 months. The main outcomes included measurements of systolic and diastolic blood pressure and the achievement of hypertension targets.1

Eighteen trials were eligible for inclusion. When the results of the trials were combined, home monitoring resulted in significantly lower systolic blood pressure than standard monitoring (mean difference 4.2 mm Hg, 95% confidence interval 1.5 to 6.9) and significantly lower diastolic blood pressure (2.4 mm Hg, 1.2 to 3.5). Home monitoring patients were more likely to achieve predetermined targets (relative risk 1.11, 1.00 to 1.11). The researchers presented funnel plots for the outcomes of systolic and diastolic blood pressure (figure). Egger’s test gave P=0.038 for systolic blood pressure and P=0.095 for diastolic blood pressure.

Figure 1. Funnel plots for the meta-analysis of the effects on blood pressure of home monitoring compared with standard monitoring in the healthcare system

It was concluded that home monitoring results in lower blood pressure than standard monitoring. Although the difference in blood pressure between the two methods was small, it may contribute to an important reduction in vascular complications in the hypertensive population.

Which of the following statements, if any, are true?

  • a) Failure to include in the meta-analysis all of the relevant trials that have been conducted may have been due to reporting bias

  • b) A funnel plot can suggest whether relevant trials were not included in the meta-analysis only as a result of publication bias

  • c) The funnel plots for systolic and diastolic blood pressure indicate that not all of the relevant trials that have been conducted were identified

  • d) The result of Egger’s test indicates that asymmetry exists in the funnel plot for the outcome of systolic blood pressure

Answers

Statements a, c, and d are true, whereas b is false.

The aim of the meta-analysis was to investigate the effects on blood pressure of home monitoring compared with standard monitoring in the healthcare system. The outcomes included systolic and diastolic blood pressure. The purpose of the meta-analysis was to combine the sample estimates of the treatment effects (difference between home and standard monitoring in outcome) to give a total overall estimate of the population parameter for each outcome, thereby reducing a large amount of information to a manageable quantity. The population parameter is the difference in outcome (treatment effect) between home monitoring and standard monitoring that would be observed in the population if both methods were applied to all members. However, the researchers may not have identified all the relevant trials that had been conducted. If so, the total overall estimates produced by the meta-analysis would probably overestimate the population parameters. Failure to include all relevant trials in a meta-analysis can be due to reporting bias (a is true), a collective term for various types of bias.2 Reporting bias occurs when the reporting of research findings is influenced by the nature and direction of trial results, and it includes publication bias, language bias, citation bias, and time lag bias. Failure to include all of the relevant studies that have been conducted in a meta-analysis is often wrongly attributed solely to publication bias.
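
As an illustration of how sample estimates can be combined into a single overall estimate, the following minimal sketch uses inverse-variance (fixed effect) weighting. This is one common approach and is an assumption here; the text above does not say which pooling model the researchers used. The function name `pool_fixed_effect` and the example trial values are hypothetical and are not taken from the meta-analysis.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance (fixed effect) pooled estimate with a 95% confidence interval.

    Assumption: each trial is weighted by 1 / SE^2, so more precise trials
    contribute more to the overall estimate.
    """
    if not effects or len(effects) != len(std_errors):
        raise ValueError("effects and std_errors must be non-empty and the same length")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    half_width = 1.96 * pooled_se
    return pooled, pooled - half_width, pooled + half_width


# Hypothetical mean differences (mm Hg) and standard errors for three trials;
# these are NOT the trials from the meta-analysis discussed above.
example_effects = [6.0, 3.0, 4.5]
example_std_errors = [3.0, 1.5, 2.0]

pooled, lower, upper = pool_fixed_effect(example_effects, example_std_errors)
print(f"pooled {pooled:.2f} mm Hg (95% CI {lower:.2f} to {upper:.2f})")
```

In this sketch the trial with the smallest standard error dominates the pooled value, which is why the omission of small trials with unfavourable results can shift the overall estimate.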

Failure to include in the meta-analysis all of the relevant trials that have been conducted can be shown graphically using the funnel plot. The funnel plot may show bias resulting from various sources, including all types of reporting bias, but it is not possible to identify which of the reporting biases may be present. Researchers often incorrectly indicate that the purpose of the plot is to detect whether trials were not included in the meta-analysis solely because of publication bias (b is false).

In the above meta-analysis a separate funnel plot was presented for each of the outcomes of systolic and diastolic blood pressure. Each funnel plot was a scatter plot of the estimated effect size (mean difference in blood pressure between home monitoring and standard monitoring) on the horizontal axis against the reciprocal of the standard error of the estimated effect on the vertical axis, for the trials identified. The standard error provides a measure of the precision of the effect size as an estimate of the population parameter.3 Typically, trials with smaller sample sizes produce less precise estimated effects. As sample size increases the precision of the estimated effect increases and the size of the standard error decreases, and therefore the reciprocal of the standard error increases in size. Hence, trials with less precise estimated effects scatter more widely at the bottom of the plot. If the samples for the trials were selected from the population at random, the estimated treatment effects would be expected to scatter around the total overall estimate of the meta-analysis (represented by the vertical line on the plot). As sample size increases, because the precision of the estimated effects increases, the spread of points would be expected to narrow and the scatter plot would resemble a funnel.
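
The coordinates of such a plot can be sketched as follows. This is a minimal illustration that assumes each trial reports a mean difference with a symmetric 95% confidence interval, so that the standard error can be approximated as the interval width divided by 2 × 1.96. The function names `se_from_ci` and `funnel_coordinates` and the three example trials are hypothetical; only the pooled systolic result (4.2 mm Hg, 1.5 to 6.9) comes from the text above.

```python
def se_from_ci(lower, upper, z=1.96):
    """Approximate standard error from a symmetric confidence interval.

    Assumption: the interval is estimate +/- z * SE (z = 1.96 for 95%).
    """
    return (upper - lower) / (2.0 * z)


def funnel_coordinates(trials):
    """Return (effect, 1/SE) pairs: the horizontal and vertical position of each trial."""
    points = []
    for effect, lower, upper in trials:
        se = se_from_ci(lower, upper)
        points.append((effect, 1.0 / se))
    return points


# Standard error implied by the pooled systolic result reported above.
pooled_se = se_from_ci(1.5, 6.9)
print(f"implied standard error: {pooled_se:.2f} mm Hg")

# Hypothetical trials as (mean difference, lower limit, upper limit) in mm Hg.
hypothetical_trials = [
    (8.0, -3.8, 19.8),  # small trial: wide interval, low precision
    (5.0, -0.9, 10.9),
    (4.0, 2.0, 6.0),    # large trial: narrow interval, high precision
]
for effect, precision in funnel_coordinates(hypothetical_trials):
    print(f"effect {effect:.1f} mm Hg, precision (1/SE) {precision:.2f}")
```

The narrowest interval gives the largest value of 1/SE, so that trial would sit highest on the plot.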

Sometimes the standard error, rather than the reciprocal of the standard error, is plotted on the vertical axis. Because trials with larger sample sizes produce more precise estimated effects and therefore smaller standard errors, the vertical axis may be inverted—with zero at the top—so that the scatter of points resembles a funnel. Measures of precision of the estimated effects other than the standard error are sometimes used, including the reciprocal of the sample size or variance of the estimated effect. Sometimes lines are superimposed on a funnel plot to resemble the limits of the predicted funnel shape in the estimated effects, thereby aiding visual interpretation.
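
One common convention for the superimposed lines is to draw pseudo 95% limits at the overall estimate plus or minus 1.96 standard errors, with the standard error on the vertical axis. The sketch below assumes that convention; it is not stated in the text above that the researchers drew their plots this way. The function name `funnel_limits` and the grid of standard errors are hypothetical, and the value 4.2 mm Hg is the pooled systolic estimate reported earlier.

```python
def funnel_limits(pooled_effect, std_errors, z=1.96):
    """Return (se, left_limit, right_limit) for each standard error, smallest first.

    Assumption: in the absence of bias, about 95% of trials with a given
    standard error fall within pooled_effect +/- z * se.
    """
    return [
        (se, pooled_effect - z * se, pooled_effect + z * se)
        for se in sorted(std_errors)
    ]


# Hypothetical grid of standard errors (mm Hg) around the pooled systolic estimate.
for se, left, right in funnel_limits(4.2, [0.0, 1.0, 2.0, 3.0]):
    print(f"SE {se:.1f}: limits {left:.2f} to {right:.2f}")
```

The limits meet at the overall estimate when the standard error is zero and widen as it grows, which gives the funnel outline when the vertical axis is inverted.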

If all of the relevant trials that have been conducted were included in a meta-analysis, a funnel plot would be expected to be symmetrical in shape; that is, the points would be scattered in the shape of a funnel centrally around the total overall estimated effect. If not all of the relevant trials were included then the plot would be asymmetrical. Assessment of symmetry in a funnel plot is typically subjective. Any assessment is particularly difficult when the number of trials is small; funnel plots are thought to be unreliable methods of investigating potential bias if there are fewer than 10 studies.

Visual inspection of the funnel plots (figure) in the above meta-analysis suggests asymmetry for both systolic and diastolic blood pressure. It therefore seems that not all of the relevant trials that had been conducted were included in the meta-analysis (c is true). For both systolic and diastolic blood pressure, studies seem to be missing at the bottom of the plot towards the left hand side. Such studies would probably be trials with large standard errors and small sample sizes, with a mean blood pressure for standard monitoring that was lower than for home monitoring. However, it is only an assumption that these studies were ever undertaken.

Formal statistical tests exist for assessing asymmetry in a funnel plot, including Egger’s test. The null hypothesis for Egger’s test is that symmetry exists in the funnel plot, with the alternative indicating that asymmetry is present. The P value for Egger’s test was 0.038 for systolic blood pressure and 0.095 for diastolic blood pressure. Hence there was evidence of asymmetry at the 5% level of significance in the funnel plot for systolic blood pressure (d is true) but not for diastolic blood pressure. Although there was a discrepancy between the visual inspection of the funnel plot and Egger’s test result for diastolic blood pressure, the test result should be interpreted in the context of visual inspection of the funnel plot. Sometimes statistical tests for detecting asymmetry in a funnel plot have low statistical power.
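
Egger’s test is usually described as a linear regression of each trial’s standardised effect (effect divided by its standard error) on its precision (the reciprocal of the standard error); an intercept that differs from zero suggests asymmetry. The following minimal sketch assumes that formulation and ordinary least squares. It is not necessarily the implementation the researchers used. The function name `egger_regression` and the example data are hypothetical. The P value would be obtained by comparing the t statistic with a t distribution on n − 2 degrees of freedom, a step omitted here.

```python
import math


def egger_regression(effects, std_errors):
    """Regress effect/SE on 1/SE by ordinary least squares.

    Returns the intercept (the asymmetry measure), the slope, the standard
    error of the intercept, the t statistic for the intercept, and the
    degrees of freedom (n - 2).
    """
    n = len(effects)
    if n != len(std_errors) or n < 3:
        raise ValueError("need at least three trials with matching standard errors")
    x = [1.0 / se for se in std_errors]                 # precision
    y = [e / se for e, se in zip(effects, std_errors)]  # standardised effect
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    se_intercept = math.sqrt((residual_ss / df) * (1.0 / n + x_mean ** 2 / sxx))
    if se_intercept > 0:
        t_stat = intercept / se_intercept
    else:
        # Perfect fit: the t statistic is unbounded.
        t_stat = math.copysign(math.inf, intercept)
    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "t": t_stat,
        "df": df,
    }


# Hypothetical data in which less precise trials report larger effects.
example_effects = [9.0, 7.5, 6.0, 4.8, 4.1, 3.9]
example_std_errors = [5.0, 4.0, 3.0, 2.0, 1.2, 0.8]
result = egger_regression(example_effects, example_std_errors)
print(f"intercept {result['intercept']:.2f}, t {result['t']:.2f} on {result['df']} df")
```

Because the t statistic depends on the number of trials through its degrees of freedom, a test based on few trials can fail to detect asymmetry that is apparent on visual inspection.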

As described above, asymmetry in a funnel plot may be caused by reporting bias. However, it can also be the result of poor methodological design in the trials identified—for example, the lack of blinding to treatment allocation, which makes the measurements prone to ascertainment bias.4 Typically, poor methodological design results in estimated treatment effects being spuriously inflated. Poor methodological design is a common problem in trials with small sample sizes and it leads to an absence of studies on one side at the base of the funnel, resulting in asymmetry in the funnel plot. This might explain the asymmetry of the funnel plots in the above meta-analysis.

If, based on the funnel plot, it is suspected that not all relevant trials have been included in a meta-analysis, the effect sizes and standard errors for those studies thought to be missing can be predicted using a method called “trim and fill.” The researchers reported that when potential bias in the identified studies was adjusted for, the total overall estimates were attenuated with a mean difference of 2.2 mm Hg (−0.9 to 5.3) for systolic blood pressure and 1.9 mm Hg (0.6 to 3.2) for diastolic blood pressure. As predicted, failure to identify all the relevant trials resulted in overestimation of the effects on blood pressure of home monitoring compared with standard monitoring.

Footnotes

  • Competing interests: None declared.

References
