EPID600 (Spring 2013) module XIV. Data analysis and interpretation

In experimental science it is not uncommon for a single study to be taken as a conclusive demonstration of a causal relation, partly because experiments can more readily isolate causal relations and are more amenable to replication. In epidemiologic studies, the myriad of factors that cannot be tightly controlled and the difficulty of replication place a greater emphasis on synthesizing findings from multiple studies - treating the body of literature as a set of data to be analyzed. This task is conducted through systematic reviews, which increasingly include a formal meta-analysis.

A systematic review attempts to identify the universe of studies on a question and to synthesize their findings to establish what has been demonstrated, what remains uncertain, and what methodological concerns need to be addressed. Where multiple studies present quantitative findings on the same relation, a meta-analysis provides a statistical summary of these findings. For example, the risk ratios from multiple studies can be combined into a summary risk ratio, on the theory that each study represents an observation, influenced by sampling variability, of an underlying parameter. By combining the evidence from all of the studies, the summary risk ratio can yield a more precise estimate of this underlying parameter than any single study provides. In this way, a group of studies, each of which had too little statistical power to observe a statistically significant relation, may collectively provide strong support for the existence of the relation and a more precise estimate of its size.
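One common way to form such a summary is fixed-effect, inverse-variance weighting of the log risk ratios. The sketch below illustrates the idea only; it is not necessarily the method used in either review. The function name and the four studies are hypothetical and were chosen so that no single study is statistically significant on its own.

```python
import math

def pooled_risk_ratio(studies, z=1.96):
    """Fixed-effect inverse-variance pooling on the log scale.

    studies: list of (risk_ratio, standard_error_of_log_risk_ratio).
    Returns (pooled_rr, lower_limit, upper_limit).
    """
    weights = [1.0 / se ** 2 for _, se in studies]      # more precise studies weigh more
    log_rrs = [math.log(rr) for rr, _ in studies]
    pooled_log = sum(w * y for w, y in zip(weights, log_rrs)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))           # smaller than any single study's SE
    return (math.exp(pooled_log),
            math.exp(pooled_log - z * pooled_se),
            math.exp(pooled_log + z * pooled_se))

# Hypothetical data (not taken from either review): (RR, SE of ln RR)
hypothetical_studies = [(0.60, 0.40), (0.55, 0.35), (0.70, 0.45), (0.50, 0.50)]

# Upper 95% confidence limit of each study taken alone
single_uppers = [math.exp(math.log(rr) + 1.96 * se) for rr, se in hypothetical_studies]

pooled_rr, pooled_lower, pooled_upper = pooled_risk_ratio(hypothetical_studies)
print("single-study upper limits:", [round(u, 2) for u in single_uppers])
print("pooled RR and 95% CI:", round(pooled_rr, 2), round(pooled_lower, 2), round(pooled_upper, 2))
```

In this made-up example every single-study confidence interval includes 1, whereas the pooled interval excludes it, which is the situation described in the paragraph above.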

This case study considers two systematic reviews of male circumcision and HIV risk. The first review, by Moses et al. (1994), considered studies available as of the early 1990s and relied on the level of knowledge about HIV that was available at that time. The newer review, by Byakika-Tusiime (2008), considers the studies conducted since the first review, in light of the greater biological understanding of HIV pathogenesis. The newer review also includes a meta-analysis that gives a quantitative summary of the evidence from the newer studies.

An ever-present concern in reading the literature is the possibility of selective publication of studies, a phenomenon often termed "publication bias". Typically, "significant" findings are more likely to be published than are non-significant findings, because journals and authors believe them to be more interesting. Small studies are particularly susceptible to publication bias, since a small study that does not observe a significant relation also provides only limited evidence that the association does not exist, and therefore contributes very little to knowledge. In contrast, the estimate from a large study will have a narrow confidence interval so that even if no association has been observed, the confidence limits provide evidence against the existence of a strong association. Large studies are also more expensive, which is another reason they are more likely to be published even if "negative".
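The contrast between a small and a large "negative" study can be made concrete with a short calculation. The sketch below computes a crude risk ratio and an approximate (Wald) 95% confidence interval on the log scale. The function name and all counts are hypothetical; both studies have the same risk (10%) in each group, so both observe RR = 1.

```python
import math

def risk_ratio_ci(a, n1, c, n0, z=1.96):
    """Crude risk ratio with a Wald confidence interval on the log scale.

    a cases among n1 exposed persons; c cases among n0 unexposed persons.
    """
    rr = (a / n1) / (c / n0)
    se = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)   # SE of ln(RR)
    return rr, math.exp(math.log(rr) - z * se), math.exp(math.log(rr) + z * se)

# Hypothetical counts: identical risks, very different study sizes
small = risk_ratio_ci(5, 50, 5, 50)
large = risk_ratio_ci(500, 5000, 500, 5000)

print([round(x, 2) for x in small])   # [1.0, 0.31, 3.24]
print([round(x, 2) for x in large])   # [1.0, 0.89, 1.12]
```

The small study is compatible with anything from a threefold decrease to a threefold increase in risk, so it says little either way. The large study rules out all but a weak association.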

Systematic reviews and meta-analyses endeavor to avoid publication bias by identifying all studies that have been carried out, even unpublished ones. They may also carry out analyses designed to reveal if publication bias is present. The most likely scenario for publication (or reporting) bias is that small studies are reported only when positive, whereas large studies are reported without regard to their findings. Under this scenario, a summary measure of association calculated from large studies will not be biased but a summary measure calculated from small studies will exaggerate the underlying association.
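This scenario can be illustrated with a small simulation. The sketch below rests on several assumptions that are not part of the case study: a true risk ratio of 0.8, a standard error of ln(RR) of 0.5 for every small study and 0.1 for every large study, and "positive" defined as an upper 95% confidence limit below 1. Because all studies within a group share one standard error, the inverse-variance summary for the group reduces to the simple mean of the log risk ratios.

```python
import math
import random

def simulate_reporting_bias(true_rr=0.8, n_small=2000, n_large=200,
                            se_small=0.5, se_large=0.1, seed=1):
    """Simulate study estimates and summarize them with and without selective reporting."""
    rng = random.Random(seed)
    true_log = math.log(true_rr)
    # Each study's ln(RR) is the true value plus sampling error
    small = [rng.gauss(true_log, se_small) for _ in range(n_small)]
    large = [rng.gauss(true_log, se_large) for _ in range(n_large)]
    # Assumed rule: a small study is reported only if its upper 95% limit is below RR = 1
    reported_small = [y for y in small if y + 1.96 * se_small < 0]

    def summary_rr(log_rrs):
        return math.exp(sum(log_rrs) / len(log_rrs))

    return {
        "true_rr": true_rr,
        "all_small": summary_rr(small),
        "reported_small": summary_rr(reported_small),
        "large": summary_rr(large),
        "n_reported_small": len(reported_small),
    }

result = simulate_reporting_bias()
for name, value in result.items():
    print(name, round(value, 3))
```

Under these assumptions the summary from the large studies stays close to the true value, while the summary from the reported small studies lies much farther from 1, because only the most extreme small-study estimates pass the reporting rule.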

Figure 2 in the review by Byakika-Tusiime provides a "funnel plot" as a graphical check for publication bias. The funnel plot displays the strength of each observed association against the standard error of its estimate. The standard error is primarily determined by study size and indicates the amount of information available for the estimate (larger and/or more statistically efficient studies produce smaller standard errors, yielding more information). If estimates showing stronger associations tend to have larger standard errors than do estimates showing weaker associations, a reasonable inference is that studies with weak estimated associations and large standard errors have not been reported. (Note that when the measure of association is a ratio measure, its natural log is plotted, which makes the distribution symmetrical.)
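When a paper reports only a ratio measure and its 95% confidence limits, the two coordinates of a funnel-plot point can be approximated as sketched below. The standard error is recovered from the width of the interval on the log scale, which assumes the interval was computed as ln(RR) plus or minus 1.96 standard errors. The function name and the numbers are hypothetical and are not taken from Figure 2.

```python
import math

def funnel_point(rr, lower, upper, z=1.96):
    """Return (ln RR, SE of ln RR) for one study from its RR and confidence limits."""
    log_rr = math.log(rr)
    se = (math.log(upper) - math.log(lower)) / (2 * z)   # CI width on the log scale is 2*z*SE
    return log_rr, se

# Hypothetical study: RR = 0.5 with 95% CI 0.25 to 1.0
log_rr, se = funnel_point(0.5, 0.25, 1.0)
print(round(log_rr, 3), round(se, 3))

# On the log scale, RR = 0.5 and RR = 2.0 lie equally far from the null value (ln 1 = 0)
print(math.isclose(math.log(0.5), -math.log(2.0)))
```

The second check shows why the log is plotted: a halving and a doubling of risk are equally distant from the null on the log scale, but not on the original ratio scale.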

1. Many different types of studies have been used to investigate the relation of male circumcision to HIV risk. Do the two reviews (Moses et al. and Byakika-Tusiime) differ in the types of study designs covered? How do these differences affect the strength of the evidence from the two reviews for inferring a causal relation between male circumcision and HIV infection?

2. Both reviews examine male circumcision and HIV risk, but do they define the topic identically?

3. Both papers report findings from Cameron et al., 1989 (Table 1, study B1 in Moses et al.; Table 1 in Byakika-Tusiime). Use the data for Cameron et al. given in the Byakika-Tusiime table to show the calculation of the crude RR shown in each review article.

4. Moses et al. discuss results of statistically nonsignificant studies ("studies finding no association", p207,c1); Byakika-Tusiime does not refer specifically to statistical significance. Which of the studies included in the Byakika-Tusiime review would be characterized as statistically nonsignificant (based on the adjusted RR)?

5. Sometimes authors summarize evidence concerning an association by comparing the number of "positive" (significant association was observed) to the number of "negative" (significant association not observed) studies. What is the limitation of this approach?

6. In Moses et al. (p207,c1), the authors describe a potential form of bias in a study from Rwanda: men with HIV may be circumcised after infection as a form of treatment, so that men who were uncircumcised at the time of infection are misclassified as circumcised.

What would be the impact of this misclassification on the observed association between male circumcision and HIV?
A cohort study would not be susceptible to bias from this source. How could the bias be avoided in a cross-sectional study?

7. Both reviews address the potential effect that misclassification of exposure (circumcision status) may have on study results (Moses et al. p206,c2, para 2; Byakika-Tusiime p838,c2, para 2). Is it likely that exposure misclassification was nondifferential or differential with respect to the outcome (HIV status)? What is the likely effect of this misclassification on the estimate of relative risk? Is there evidence presented in Byakika-Tusiime that such a bias has occurred?

8. Based on the information in Figure 1 and Table 1 in Byakika-Tusiime, which of these studies provides a more precise estimate of effect, Cameron et al. 1989 or Lavreys et al. 1999? Provide the basis for your answer.

9. Based on the information in Figure 1 in Byakika-Tusiime, how does the effect measure reported by Kumwenda (2001) compare to the overall pooled relative risk?

10. Byakika-Tusiime investigates the potential effect of publication bias on the results of the meta-analysis. According to figure 2 and the text, are there studies "missing" due to publication bias? Is there a particular type of study (in terms of study size or effect measure) systematically "missing"? What effect would the missing studies have on the estimated pooled relative risk and our conclusion regarding causality?

11. Each point in figure 2 of Byakika-Tusiime represents one of the studies shown in figure 1.

Which study citation (author, year) from figure 1 corresponds to the point farthest to the right in figure 2? How can you tell?
Which study citation (author, year) from figure 1 corresponds to the point farthest to the left in figure 2?

12. Both reviews consider each of the Bradford Hill criteria for causal inference. One of the most striking differences between the two reviews is their discussion of the experimental evidence criterion. How has the body of evidence changed in the 15 years between these reviews in this regard? Why do you think Moses et al. held the opinion about experimental evidence which they expressed?

13. The World Health Organization is considering a recommendation to encourage circumcision for all uncircumcised males ages 12-60 years in countries with HIV prevalence above 1% for this demographic group. Do you regard the evidence presented by the articles for this case study as sufficient to support the recommendation? Briefly support your opinion. (See http://www.malecircumcision.org/ for more resources)