Abstract

The claim that circumcision reduces the risk of sexually transmitted infections has been repeated so frequently that many believe it is true. A systematic review and meta-analyses were performed on studies of genital discharge syndrome versus genital ulcerative disease, genital discharge syndrome, nonspecific urethritis, gonorrhea, chlamydia, genital ulcerative disease, chancroid, syphilis, herpes simplex virus, human papillomavirus, and contracting a sexually transmitted infection of any type. Chlamydia, gonorrhea, genital herpes, and human papillomavirus are not significantly impacted by circumcision. Syphilis showed mixed results, with studies of prevalence suggesting intact men were at greater risk and studies of incidence suggesting the opposite. Intact men appear to be at greater risk for genital ulcerative disease while at lower risk for genital discharge syndrome, nonspecific urethritis, genital warts, and the overall risk of any sexually transmitted infection. In studies of general populations, there is no clear or consistent positive impact of circumcision on the risk of individual sexually transmitted infections. Consequently, the prevention of sexually transmitted infections cannot rationally be interpreted as a benefit of circumcision, and any policy of circumcision for the general population to prevent sexually transmitted infections is not supported by the evidence in the medical literature.

1. Background

The earliest report of circumcision status as a potential risk factor for sexually transmitted infections (STIs) was published in 1855 by Hutchinson, who noted that in men who were treated for STIs (primarily gonorrhea and syphilis), Jews were less likely to have syphilis [1]. This report is still referenced by circumcision proponents as a validation of their claim that circumcision prevents STIs, but the converse of Hutchinson’s finding, namely that when compared to Gentiles, Jews were at greater risk for gonorrhea, is typically ignored.

The claim of reduction of the risk of STIs to justify neonatal circumcision continues today, often supported by selective bibliographies [2–12]. When the entire medical literature is reviewed, these claims become difficult to substantiate. The American Academy of Pediatrics 1999 Task Force on Circumcision concluded that “evidence regarding the relationship of circumcision to STD in general is complex and conflicting.” [13] In 2012, using a selective bibliography, consistent with the practices of circumcision proponents, the American Academy of Pediatrics concluded that “evaluation of current evidence indicates that the health benefits of newborn male circumcision outweigh the risks; furthermore, the benefits of newborn male circumcision justify access to this procedure for families who choose it. Specific benefits from male circumcision were identified for the prevention of urinary tract infections, acquisition of HIV, transmission of some STIs, and penile cancer.” [12] Within the body of the statement, the committee admitted that they were unable to precisely measure the benefits of infant circumcision and unable to quantify the risks. The committee completed its review of the medical literature in April 2010 and published its findings in August 2012.

To shed some light on this contentious issue and on whether the conclusion reached by the committee reflects the information available in the medical literature, this paper provides a systematic review of the association between male circumcision status and the risk for individual types of STIs (other than human immunodeficiency virus (HIV)) and the overall risk for any STI. While a number of review articles and systematic reviews of the association between male circumcision and individual types of STIs have been published [14–21], many of these need updating, while others have methodological shortcomings. This is also the first systematic review to explore the overall risk of contracting any STI.

2. Methods

The recommendations of Stroup et al. for the meta-analysis of observational studies were followed [22]. Articles were identified using a MEDLINE search and a review of references in published articles. The MEDLINE search, using PubMed, was undertaken on December 3, 2012. “Circumcision” was used as a key word, which identified 5472 articles. Cohort studies, cross-sectional studies, and case-control studies were eligible for inclusion. The individual STIs included genital discharge syndrome (identified in studies as a generic term for gonorrhea, genital infections with Chlamydia trachomatis, and nonspecific (nongonococcal) urethritis in which the primary symptom was a urethral discharge) versus genital ulcerative disease (identified in studies as a generic term for syphilis, genital herpes, chancroid, and other genital ulcers noted on physical examination), genital discharge syndrome (GDS), nonspecific or nongonococcal urethritis (NSU), gonorrhea, genital infections with Chlamydia trachomatis, genital ulcerative disease (GUD), chancroid, syphilis, genital herpes or serology for herpes simplex virus type 2 (HSV), genital human papillomavirus (HPV) infections, and an STI of any type. For inclusion, publications needed to be in a peer-reviewed journal or government publication and present data on the circumcision status of males both with and without a specific STI or an STI in general. Studies primarily of men having sex with men or HIV-infected men were excluded. Within a study, identifiable men having sex with men and HIV-infected men were excluded from analysis, while heterosexual and HIV-negative men were included.

Articles meeting the inclusion criteria were read to determine the number of circumcised men with the illness, the number of circumcised men without the illness, the number of intact men with the illness, and the number of intact men without the illness. The primary analysis was performed using raw data, when available, for the published studies. In some cases, the raw data were obtained through back calculation with the information available in the article. Where raw data were not available, reported odds ratios, relative risks, and confidence intervals were used.
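
The four counts described above form a 2×2 table from which an odds ratio follows directly. The sketch below is only an illustration: the counts are hypothetical, and the confidence interval uses the Woolf (logarithmic) approximation rather than the exact odds ratios used in this paper's statistical methods. The function name and arguments are assumptions introduced for the example.

```python
import math

def odds_ratio_woolf(intact_ill, intact_well, circ_ill, circ_well, z=1.96):
    """Odds ratio (intact vs. circumcised) with an approximate 95% CI.

    Uses the Woolf log-based interval, which is an approximation and
    not the exact method referred to in the paper.
    """
    # Cross-product ratio of the 2x2 table.
    ratio = (intact_ill * circ_well) / (intact_well * circ_ill)
    # Standard error of ln(OR): root of the summed reciprocal cell counts.
    se = math.sqrt(1 / intact_ill + 1 / intact_well + 1 / circ_ill + 1 / circ_well)
    lower = math.exp(math.log(ratio) - z * se)
    upper = math.exp(math.log(ratio) + z * se)
    return ratio, lower, upper

# Hypothetical counts for illustration only (not taken from any included study).
ratio, lower, upper = odds_ratio_woolf(30, 70, 20, 80)
```

An odds ratio above 1 in this orientation means the illness is more common in intact men; below 1, more common in circumcised men.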

When distinct strata of the subjects within a study showed differing outcomes, each stratum was considered separately in calculating the summary effect.

When data from the same population were published in more than one publication, the study that reported the outcome of interest as a primary result, or else the most recent report, was used.

Analyses of studies assessing disease incidence were conducted separately from studies of disease prevalence.

The impact of the type of study population was determined by separating the studies into those studying high-risk populations, such as attendees of sexually transmitted disease clinics and long-distance truck drivers in Africa, and those studying general populations. The impact of circumcision prevalence in the study population on the association between circumcision status and the prevalence of the various STIs was assessed using meta-regression.

Several studies meeting the inclusion criteria contained obvious forms of differential bias. A number of methods were employed to minimize the bias. Several older studies had inappropriate control groups [23–25]. For example, Hand used men without any exposure to STIs as controls [25]. In an attempt to control for exposure to STIs, men with a particular STI were compared to all men presenting for evaluation for the possibility of an STI.

The three randomized clinical trials of adult male circumcision in Africa failed to adjust for lead-time bias. Men in these trials who were assigned to immediate circumcision were instructed to either not engage in sexual activity or use condoms with all sexual contacts until the circumcision healed (approximately, from 4 to 6 weeks). Analyses that included these trials were conducted with the reported data and with the data adjusted for a six-week lead-time bias.
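
The text does not spell out the arithmetic of the six-week adjustment. One plausible reading, offered here purely as an assumption, is that each man in the immediate-circumcision arm contributes about six weeks less time at risk, so that amount is removed from the arm's person-time before the incidence rate is computed. All numbers and function names below are hypothetical.

```python
def adjust_person_years(person_years, n_men, lead_weeks=6.0, weeks_per_year=52.0):
    """Remove an assumed period of abstinence from an arm's person-time.

    ASSUMPTION: every man in the arm was not at risk for `lead_weeks`;
    the paper does not state how its adjustment was calculated.
    """
    return person_years - n_men * lead_weeks / weeks_per_year

def incidence_rate(events, person_years):
    """Events per person-year."""
    return events / person_years

# Hypothetical trial arm: 520 men, 1000 person-years of follow-up, 47 infections.
raw_person_years = 1000.0
adjusted_person_years = adjust_person_years(raw_person_years, n_men=520)
raw_rate = incidence_rate(47, raw_person_years)
adjusted_rate = incidence_rate(47, adjusted_person_years)
```

Under this reading, shrinking the denominator raises the incidence rate in the circumcised arm, which pulls any apparent protective effect toward the null.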

Other adjustments were needed specifically for the studies of HPV. Studies of the prevalence of genital HPV infections were separated into those identifying clinical infections with genital warts and those with diagnosis by culture, serology, biopsy, or polymerase chain reaction. Several studies reported separate data for all HPV infections and for infections with high-risk HPV that are potentially oncogenic. Consequently, two separate analyses were run on the latter group. In both analyses, the data from studies reporting only one set of data were used. In the first analysis, the data on all HPV infections were used, while the second analysis used the data on infections with high-risk HPV.

Previous analyses have found that the studies of HPV were prone to two forms of bias [16, 26–28]. The first was sampling bias. Several studies have found that circumcised men are more likely to have genital warts or have positive lesions or positive swabs on the penile shaft than intact men [29–35]. Consequently, studies that sampled only the glans or the urethra would underestimate the incidence and prevalence of HPV infection in circumcised males.

For example, in the study published by VanBuskirk et al., if only the glans is sampled, only 66.1% of the intact men with genital HPV would be identified, while only 45.2% of the circumcised men with genital HPV would be identified [32]. To adjust for the impact of this sampling bias, separate analyses were performed by multiplying the number of infections identified in studies that only sampled the glans by 1.514 in intact males and 2.212 in circumcised males.
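
The two multipliers are close to the reciprocals of the detection fractions quoted from VanBuskirk et al. The sketch below checks that relationship and applies the multipliers to hypothetical counts. The reciprocal of 66.1% comes to roughly 1.513 rather than 1.514, so the paper's figure was presumably derived from unrounded data; that is an assumption, and the function names are illustrative only.

```python
def detection_multiplier(fraction_detected):
    """Inflation factor implied by detecting only a fraction of infections."""
    return 1.0 / fraction_detected

def adjust_glans_only(count_intact, count_circumcised,
                      multiplier_intact=1.514, multiplier_circumcised=2.212):
    """Scale infection counts from glans-only studies by the paper's multipliers."""
    return (count_intact * multiplier_intact,
            count_circumcised * multiplier_circumcised)

# Reciprocals of the detection fractions reported in the text.
implied_intact = detection_multiplier(0.661)
implied_circumcised = detection_multiplier(0.452)

# Hypothetical counts: 100 infections detected in each group by glans-only sampling.
adjusted_intact, adjusted_circumcised = adjust_glans_only(100, 100)
```

Because the circumcised multiplier is larger, the adjustment always moves the odds ratio for intact men downward relative to the unadjusted glans-only data.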

The second is misclassification bias. Studies that rely on the patient report of circumcision status can often inaccurately identify the circumcision status of the participants. This has also been found to be a significant factor in previous analyses of HPV infections [16, 27, 28]. Finally, a separate analysis was conducted of studies of the prevalence of high-risk HPV in which the circumcision status of males was determined by physical examination and HPV was diagnosed by either serology or culture, biopsy, or polymerase chain reaction, with multiple site sampling including the shaft of the penis.

In one study, two testing methods for syphilis were used: the RPR results were used in this analysis [36].

2.1. Statistical Methods

For studies of disease prevalence, a general variance-based random-effects model was performed using each study’s exact odds ratios (Proc-LogXact, version 5.0, Cytel Software Corporation, Cambridge, MA) as described previously [16]. DerSimonian and Laird random-effects summary results and between-study heterogeneity were calculated using the general variance-based method [37].
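
For readers unfamiliar with the DerSimonian and Laird estimator, a minimal sketch follows. It takes log odds ratios and their variances as inputs, which is an assumption about data layout; the paper's own calculations used exact odds ratios from dedicated software, and this code is not a reproduction of them. The input values are hypothetical.

```python
import math

def dersimonian_laird(log_ors, variances):
    """Random-effects pooled log odds ratio by the DerSimonian-Laird method."""
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    # Fixed-effect (inverse-variance) pooled estimate.
    fixed = sum(w * y for w, y in zip(weights, log_ors)) / total_w
    # Cochran's Q: the chi-square measure of between-study heterogeneity.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    c = total_w - sum(w ** 2 for w in weights) / total_w
    # Method-of-moments between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, log_ors)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return {"pooled_log_or": pooled, "se": se, "tau2": tau2, "q": q, "df": df}

# Two hypothetical studies with equal variance and different effects.
result = dersimonian_laird([0.0, 1.0], [0.1, 0.1])
summary_or = math.exp(result["pooled_log_or"])
```

When the heterogeneity statistic does not exceed its degrees of freedom, the between-study variance is set to zero and the result coincides with the fixed-effects estimate.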

Poisson regression was used to assess studies of disease incidence. Fixed-effects summary results were calculated using Poisson regression. If between-study heterogeneity was significant ( ), random-effects summary results were calculated using the general variance-based method [37].
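
For a single study with one binary covariate (intact versus circumcised), a Poisson regression with person-time as the offset yields the crude incidence rate ratio, so the core quantity can be sketched without a regression library. The event counts and person-years below are hypothetical, and the Wald interval on the log scale is a standard approximation rather than the paper's reported output.

```python
import math

def incidence_rate_ratio(events_intact, py_intact, events_circ, py_circ, z=1.96):
    """Rate ratio (intact vs. circumcised) with an approximate 95% CI."""
    ratio = (events_intact / py_intact) / (events_circ / py_circ)
    # Standard error of ln(rate ratio) under a Poisson model.
    se = math.sqrt(1.0 / events_intact + 1.0 / events_circ)
    lower = math.exp(math.log(ratio) - z * se)
    upper = math.exp(math.log(ratio) + z * se)
    return ratio, lower, upper

# Hypothetical cohort: 20 vs. 10 infections over 1000 person-years in each group.
rate_ratio, rr_lower, rr_upper = incidence_rate_ratio(20, 1000.0, 10, 1000.0)
```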

Sensitivity analyses of prevalence data for type of study population were performed through separate analyses for each population type. The impact of the type of study population, performance of a study in Africa, the prevalence of circumcision in the study population, and, for HPV, sampling only the glans of the penis and determination of circumcision status by physical examination was estimated using meta-regression [38].

To test for potential outliers, the dataset from each publication was individually excluded from the analysis to measure the impact on the chi-square measure of between-study heterogeneity. The exclusion of a study would be justified by a reduction of the between-study heterogeneity chi-square by a statistically significant amount (e.g., for one degree of freedom, a change in the chi-square value of more than 3.84). Sensitivity analysis was performed with each of these studies excluded and with the two most outlying studies excluded.
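
The leave-one-out procedure can be sketched as follows, assuming each dataset is summarized by a log odds ratio and its variance and that heterogeneity is measured with Cochran's Q under inverse-variance weights. The study values are hypothetical. The final line simply reproduces the arithmetic reported in the Results for the any-STI analysis, where excluding one study lowered the chi-square from 303.00 to 99.59.

```python
def cochran_q(log_ors, variances):
    """Chi-square measure of between-study heterogeneity."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))

def leave_one_out(log_ors, variances, threshold=3.84):
    """For each study, the drop in Q when it is excluded and whether it exceeds the threshold."""
    q_full = cochran_q(log_ors, variances)
    rows = []
    for i in range(len(log_ors)):
        q_rest = cochran_q(log_ors[:i] + log_ors[i + 1:],
                           variances[:i] + variances[i + 1:])
        drop = q_full - q_rest
        rows.append((i, drop, drop > threshold))
    return rows

# Hypothetical data: three similar studies and one far from the rest.
rows = leave_one_out([0.10, 0.12, 0.09, 1.50], [0.04, 0.04, 0.04, 0.04])
largest = max(rows, key=lambda row: row[1])

# Arithmetic from the reported any-STI result: 303.00 before, 99.59 after exclusion.
reported_drop = 303.00 - 99.59
```

Removing any study lowers Q somewhat, so in practice the studies are ranked by the size of the drop and the threshold identifies those whose contribution is statistically significant.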

Publication bias was assessed using funnel graphs and linear regression analysis as described by Egger and associates [39], funnel plot regression as described by Macaskill et al. [40], and the adjusted rank correlation test described by Begg and Mazumdar [41]. Adjustment for publication bias was performed using the “trim and fill” method described by Duval and Tweedie [42, 43]. Poisson regression and meta-regression were performed using SAS version 8.02 (SAS Institute, Cary, NC).
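
Of the methods listed, the Egger regression is the simplest to illustrate: the standardized effect (log odds ratio divided by its standard error) is regressed on precision (the reciprocal of the standard error), and an intercept that differs from zero suggests funnel asymmetry. The sketch below uses ordinary least squares from the standard library and hypothetical data constructed so that smaller studies show larger effects. It returns the t statistic without a P value and is not the SAS procedure used in the paper.

```python
import math

def egger_regression(effects, std_errors):
    """Regress effect/SE on 1/SE; returns (intercept, slope, t statistic, df)."""
    x = [1.0 / s for s in std_errors]
    y = [e / s for e, s in zip(effects, std_errors)]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = residual_ss / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))
    # Guard against a zero standard error in noise-free toy data.
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, slope, t_stat, n - 2

# Hypothetical studies: the effect grows with the standard error (small-study effect).
std_errors = [0.1, 0.2, 0.25, 0.5]
effects = [0.5 + s for s in std_errors]
intercept, slope, t_stat, df = egger_regression(effects, std_errors)
```

The toy data are exactly linear, so the intercept is 1 and the slope 0.5 by construction; the t statistic is not meaningful for such noise-free input.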

3. Results

3.1. Search Results

The MEDLINE search identified 91 articles meeting the inclusion criteria. Of these, several reported on redundant study populations [44–55]. Twenty-one studies were identified through searches of bibliographies [1, 23, 25, 56–73]. Several studies had collected data that would have met the inclusion criteria but did not report their results in a manner that allowed their inclusion in the analyses [70, 74–81]. The study by Rakwar et al. deserves special comment [70]. While this study was focused primarily on HIV infections, it also collected data on circumcision status and the prevalence and the incidence of GUD, GDS, chlamydia, gonorrhea, syphilis, HSV, genital warts, and chancroid. It did not include the results of these diseases by circumcision status. In a meta-analysis by Weiss et al., this study’s results for chancroid are reported, but the study’s results for HSV and syphilis are not [15].

The characteristics of the studies included for analysis and the types of STIs they studied are listed in Table 1. There were five studies that compared prevalence rates of circumcision in those with GUD with those with GDS. In the study by Nasio et al., only men who were not HIV infected were included. There were ten studies that documented prevalence rates of GDS. There was one study that documented incidence rates of GDS [82]. Twelve studies documented the prevalence of NSU. Three studies addressed the incidence and fourteen studies addressed the prevalence of genital Chlamydia trachomatis. Of the studies addressing gonorrhea, three studies looked at incidence and twenty-two looked at the prevalence. Two studies looked at the incidence of GUD, while twelve looked at the prevalence. For syphilis, there were three studies looking at incidence and twenty-seven studies looking at prevalence. For HSV, four studies looked at incidence and twenty-seven at prevalence. All four studies of chancroid documented prevalence. Of the studies of genital HPV, fourteen documented the prevalence of visible genital warts, seven documented the incidence, and twenty-one documented the prevalence of HPV infections. Some studies have looked at clearance rates of HPV from the penis, but these were not part of this analysis [35, 55, 83, 84]. Four studies looked at the incidence of contraction of any STI versus no STI, and twenty looked at prevalence.

3.2. Meta-Analysis Results

The results of the analyses of incidence data are shown in Table 2. Of note, when adjusted for lead-time bias, no statistically significant differences were noted in GDS, gonorrhea, syphilis, or any STI. GUD was significantly more common in intact men. For chlamydia, HSV, and HPV, intact men were at higher risk, but when adjusted for lead-time bias, the differences were no longer statistically significant. There was no evidence of significant between-study heterogeneity for any of these analyses.

The results of the analyses of prevalence data are shown in Tables 3–14. All of the analyses showed significant between-study heterogeneity. Intact men were found to be at significantly greater risk for GUD versus GDS, GUD, syphilis, and any HPV, while at significantly lower risk for NSU and genital warts. No significant differences were seen for chlamydia, gonorrhea, HSV, chancroid, or high-risk HPV. There was a trend for intact men to be at lower overall risk for an STI, which was statistically significant when a clear outlier study was removed [85].

3.3. Outliers

The results of testing an individual publication’s impact on between-study heterogeneity are shown in Table 15. Identifying and excluding the two studies with the greatest impact on between-study heterogeneity brought the overall between-study heterogeneity to within an acceptable range ( ) for GUD versus GDS, GDS, chlamydia, GUD, chancroid, and HPV but not NSU, gonorrhea, syphilis, HSV, genital warts, or any STI. Exclusion of studies did not change the conclusions of summary effect, with only a few exceptions. In the analysis of genital warts, the removal of either the study by Oriel [29] or Wilson [23] made the negative association between intact men and genital warts statistically significant. A similar impact was seen with HPV. In the analysis of any type of HPV, exclusion of the study by Vaccarella et al. brought the between-study heterogeneity within an acceptable range [86]. In the analysis of the prevalence of chancroid, exclusion of the study by Hart [69] brought the between-study heterogeneity to within an acceptable range and reversed the trend in the association. The most notable outlier was in the analysis of any STI, where the exclusion of the study by Langeni [85] resulted in a drop in the between-study heterogeneity chi-square of 203.41 ( ) from 303.00 to 99.59. Consequently, two analyses of the prevalence of any sexually transmitted infection were conducted: one with and one without this study.

3.4. Sensitivity Analysis

Sensitivity analyses were not performed for the evaluation of risk of GUD versus GDS or chancroid because of the small number of studies. Sensitivity analysis comparing disease prevalence in studies of high-risk populations and general populations is shown in Table 16. Of note, the association between intact men and the various STIs was consistently stronger in studies of high-risk populations. Intact men in general populations were at statistically significantly lower risk of disease for GDS, NSU, genital warts, and any STI with Langeni [85] excluded and showed no statistically significant difference in risk for chlamydia, gonorrhea, syphilis, HSV, HPV, and any STI with Langeni [85] included. Intact men were at greater risk of GUD in both general and high-risk populations. In high-risk populations, intact men were at significantly greater risk for GDS and syphilis and showed no significant difference in risk for NSU, chlamydia, gonorrhea, HSV, genital warts, HPV, or any STI. Between-study heterogeneity was within an acceptable range ( ) for high-risk populations for GDS and chlamydia and for general populations for gonorrhea and genital warts.

3.5. Meta-Regression Analysis

Meta-regression was not performed for the evaluation of risk of GUD versus GDS, chancroid, or the studies of disease incidence because of the small number of studies.

3.5.1. High-Risk versus General Populations

Meta-regression methods found that the population type (general versus high risk) was notable ( ) for studies assessing the prevalence of GDS ( and   ), NSU ( and   ), and GUD ( and ). For the GDS studies, the summary effects were OR = 0.78 (95% CI = 0.62–0.96) for the general populations and OR = 1.11 (95% CI = 0.87–1.40) for the high-risk populations. For the NSU studies, the summary effects were OR = 0.61 (95% CI = 0.43–0.85) and OR = 0.85 (95% CI = 0.67–1.09) and for GUD 1.37 (95% CI = 1.00–1.85) and (95% CI = 1.50–2.10) for general and high-risk populations, respectively.

No significant differences were seen for chlamydia, gonorrhea, syphilis, HSV, genital warts, HPV, or any STI (whether or not the study by Langeni [85] was included).

3.5.2. Studies in Africa

Meta-regression methods found that having a study carried out in Africa as opposed to elsewhere was notable ( ) for studies assessing the prevalence of GDS ( and ), chlamydia ( and ), GUD ( and ), and any type of HPV ( and ). For GDS, the summary odds ratio in Africa was 0.85 (95% CI = 0.70–0.97) and 1.19 (95% CI = 0.80–1.78) outside Africa. For chlamydia, the summary odds ratio in Africa was 0.63 (95% CI = 0.35–1.12), while it was 1.0098 (95% CI = 0.85–1.21) outside of Africa. For GUD, the summary odds ratio inside Africa was 1.45 (95% CI = 1.24–1.70), while it was 1.95 (95% CI = 1.74–2.18) outside of Africa. For any HPV type, the summary odds ratios were 2.13 (95% CI = 1.05–4.29) and 1.18 (95% CI = 0.9919–1.41) inside and outside of Africa, respectively. The studies of HSV showed a trend toward a greater association between HSV and intact men inside of Africa ( and ), with African studies having a summary odds ratio of 1.35 (95% CI = 1.04–1.75) and non-African studies a summary odds ratio of 1.06 (95% CI = 0.87–1.29).

No significant difference was seen with NSU, gonorrhea, syphilis, genital warts, high-risk HPV, or any STI.

3.5.3. Circumcision Prevalence

A statistically significant impact of circumcision prevalence on the natural logarithm of the odds ratio of the association between circumcision status and prevalence of disease was found for GDS ( and ), gonorrhea ( and ), GUD ( and ), syphilis ( and ), and genital warts ( and ). The impact of circumcision prevalence on disease risk is shown in Figures 1–5. The odds ratios increased with circumcision prevalence for all diseases, except for the opposite association with genital warts. Circumcision prevalence was not a statistically significant factor for the other diseases.

3.5.4. Combinations of Factors

For GUD, population type, a study being performed in Africa, and circumcision prevalence were all statistically significant factors. When multiple factors are added to the regression model, only a study being performed in Africa was statistically significant. A model with a general population performed in Africa found a random effects summary odds ratio of 1.33 (95% CI = 1.02–1.71).

3.5.5. Studies of HPV

With the studies of any type of HPV, sampling only the glans trended toward being a factor ( and ). Glans-only studies had a summary odds ratio of 1.82 (95% CI = 1.05–3.14), while studies with complete sampling had a summary odds ratio of 1.17 (95% CI = 0.98–1.40). Patient report of circumcision status was a statistically significant factor ( and ), with studies relying on physical examination to determine circumcision status having a summary odds ratio of 1.14 (95% CI = 0.97–1.35) and studies relying on patient report having a summary odds ratio of 2.11 (95% CI = 1.24–3.59). When both factors are included in a multivariate model (sampling and ; physical examination and ), the summary odds ratio for complete sampling of the penis combined with circumcision status determined by physical examination is 1.08 (95% CI = 0.93–1.24), and for sampling only the glans combined with determining circumcision status by patient report it is 3.21 (95% CI = 1.62–6.36).

With high-risk HPV studies, sampling only the glans trended toward being a factor ( and ). Studies that sampled only the glans had a summary odds ratio of 1.86 (95% CI = 0.9964–3.46), while studies with complete sampling had a summary odds ratio of 1.10 (95% CI = 0.88–1.37). Patient report of circumcision status was statistically significant ( and ), with physical examination studies having a summary odds ratio of 1.08 (95% CI = 0.88–1.32) and patient report studies having a summary odds ratio of 2.16 (95% CI = 1.18–3.99). When both factors are included in the model (sampling and ; physical examination and ), the summary odds ratio for complete sampling combined with physical examination determination of circumcision status is 1.01 (95% CI = 0.84–1.22), while the summary odds ratio for sampling only the glans combined with depending on patient report to determine circumcision status is 3.45 (95% CI = 1.60–7.42).

3.6. Publication Bias

A funnel graph, which plots the precision (the inverse of variance) on the y-axis and the natural logarithm of the odds ratio on the x-axis, should have a shape like an inverted funnel with the largest study representing the apex of the inverted funnel. If there is a paucity of studies in the left lower portion of the inverted funnel and a cluster of studies in the right lower portion, that would be suggestive of publication bias. Funnel graphs for the various STIs are shown in Figures 6–16. Paucity in the left lower portion is seen in the funnel graphs for NSU (Figure 7), GUD (Figure 10), syphilis (Figure 11), genital warts (Figure 13), and HPV (Figures 14 and 15). Large studies that appear to be outliers (odds ratios greater than expected) are noted in funnel graphs for GDS (Figure 6) [87], HSV (Figure 12) [88], and any STI (Figure 16) [85]. The funnel graph for chlamydia shows an outlier in the left lower portion (Figure 8) [89].

Methods to determine the presence of publication bias use a P value threshold of 0.10 for significance. Results of the evaluation for publication bias for each STI are shown in Table 17. Of the six measures of publication bias, none were positive for GUD, syphilis, and genital warts; one was positive for chlamydia, gonorrhea, HSV, and any STI with the study by Langeni [85] excluded; three were positive for NSU and HPV; and four were positive for GDS and any STI with Langeni [85] included.

3.6.1. Trim and Fill

Using the “trim and fill” technique, no adjustments were needed for studies of the prevalence of GDS, NSU, gonorrhea, syphilis, HSV, HPV (in which there was complete sampling and circumcision status that was determined by physical examination), and any STI.

For genital infections with chlamydia, the “trim and fill” technique indicated two unpublished studies. By adding these two studies, the summary odds ratio, adjusted for publication bias, is 0.88 (95% CI = 0.69–1.11). The addition of one study was indicated for GUD, which yields a summary odds ratio, adjusted for publication bias, of 1.64 (95% CI = 1.34–2.01). For genital warts, the technique indicated two unpublished studies, whose addition would yield a summary odds ratio, adjusted for publication bias, of 0.76 (95% CI = 0.60–0.97). For both analyses of the prevalence of HPV infections (any type and high-risk types), two unpublished studies would be expected. The summary odds ratio, adjusted for publication bias, would be 1.19 (95% CI = 0.97–1.46) for any HPV type and 1.11 (95% CI = 0.88–1.39) for high-risk HPV types.

4. Discussion

4.1. Genital Ulcerative Disease versus Genital Discharge Syndrome

The comparisons of men diagnosed with GUD and GDS are consistent with findings that intact men are more prone to GUD and circumcised men are more prone to GDS. Consequently, there is no surprise here.

4.2. Genital Discharge Syndrome

The prevalence of GDS shows a moderate trend toward being less common in intact men (OR = 0.89 and 95% CI = 0.73–1.09). The finding in general populations is statistically significant (OR = 0.77 and 95% CI = 0.59–0.99). The only study of incidence found no significant difference [82]. Circumcision prevalence in the population studied had a significant association with the odds ratio measured for prevalence of GDS (Figure 1). The funnel graph (Figure 6) indicates that the study by Warner et al. [87] is an outlier. This is confirmed when this study is excluded from the analysis: the summary odds ratio drops to 0.85 and the finding approaches statistical significance (95% CI = 0.70–1.03) (Table 15). This study may also explain why four of the measures of publication bias were positive. While this diagnosis is based on clinical findings, the lack of an association between intact status and GDS is consistent with what is seen with NSU.

4.3. Nonspecific (Nongonococcal) Urethritis

The prevalence of NSU is significantly lower in intact males (OR = 0.76 and 95% CI = 0.63–0.92). Between-study heterogeneity is a concern, as five of the twelve studies contributed significantly to the between-study heterogeneity, but exclusion of any of these studies did not change the significance of this finding (Table 15). Three publication bias measures were positive, which is consistent with the paucity of studies in the lower left portion of the funnel graph (Figure 7). The “trim and fill” method, however, found that no studies were needed to adjust for publication bias.

Other than the problems with between-study heterogeneity, these analyses indicate a fairly robust, significant association between intact status and a lower prevalence of NSU.

4.4. Chlamydia

There was no significant difference in the prevalence of genital chlamydia infections but a trend toward a lower prevalence in intact men. None of the studies of incidence found a significant difference (whether adjusted for lead-time bias or not). When studies of incidence are adjusted for lead-time bias and combined, there is no significant association.

Only two outliers were identified (Table 15). When they are excluded from the analysis, the summary odds ratio is 0.93 (95% CI = 0.87–1.00) and the between-study heterogeneity resolves (chi-square = 7.75 (df = 11) and ). Meta-regression showed a trend toward a lower association between the prevalence of chlamydia and intact men in African studies ( ). In African studies, the summary odds ratio was 0.63 (95% CI = 0.35–1.12).

The funnel graph indicates a clear outlier (Figure 8) [89]. Two of the measures of publication bias were positive, and the “trim and fill” method added two studies to the left lower portion of the graph, giving a summary odds ratio, adjusted for publication bias, of 0.87 (95% CI = 0.69–1.11).

The analysis indicates a trend toward a lower prevalence of chlamydia in intact men, especially in Africa and in the general population. No difference was seen in the incidence studies.

4.5. Gonorrhea

No significant association between the incidence or the prevalence of gonorrhea and circumcision status of males was found. This was seen in both high-risk and general populations. There was significant between-study heterogeneity, and five potential outliers were identified. The prevalence of circumcision in the population studied was significantly associated with the odds ratio reported in the study ( ) (Figure 2). At the extremes of circumcision prevalence, the summary odds ratio would be estimated at 0.68 (95% CI = 0.49–0.96) in a population with a 0% circumcision rate and at 1.72 (95% CI = 1.16–2.55) in a population with a 100% circumcision rate.
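
The two extreme estimates can be read as the endpoints of a meta-regression line on the log odds ratio scale. The sketch below assumes a simple log-linear model in circumcision prevalence with no other covariates, an assumption consistent with the description of the meta-regression but not a reproduction of the fitted model, and interpolates between the reported endpoints. The function names are illustrative.

```python
import math

OR_AT_0_PERCENT = 0.68    # reported estimate at 0% circumcision prevalence
OR_AT_100_PERCENT = 1.72  # reported estimate at 100% circumcision prevalence

def odds_ratio_at(prevalence):
    """Interpolated summary odds ratio at a circumcision prevalence in [0, 1].

    ASSUMPTION: ln(OR) is linear in prevalence between the two reported endpoints.
    """
    intercept = math.log(OR_AT_0_PERCENT)
    slope = math.log(OR_AT_100_PERCENT) - intercept
    return math.exp(intercept + slope * prevalence)

def crossover_prevalence():
    """Prevalence at which the interpolated odds ratio equals 1."""
    intercept = math.log(OR_AT_0_PERCENT)
    slope = math.log(OR_AT_100_PERCENT) - intercept
    return -intercept / slope

midpoint_or = odds_ratio_at(0.5)
crossover = crossover_prevalence()
```

Under this assumption the direction of the association reverses at a circumcision prevalence of a little over 40%, which illustrates how strongly the reported odds ratio depends on the community studied.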

Only one measure of publication bias was positive, and the funnel graph (Figure 9) looks symmetric. No studies were added using the “trim and fill” approach.

The data indicate that the incidence and the prevalence of gonorrhea are not affected by circumcision status as much as by the prevalence of circumcision within the community studied.

4.6. Genital Ulcerative Disease

Incidence and prevalence of GUD were consistently positively associated with intact men, even when subjected to sensitivity analysis and meta-regression. Between-study heterogeneity was significant even after adjusting for four “outlying” studies. Meta-regression found significant associations for population type, whether studies were performed in Africa and circumcision prevalence in the populations studied (Figure 3). When combined in a multivariate analysis, only a study being performed in Africa was a significant factor.

In the funnel graph, there is a study in the right lower portion that is not balanced in the left lower portion (Figure 10). None of the publication bias measures were positive, yet the “trim and fill” process added one study, yielding a summary odds ratio, adjusted for publication bias, of 1.63 (95% CI = 1.34–2.01).

GUD, which is more commonly seen in developing countries, has a propensity for mucosal surfaces. Most of the studies of HSV have looked at seroconversion rates for herpes simplex virus type 2. This will not capture recurrences. Since GUD is a clinical measure that includes HSV recurrences and ulcers for which no causative agent can be identified, one would expect a higher rate in intact men because more than half of the mucosal surface of the penis is removed with circumcision. Herpes simplex viruses, including type 1 and type 2, also have a propensity for junctional tissues. This is why cold sores recur in the corner of the mouth and on the facial lips. If one were to amputate facial lips, one would see a lower recurrence rate of herpes simplex virus type 1. To follow this analogy, circumcision removes all of the junctional tissue of the prepuce [90, 91], so this may impact HSV recurrences. While this is a consistent finding, it is difficult to know what the public health impact is in regions where the prevalence of GUD is low.

4.7. Syphilis

The data on syphilis present quite a farrago. On the one hand, there is a positive association between the prevalence of syphilis and intact genitalia, but, on the other hand, the incidence of syphilis, even before adjusting for lead-time bias, indicates a negative, albeit nonsignificant, association. The positive association is seen primarily in populations at high risk for acquiring STIs, while the prevalence studies in general populations found no statistically significant difference (depending on the calculation method used; general variance-based method: OR = 1.23 and 95% CI = 1.0064–1.49; meta-regression method: OR = 1.25 and 95% CI = 0.96–1.60). Seven prevalence studies had statistically significant contributions to the between-study heterogeneity. The between-study heterogeneity improves when only studies of general populations are considered but does not resolve completely. The prevalence of syphilis by circumcision status is also significantly associated with the prevalence of circumcision in the population studied (Figure 4).
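
For readers unfamiliar with the general variance-based method, it is commonly understood as inverse-variance weighting of log odds ratios, with each study's variance recovered from its reported confidence interval. The sketch below illustrates that idea under a fixed-effect assumption; the three studies are hypothetical and the function names are illustrative only.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def se_from_ci(lower, upper):
    """Standard error of a log odds ratio recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)

def pool_fixed_effect(studies):
    """Pool (odds_ratio, ci_lower, ci_upper) tuples by inverse-variance weighting.

    Returns the summary odds ratio with its 95% confidence limits.
    """
    weights, weighted_logs = [], []
    for odds_ratio, lower, upper in studies:
        w = 1.0 / se_from_ci(lower, upper) ** 2
        weights.append(w)
        weighted_logs.append(w * math.log(odds_ratio))
    pooled_log = sum(weighted_logs) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return (math.exp(pooled_log),
            math.exp(pooled_log - Z95 * pooled_se),
            math.exp(pooled_log + Z95 * pooled_se))

# Hypothetical prevalence studies (not data from this review).
studies = [(1.40, 0.90, 2.18), (1.10, 0.80, 1.51), (1.30, 0.95, 1.78)]
summary, low, high = pool_fixed_effect(studies)
print(f"OR = {summary:.2f}, 95% CI = {low:.2f}-{high:.2f}")
```

Because the pooled interval narrows as studies are added, a lower limit can land very close to 1, which is why a result can be nominally significant under one calculation method and not under another.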

The funnel graph clearly looks asymmetric (Figure 11), but neither the measures of publication bias nor the “trim and fill” method identified this.

With the mixed results between incidence and prevalence, the lack of a significant association in general populations, the number of studies that could be considered outliers, the significant association with circumcision prevalence in the population studied, and the asymmetry of the funnel graph, one cannot accurately conclude that the risk of syphilis is significantly associated with circumcision status.

4.8. Genital Herpes/Herpes Simplex Virus Type 2

While there was a trend for the prevalence of HSV to be greater in intact men, the association was not statistically significant. When adjusted for lead-time bias, none of the studies that looked at the incidence of herpes simplex virus type 2 found a statistically significant association. When the studies are combined, there is no statistically significant association but a slight trend toward higher risk for intact men.

There was significant between-study heterogeneity for the prevalence studies. Six outliers were identified. Exclusion of these studies individually and the two largest contributors did not bring the between-study heterogeneity within an acceptable range and did not yield a summary effect that was statistically significant. In both high-risk and general populations, the summary effect was not statistically significant, and between-study heterogeneity remained significant. Using meta-regression, there was a trend ( ) that odds ratios were higher in African studies.

The funnel graph indicates some asymmetry with a cluster of studies in the lower right portion that is not balanced on the left side (Figure 12). Two of the measures of publication bias were positive, but no adjustments were indicated using the “trim and fill” method.

While there is a trend toward higher incidence and prevalence of HSV in intact men, the finding is persistently not statistically significant despite a number of adjustments. The high level of between-study heterogeneity, which could not be shed despite several attempts, presents a problem in making any recommendation regarding circumcision’s impact on HSV.

An earlier meta-analysis of HSV prevalence and circumcision had failed to include two of the populations included in this analysis [15, 65]. This is strange considering that the same person was the lead author of both studies.

As an aside, there have been a number of systemic and fatal herpetic infections reported following ritual circumcision in which the person performing the circumcision puts his mouth around the penis after the foreskin has been amputated [92–95]. Instead of banning the practice, the New York City Health Department has asked parents to sign off on this practice. Orthodox Jews in New York City are currently fighting this ruling.

4.9. Chancroid

The paucity of studies, the reliance on clinical identification in all but one of these studies, and the high degree of between-study heterogeneity make it difficult to comment on the impact of circumcision on this illness, yet the lack of good evidence did not keep the 2012 AAP Task Force from including a discussion of circumcision’s impact on the prevalence of chancroid [12], which is relatively uncommon in developing nations and extremely rare in developed nations. The degree of between-study heterogeneity is significant and can be almost completely attributed to one study [69]. Exclusion of this study brought the between-study heterogeneity within an acceptable range ( ). When other outliers were excluded from analysis along with the study by Hart [69], the further reduction in the between-study heterogeneity chi-square, compared to excluding only Hart’s study, was not statistically significant.

The data do not support the claim by Weiss et al. that “circumcised men are at lower risk of chancroid” [15]. There have been no new publications on the impact of circumcision on the prevalence of chancroid since 2006. The difference between the analyses is that Weiss et al. included several studies in their meta-analysis that were not strictly studies of chancroid. As I have noted previously [96], three of the studies included in their analysis of chancroid did not meet basic inclusion criteria because they lacked a direct comparison between intact and circumcised men for a specific diagnosis of chancroid [59, 97, 98]. In two of the studies, men with genital ulcers were presumed to have chancroid but never tested for it [97, 98], while the third study tested the men presumed to have chancroid and found that 31.4% had herpes simplex virus type 2 and only 22.9% had a positive culture for Haemophilus ducreyi, the causative agent of chancroid [59]. When these studies are appropriately assigned to an analysis of the prevalence of GUD and excluded from an analysis of the prevalence of chancroid, any imagined association between circumcision status and prevalence of chancroid evaporates.

4.10. Genital Warts

The prevalence of genital warts has a strong trend towards being lower in intact males. In general populations, the association is statistically significant (OR = 0.78 and 95% CI = 0.63–0.96) and did not have evidence of between-study heterogeneity (chi-square = 8.61 (df = 6) and ). Three studies were identified as potential outliers; removal of the two studies with the greatest impact on between-study heterogeneity brought the between-study heterogeneity near the acceptable range ( ). Using meta-regression, circumcision prevalence in the population studied was negatively associated with the reported odds ratios ( ) (Figure 5).
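
The heterogeneity statistic quoted above can be converted to a p value with the chi-square reference distribution. The sketch below uses the closed form that holds for even degrees of freedom, so it needs no external library; it is a recomputation from the reported chi-square of 8.61 on 6 degrees of freedom, not a value taken from the original analysis.

```python
import math

def chi_square_sf_even_df(x, df):
    """Upper-tail probability of a chi-square statistic for even df.

    For df = 2k the survival function is
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    k = df // 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))

p_value = chi_square_sf_even_df(8.61, 6)
print(f"p = {p_value:.3f}")
```

Under these assumptions the result is a p value of roughly 0.2, which is consistent with the statement that the general-population studies showed no evidence of between-study heterogeneity.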

The funnel graph indicates some paucity of studies in the left lower region (Figure 13). None of the measures of publication bias were positive, yet the “trim and fill” calculations indicated that there were two studies missing in the left lower portion of the funnel graph. Adjusting for publication bias, the summary odds ratio was 0.76 (95% CI = 0.60–0.97).

The evidence in favor of a lower prevalence of genital warts in intact males is supported by the finding in studies of general populations, which were surprisingly free of between-study heterogeneity and the summary result after adjusting for publication bias. The odds ratios in studies were, however, impacted by the prevalence of circumcision in the population studied.

4.11. Human Papillomavirus

A systematic review of the incidence and prevalence of genital HPV infections as they relate to circumcision status in males is fraught with a variety of pitfalls. This may explain why several systematic reviews with meta-analysis have been published with inconsistent results [16, 18, 20]. HPV has many subtypes, some of which have been demonstrated to be oncogenic, while others are benign and self-limited infections. The oncogenic types have been strongly linked to cervical cancer in women and may be responsible for about half of the cases of penile cancer in men. Some studies reported their results for HPV infections without specifying the types of HPV identified, some reported only infections with oncogenic HPV, and some studies reported results on all HPV infections and also infections with oncogenic HPV. Consequently, two analyses were run (any HPV and high-risk HPV). Since oncogenic HPV is more concerning clinically, the second analysis may be the more relevant of the two. In the analysis that focused on high-risk HPV, there was no significant difference in the prevalence by circumcision status.

Previous analyses have found that sampling bias and patient report of circumcision status significantly affect the odds ratio reported in a study [16, 26–28]. For this reason, a third analysis (selective HPV) was run on the studies of prevalence in the second analysis (high-risk HPV) in which studies with the potential for sampling bias and misclassification bias were excluded.

Finally, the two randomized clinical trials that reported their results on HPV infection both failed to adjust for sampling only the glans and to adjust for lead-time bias.

The incidence of HPV infections was barely statistically significantly different based on circumcision status before adjustment for sampling bias and lead-time bias (RR = 1.16 and 95% CI = 1.0097–1.34). After adjustment for these sources of bias, the relative risk is 0.96 (95% CI = 0.85–1.09).
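
To see how close to the significance threshold these two results sit, a reported ratio and its 95% confidence interval can be converted to an approximate z statistic and two-sided p value. The sketch below assumes the interval is symmetric on the log scale and uses a normal approximation; the function name is illustrative.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def p_from_ratio_ci(estimate, lower, upper):
    """Approximate z and two-sided p value from a ratio and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z95)
    z = math.log(estimate) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p

z_unadj, p_unadj = p_from_ratio_ci(1.16, 1.0097, 1.34)  # before adjustment
z_adj, p_adj = p_from_ratio_ci(0.96, 0.85, 1.09)        # after adjustment
print(f"unadjusted: z = {z_unadj:.2f}, p = {p_unadj:.3f}")
print(f"adjusted:   z = {z_adj:.2f}, p = {p_adj:.3f}")
```

With these inputs the unadjusted p value comes out at roughly 0.04 and the adjusted one at roughly 0.5, which matches the description of a barely significant result that disappears after adjustment.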

Prevalence of HPV in the first analysis (any HPV) was higher in intact men (OR = 1.24 and 95% CI = 1.02–1.50), but the statistical significance of this finding is tenuous. When sensitivity analysis compares studies of high-risk populations with studies of general populations, the result is statistically significant in neither group. When two of the identified “outliers” are individually excluded from the analysis, the results are not statistically significant.

When meta-regression is used to adjust for sampling bias and misclassification bias, the summary odds ratio is 1.08 (95% CI = 0.93–1.24).

The funnel graph for the first analysis of HPV (any HPV) shows a clear paucity of studies in the left lower portion (Figure 14). Not surprisingly, three of the measures of publication bias were positive, and the “trim and fill” method added two studies. The summary odds ratio adjusting for publication bias was 1.19 (95% CI = 0.97–1.46).

Prevalence of HPV in the second analysis (high-risk HPV) was not significantly different on the basis of circumcision status (OR = 1.17 and 95% CI = 0.94–1.45). Significant difference was found in neither high-risk populations nor general populations.

Five outliers were identified. Excluding them individually from the analysis or excluding the two studies that contributed the most to between-study heterogeneity did not result in providing evidence of statistically significant difference. Excluding the two studies did bring between-study heterogeneity to within an acceptable range ( ). The summary odds ratio with these studies excluded was 1.16 (95% CI = 0.95–1.41).

Using meta-regression to adjust for sampling bias and misclassification bias, the summary odds ratio was 1.01 (95% CI = 0.84–1.22).

The funnel graph for the second analysis also shows a paucity in the left lower portion (Figure 15). Three measures of publication bias were positive, and “trim and fill” methods indicated the absence of two studies. The summary odds ratio, adjusting for publication bias, was 1.10 (95% CI = 0.88–1.39).

Prevalence of HPV in the third analysis (selective HPV) was nearly identical in intact and circumcised men (OR = 1.01 and 95% CI = 0.80–1.28). Three studies were identified as outliers. Exclusion of the study with the largest contribution to the between-study heterogeneity [86] resulted in the between-study heterogeneity coming within an acceptable range ( ) and yielded a summary odds ratio of 0.96 (95% CI = 0.79–1.15). The funnel plot for the studies included in the third analysis was symmetrical, all measures of publication bias were negative, and no addition of studies was indicated by the “trim and fill” analysis.

There are several messages from the three analyses performed on the HPV prevalence studies. Sampling bias and misclassification bias have a significant differential effect on the odds ratios reported in studies where these forms of bias are suspected. There is no significant difference in the incidence or the prevalence of HPV (especially oncogenic HPV) on the basis of circumcision status. While circumcision proponents repeatedly laud circumcision as preventive for HPV infections, the data do not support this claim. When their own studies are adjusted for lead-time bias and sampling bias, their treatment effect disappears [26–28].

Several studies of HPV and circumcision status warrant additional comment because of their serious methodological flaws. One study compiled data collected from seven studies in five countries from three continents. A fatal flaw in the study was the small number of circumcised men in four of the countries and the small number of intact men in the fifth country. Of the twenty data cells that make up the two-by-two tables from the five countries, seven had five or fewer subjects. The authors used parametric statistical methods, which are notably unreliable in this situation, to report the statistics on the combined data [99]. Unfortunately, this study, which did not find a statistically significant association between circumcision status of male sexual partners and cervical cancer, has been quoted by circumcision proponents, including the authors of the study, as demonstrating that circumcision prevents cervical cancer. Given the problems with the small number of men in many of the data cells described above, it would be impossible to accurately perform the subset analyses they reported for cervical cancer.

The study published by Lajous et al. is problematic in that fourteen men were identified as circumcised on physical examination, while 95 men identified themselves as being circumcised. Although physical examination is considered the gold standard for assigning circumcision status, the study reported the association between HPV infection and self-reported circumcision rather than circumcision status determined by physical examination. Eighty-eight of the 95 men who reported themselves as circumcised were not circumcised on the basis of physical examination [100, 101]. To defend their decision, the authors stated “we chose to report the findings of self-reported circumcision. The prevalence of circumcision in Mexico is very low, and the interviewers who did the physical examination may not be accustomed to it and may have been unable to identify its presence.” [100]

This inability of researchers in Mexico to accurately identify circumcision on physical examination may call into question other studies from Mexico. For example, the study by Vaccarella of Mexican men undergoing vasectomy reported a circumcision rate of 31.7% and was identified as an outlier [86]. This circumcision rate appears to be exaggerated in a country in which circumcision is rare. The studies by Giuliano et al. also recruited a third of their participants from Mexico [102].

Perhaps most concerning are the results reported by the group of researchers from Johns Hopkins, who have, after publication of their studies, become vociferous advocates of the benefits of circumcision [5, 6]. At the beginning of their randomized clinical trial of circumcision of adult male “volunteers” in Rakai, Uganda, “two subpreputial and shaft swabs were also obtained for future testing of human papillomavirus infection.” [103] In 2011, Tobian et al. reported the results of the HPV cultures of the glans and penile shaft at the 12-month visit of participants in their randomized clinical trial [104]. So, it is not clear why, in 2009, Tobian et al. reported the results of the difference in HPV infection incidence using only samples obtained from the preputial cavity of intact men and the coronal sulcus of circumcised men [82]. Why would Tobian and the research group from Johns Hopkins collect samples from the penile shaft and glans but only report the results from the glans?

Their randomized clinical trial ended in December 2006. In 2004, Weaver et al. published a study that demonstrated the clear differential between intact and circumcised males regarding the likelihood of HPV detection based on sampling the shaft or the glans of the penis [31]. There are only two reasons for the Johns Hopkins researchers to withhold the evidence they collected: either they were not current on the medical literature as it applied to the research they were conducting and reporting, or they purposely withheld results of the swabs taken from the penile shaft. Neither of the options, incompetence or willful academic misconduct, is appealing. Basically, when Tobian et al. and Auvert et al. reported only on sampling from the glans, they guaranteed a positive finding because the location of HPV on the penis differs according to circumcision status [82, 105].

Research published in December 2008 had demonstrated that the HPV viral load varied significantly by anatomic site, with the penile shaft having the highest viral loads and being the preferred site for HPV-16 (the most prevalent oncogenic HPV type) replication [106]. There is also the question of whether the glans of the circumcised penis is too dry to allow accurate sampling [107].

The pertinent question as it relates to a systematic review of the medical literature and meta-analysis is whether studies that report only on cultures taken from the glans of the penis should be included in an analysis and adjusted for or be completely dismissed as invalid.

A few studies have indicated that the clearance of HPV takes longer from the intact penis [35, 55, 83, 84]. If this is true, it is unclear what the clinical impact would be. HPV infections on the genitals are transitory. Consequently, if the clearance of the virus takes longer, it would be more likely to be detected in intact men. If sampling is infrequent, prolonged time to viral clearance would result in an overestimate of the incidence of infection as infections of shorter duration could have come and gone and not been detected between scheduled samplings. This is an area that warrants further research.

Finally, the data from the randomized clinical trial of adult male circumcision in Kisumu, Kenya, were published in 2012. While swabs were taken from the penile shaft and the glans and the data on circumcision status were collected, the authors failed to report the overall rates of HPV infection by circumcision status [77]. If one back calculates using the rates of infections by the type of penile lesion and rates of the types of lesions by circumcision status and assumes there is no interaction between these factors, there is no statistically significant difference between HPV infection rates based on circumcision status. I wrote a letter to the editor asking that the authors provide the results of the incidence of HPV infection by circumcision status, but the editor refused to publish my letter.
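
The back calculation described above amounts to the law of total probability: the overall infection rate for each circumcision status is the lesion-specific infection rate weighted by how common each lesion type is in that group. The sketch below shows the arithmetic only. The lesion categories and every number in it are hypothetical, and it assumes, as stated above, that lesion-specific infection rates do not depend on circumcision status.

```python
def marginal_rate(rate_by_lesion, lesion_share):
    """P(HPV | status) = sum over lesion types of P(HPV | lesion) * P(lesion | status)."""
    if abs(sum(lesion_share.values()) - 1.0) > 1e-9:
        raise ValueError("lesion shares must sum to 1")
    return sum(rate_by_lesion[kind] * share for kind, share in lesion_share.items())

# Hypothetical inputs, not the values reported in the trial.
hpv_rate_by_lesion = {"none": 0.30, "flat": 0.60, "papular": 0.70}
lesion_share_intact = {"none": 0.80, "flat": 0.15, "papular": 0.05}
lesion_share_circumcised = {"none": 0.85, "flat": 0.10, "papular": 0.05}

rate_intact = marginal_rate(hpv_rate_by_lesion, lesion_share_intact)
rate_circumcised = marginal_rate(hpv_rate_by_lesion, lesion_share_circumcised)
print(f"intact: {rate_intact:.3f}, circumcised: {rate_circumcised:.3f}")
```

The two reconstructed rates can then be compared with a standard test for two proportions, using the group sizes from the published report.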

4.12. Any Sexually Transmitted Infections

This is the first systematic review of the medical literature looking at the incidence and the prevalence of any STI as opposed to not acquiring an STI based on circumcision status. This analysis indicates that the prevalence of any STI is lower in intact men. Three of the four studies of incidence are consistent with the prevalence data, while one study from New Zealand indicated a significant protective effect. Overall, the incidence data indicate a trend that intact men have a lower incidence of any STI.

When looking at the funnel graph for any STI, the study by Langeni [85] is a clear outlier (Figure 16). When the Langeni study is excluded, the summary odds ratio drops from 0.86 (95% CI = 0.74–1.01) to 0.82 (95% CI = 0.74–0.92). While the odds ratio does not change drastically, the confidence interval is tightened by the 203.41 drop in the chi-square value for between-study heterogeneity. With Langeni included, four of the six measures of publication bias were positive. Once Langeni was excluded, one of the measures of publication bias was positive. Consequently, the analyses of any sexually transmitted disease were performed with Langeni included and with Langeni excluded.
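
The effect of removing a single outlier on the heterogeneity chi-square can be illustrated with Cochran's Q, which is the usual form of that statistic; whether the analyses here used exactly this formulation is an assumption. The sketch recomputes Q with each study left out in turn, using hypothetical log odds ratios and variances in which the last study is a deliberate outlier.

```python
def cochran_q(log_ors, variances):
    """Cochran's Q: weighted squared deviations from the fixed-effect summary."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))

def leave_one_out_q(log_ors, variances):
    """Q recomputed with each study excluded in turn."""
    results = []
    for i in range(len(log_ors)):
        rest_y = log_ors[:i] + log_ors[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        results.append(cochran_q(rest_y, rest_v))
    return results

# Hypothetical studies; the fourth is a large, discordant study.
log_ors = [-0.20, -0.25, -0.15, 1.00]
variances = [0.010, 0.020, 0.015, 0.005]

q_all = cochran_q(log_ors, variances)
q_without = leave_one_out_q(log_ors, variances)
print(f"Q with all studies: {q_all:.1f}")
for i, q in enumerate(q_without):
    print(f"Q without study {i + 1}: {q:.1f}")
```

The study whose exclusion produces the largest drop in Q is the one contributing most to between-study heterogeneity, and the drop itself can be referred to a chi-square distribution with one degree of freedom.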

Langeni may also be justifiably excluded because the study reported participant self-report of either GUD or GDS, which might exclude several types of STIs and relied on self-diagnosis in Botswana.

With Langeni excluded, the prevalence of any STI is significantly lower in intact men. When only high-risk populations are considered, the trend is in the same direction, but the difference is not statistically significant. The funnel graph, with the exclusion of Langeni, is fairly symmetric. “Trim and fill” analysis found that no studies needed to be added whether or not Langeni was included.

STIs with genital discharges are more common than genital ulcers, which may explain why the prevalence of any STI is lower in intact males. The ratio of the two general types of STIs within a community may also influence the impact of circumcision on overall risk of having any STI. Differences in these ratios in different populations may also contribute to the between-study heterogeneity.

Identifying and quantitating “any STI” may be problematic as the outcome of interest varied between studies. In some studies, collected data were the recollection of any STI in one’s lifetime, while, in others, it was the recollection of any STI within the past 12 months. The range of infections tested for or queried about also varied between studies. Likewise, determinations needed to be made regarding what was an STI. Is a yeast infection a sexually transmitted infection or the result of an imbalance of normal flora? In this analysis, candidal infections as well as infections with T. vaginalis, mycoplasma, and ureaplasma were not included. How much this variation affected the summary effect is unknown.

It is clear that, despite these methodological concerns, the impact of circumcision is to increase the overall risk of contracting any STI. Because of the hodgepodge of data included in this analysis and disparate results on the incidence of infection, more studies specifically designed to answer this question are needed.

4.13. General Findings

Several consistencies in the analyses deserve comment. All of the prevalence analyses showed significant between-study heterogeneity. This reflects the variety of populations, settings, diagnostic methods, and ways of determining circumcision status. Some would argue that given this degree of between-study heterogeneity, any meta-analysis that follows is not worthy of publication. Because of the between-study heterogeneity, one cannot sufficiently emphasize a disclaimer of caveat emptor. I have erred on the side that information is good, especially when properly presented. Looking at the data from different perspectives and applying different techniques that might help identify the sources of between-study heterogeneity should guide the reader in how to interpret this information.

The summary effect for the prevalence of every disease was greater in studies of high-risk populations than in studies of general populations. This consistent finding, which was often statistically significant, has public policy implications. Calls for population-wide implementation of male circumcision on the grounds that it prevents STIs are not supported by the findings of these analyses. These analyses indicate that if male circumcision has any role (which these analyses also dispute) in reducing the incidence and prevalence of STIs, it should be implemented in easily identifiable high-risk populations. A major problem with infant circumcision is the lack of an accurate method of identifying which infants will find themselves in high-risk populations when they become sexually active. Similarly, meta-regression analysis of the studies of HIV incidence and prevalence has found that there is no significant association in general populations but only in high-risk populations [108].

In several analyses, the summary effect of the prevalence of a disease was significantly and positively associated with circumcision prevalence in the population studied. A similar finding has been identified in studies of HIV incidence and prevalence [108]. These findings are consistent with how sexual networks impact the spread of STIs [109]. Sexual partners are not found randomly but usually within one’s cultural or ethnic group. Since circumcision status has a strong association with religious, tribal, and cultural factors, men with a particular circumcision status will likely have sexual partners from within a group that has a predominance of men with the same circumcision status. The smaller the group, the more quickly the rise and the higher the peak prevalence for a particular STI [109]. Consequently, when circumcision rates are high, intact men would be more likely to be in a smaller ethnic, religious, or cultural group and thus have a higher peak prevalence of a disease. As the circumcision prevalence drops, circumcised men would find themselves in the smaller groups that would be more likely to have a higher peak prevalence of infections.

The lack of a significant association between high-risk HPV infections and circumcision status undermines the argument made by the few who believe that circumcision reduces cancer risk [8, 11, 110]. The lack of an association between HPV, HSV, and other STIs also undermines the analysis published by the same researchers at Johns Hopkins that selectively reported their HPV findings in Africa. They concluded that infant circumcision would save billions of dollars in public health expenditures, but these researchers relied almost exclusively on their own flawed data, which they failed to adjust for lead-time bias or sampling bias [10]. If circumcision increases the overall incidence and prevalence of STIs, how will it save money?

The results of these analyses also further undermine the argument that the increased risk of HIV infection in intact men is biologically plausible. The plausibility argument is based on several assumptions, all of which are purely speculative. The first is that the inner mucosa of the foreskin is thinner and more prone to abrasions. The second is that the subpreputial space is a breeding ground for sexually transmitted viruses. The third is that the Langerhans cells on the mucosal surface act like HIV-virus magnets pulling the virus into the body [111]. The preputial mucosa is not thinner [112, 113], and circumcised men have a trend toward more penile abrasions (presumably from lack of adequate lubrication) [114]. Langerhans cells are quite efficient in killing HIV, which explains the low rate of transmission through sexual contact (approximately 1 in 1000 unprotected acts of coitus); transmission also requires activated T cells [115, 116]. Langerhans cells are the first line of mucosal defense. Their presence in the mucosal portion of the prepuce may explain why the overall incidence and prevalence of STIs is lower in intact men. Finally, there is no difference in the incidence and prevalence of HSV or HPV based on circumcision status. The claim that the subpreputial space is a preferential breeding ground for these viruses is also contradicted by the research that found the highest viral replication rates and viral load of HPV on the penile skin [106]. Men with genital ulcers are at greater risk because of the disruption in epithelial integrity at the site of the ulcer and the activation of T cells by the inflammation accompanying the ulcer.

4.14. Missed Studies of Interest

There are several studies that reported results that could not be incorporated into the analyses. For example, Urassa et al. reported that they did not find a significant difference in GDS or GUD prevalence in males based on circumcision status but gave no further details [79]. In 1949, Hand reported, without providing his data, no difference in the rate of HSV in soldiers on the basis of circumcision status [25]. A study of 537 sailors examined for gonorrhea before and after shore leave in the Far East found that circumcision status did not significantly affect the susceptibility to gonorrhea but provided no specifics [78].

Because circumcision status based on country of origin is inexact, a Dutch study was excluded that found that men born in the Netherlands, where circumcision is an uncommon practice, had lower rates of STIs than men who immigrated from Turkey, where circumcision is nearly uniformly practiced (one or more STI: OR = 0.30 and 95% CI = 0.12–0.72; HSV: exact OR = 0.37 and 95% CI = 0.007–infinity; early syphilis: exact OR = 0.20 and 95% CI = 0.06–0.63; gonorrhea: OR = 0.20 and 95% CI = 0.06–0.63; chlamydia: OR = 0.42 and 95% CI = 0.14–1.37) [117]. These results also support the theory that minority groups have a higher peak prevalence of STIs.

Of historical interest, a study of the cause of deaths in New York City in 1931 found that death from syphilis and related diagnoses was lower in Jews than non-Jews (Poisson regression RR = 0.66 and 95% CI = 0.51–0.86). When only males are considered, the results are similar (Poisson regression RR = 0.66 and 95% CI = 0.49–0.88). If circumcision was a contributing factor, beyond that seen for ethnicity alone, one would expect a significant interaction between ethnicity and gender in which Jewish men would have a lower rate of syphilis than Jewish women. Such an interaction could not be demonstrated ( ) [118]. Likewise, Jewish men and women were found less likely to have syphilis in 1882–1883, but, once again, the lack of interaction between ethnicity and gender ( ) fails to support circumcision as a contributing factor [119]. The differences in the rates of lues between ethnic groups can be explained by a lack of sexual mixing between the two populations. For example, Christian prostitutes were banned from consorting with Jews [120].

4.15. Methodological Choices

This paper did not review the literature for HIV infections for two reasons. First, such a review would be lengthy and best left to another article. Second, most of the study of HIV and circumcision status has taken place in Africa. In that setting, it is estimated that 20% or more of infections are not spread through sexual contact [121–128]. Using the data from the three African randomized clinical trials in adult males that looked for an association between circumcision and incidence of HIV infection [103, 129, 130], it appears that approximately half of the infections documented in these studies were transmitted through nonsexual means [131]. None of these trials made any attempt to determine the source of HIV infection documented in the trials. Consequently, since it is not clear whether the HIV infections identified in African studies were sexually transmitted or iatrogenic infections, HIV infections were excluded from this paper.

A drawback seen in some observational studies is having a small number of patients with a specific outcome. When this occurs, the parametric assumptions that allow one to make accurate inferences may no longer be valid, resulting in inaccurate estimates for odds ratios and 95% confidence intervals. Since these inaccurate calculations of odds ratios and variance can bias summary effects and estimates of variance, including studies with small cell populations can result in inconsistent summary estimates depending on the calculation method used [16]. To minimize any bias introduced by studies with cells with small populations, the odds ratios and confidence intervals were calculated using exact methods.
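
To give a sense of what an exact method looks like for a two-by-two table with small cells, the sketch below computes a two-sided Fisher exact p value directly from the hypergeometric distribution. It illustrates the general approach only: exact confidence limits for the odds ratio require additional computation that is not shown, and the example table is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Hypothetical small study: rows are intact and circumcised men,
# columns are men with and without the infection.
p_value = fisher_exact_two_sided(2, 8, 1, 12)
print(f"exact two-sided p = {p_value:.3f}")
```

Unlike the chi-square approximation, this calculation makes no large-sample assumption, so it remains valid when cells contain five or fewer subjects.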

Some adjustments in the composition of control groups were necessary to provide consistency of methodology between studies. For example, Wilson compared seasoned soldiers to new recruits [23], while Hand’s controls were men without any exposure to STIs.

In Mallon et al., British men referred to a dermatology specialist for penile problems were compared to a control group of patients without penile problems cared for by the same dermatologists [24]. This is a classic case of referral bias. If primary care providers are less comfortable identifying and treating problems with the complete penis, these men would be overrepresented in a referral dermatology practice. More difficult to explain is the high circumcision rate in the control group: 47.8%. Of the men with penile problems, only 23.0% were circumcised. Yet, in a representative population survey of British adults from the early 1990s, 21.9% of adult males reported being circumcised, with the highest circumcision rate (32.2%) being reported in men aged from 45 to 59 years [132]. In a 2000 British survey, 15.8% of British men reported being circumcised [133]. Clearly, a control group in which 47.8% are circumcised was not representative of the general population.

Using a control group of men without any STI is problematic. First, men without a detectable STI differ in several ways from men who have an STI and introduce a “Berksonian bias” [134]. Some have the mistaken belief that contracting a different STI introduces unidirectional bias [135]. The opposite is likely the case. Excluding men with a different STI is more likely to introduce bias. For example, if, while investigating for association between the prevalence of gonorrhea and circumcision status, all men with syphilis, whether or not they have gonorrhea, are excluded, the measure of association will be biased because intact men presumably have a higher prevalence of syphilis. By excluding a disproportionate number of intact men, the odds ratio for intact men having gonorrhea, after excluding those with syphilis, will be higher than if these men had been included. Similarly, if men with genital warts, which is more common among circumcised men, are excluded, then the odds ratio for intact men having gonorrhea will decrease. In order to justify excluding these men from the analysis, these other conditions would need to be shown to be confounding factors or effect modifiers for gonorrhea. This has not been demonstrated for the diseases in these analyses.

Second, using a disease-free control group discards data collected on men who had an STI other than the infection of interest. Those who participate in medical research allow their medical information to be used and their privacy to be violated. Violating a subject’s privacy to collect data and then not using the information excludes useful information and is ethically suspect. Every participant’s information should contribute to a study, and so serious deliberation needs to be undertaken before this information is arbitrarily excluded from analysis. If the aim of a study is to consider a specific infection, the data on all patients meeting the inclusion criteria should be incorporated into the analysis. For example, in a cross-sectional study, the characteristics of men with the disease of interest would be compared to the characteristics of men without the illness, regardless of whether they happen to have a different type of infection.

Finally, including all men without the infection of interest in the control group provides a method of comparison that is consistent with the other studies included in the meta-analysis.

Many prefer to use individual patient data in meta-analyses for a variety of reasons [136]. First, not all studies adjust their results for confounding factors. In fact, most studies identified in this paper did not. Second, studies that provide adjusted odds ratios do not consistently adjust for the same factors, so adjusted results from different studies are not comparable. Third, studies that report adjusted results rarely evaluate for collinearity, which can destabilize multivariate models. Circumcision status has been noted in several studies to be a differential factor in the number of lifetime sexual partners, marriage rates, contact with prostitutes, and tobacco and alcohol consumption [137, 138]. If a study were to adjust for one of these factors, it might find that the particular factor is significant, that circumcision is significant, or that both are significant, when the truth is that circumcision is linked to the other factor and the two variables in the multivariate model are describing the same thing. Fourth, when adjusted odds ratios are calculated, the uncertainty (variance) of the estimate increases. When calculating a summary effect, the weight assigned to data from an individual study is the inverse of the variance. An adjusted odds ratio will have a larger variance and give the study less weight when determining the summary effect than the unadjusted odds ratio would. For example, in the study by Laumann et al. [89], the weight assigned to the raw data is from 3.6 to 6.6 times greater than the weight assigned to the adjusted odds ratios. Similarly, in a study by Urassa et al., going from raw data to an adjusted odds ratio increased the variance from 0.000685 to 0.0153 [79]. Consequently, a much smaller and less rigorous study that reported only raw data would have more impact on the summary effect than a large nationally representative probability sample using adjusted odds ratios. Fifth, adjusted odds ratios are open to manipulation using multivariate logistic regression. Using raw data therefore diminishes the impact of researcher bias and avoids overfitting the data with multivariate analysis.
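The effect of the larger variance on study weight can be checked with simple arithmetic. The short Python sketch below applies plain inverse-variance weighting to the two variances quoted above for the study by Urassa et al.; the function name is hypothetical, and under a random-effects model the between-study variance would be added to each variance before inverting, which narrows the ratio.

```python
def inverse_variance_weight(variance):
    """Weight of a study under plain inverse-variance weighting."""
    return 1.0 / variance

raw_variance = 0.000685       # variance from the raw data (quoted in the text)
adjusted_variance = 0.0153    # variance of the adjusted odds ratio (quoted in the text)

ratio = inverse_variance_weight(raw_variance) / inverse_variance_weight(adjusted_variance)
print(round(ratio, 1))  # 22.3
```

In other words, before any between-study variance is added, using the adjusted estimate would reduce the weight given to this study by a factor of roughly 22.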

One of the most important tasks in performing the literature review is looking for forms of bias and making adjustments to minimize the impact of differential bias. Bias happens, and it is hard to identify and control. Most forms of bias are insidious and difficult to measure. Circumcision status, which is linked to socioeconomic status, may impact healthcare-seeking behaviors. If, for example, circumcised men are more likely to visit an STD clinic for reassurance purposes, they would be more likely to be placed in a disease-free control group, thus increasing the odds ratio for intact men and the illness of interest [139].

Lead-time bias was present in all of the data coming from the randomized clinical trials of adult male circumcision in Africa. Because men randomized to immediate circumcision were not exposed to STIs for four to six weeks following their procedures, their exposure to disease was not the same as that of men who were assigned to later circumcision. While a six-week adjustment to trials scheduled to last from 21 to 24 months would not appear to be substantial, when the reduced exposure time is accounted for, several of the associations found in these trials were no longer statistically significant. If these findings were robust, adjusting for lead-time bias should not have influenced the interpretation of the results.

What is more concerning is that the potential for lead-time bias was overlooked in the planning, funding, analysis, and reporting phases of these projects. The potential for lead-time bias in any cohort study or clinical trial is taught and emphasized in the most basic classes on research design. How was this potential source of bias missed by the highly regarded researchers at Johns Hopkins, the reviewers who approved funding for these studies at the National Institutes of Health, and the editors and peer-reviewers at highly regarded medical journals such as The Lancet and The New England Journal of Medicine? To compensate for the deficiencies of these individuals, a post hoc adjustment of six weeks lead time was made. Six weeks were chosen to be on the conservative side.
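The mechanics of such an adjustment can be sketched in Python by removing the post-procedure period from the person-time at risk in the circumcised arm before computing an incidence rate ratio. This is only one plausible way to carry out the adjustment; the exact calculation used in this paper is not described in this section, and the event counts, person-years, and function names below are invented purely for illustration.

```python
WEEKS_PER_YEAR = 52.0

def incidence_rate_ratio(events_a, person_years_a, events_b, person_years_b):
    """Ratio of incidence rates (arm A relative to arm B)."""
    return (events_a / person_years_a) / (events_b / person_years_b)

def remove_lead_time(person_years, participants, lead_weeks=6.0):
    """Subtract the unexposed post-procedure period from the time at risk."""
    return person_years - participants * lead_weeks / WEEKS_PER_YEAR

# Hypothetical trial: 1000 men per arm followed for two years (not real data).
circ_events, circ_py, circ_n = 40, 2000.0, 1000
ctrl_events, ctrl_py = 50, 2000.0

naive = incidence_rate_ratio(circ_events, circ_py, ctrl_events, ctrl_py)
adjusted = incidence_rate_ratio(
    circ_events, remove_lead_time(circ_py, circ_n), ctrl_events, ctrl_py
)
print(round(naive, 3), round(adjusted, 3))
```

Because the circumcised arm contributes less time at risk once the healing period is removed, its incidence rate rises and the rate ratio moves toward 1, which is why a borderline association can lose statistical significance.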

The need to adjust for sampling bias in the studies of HPV is quite apparent. Multiple studies have found that the location of HPV on the penis is differentiated by circumcision status [29–35], and meta-regression has found that studies that sample only the glans have a significant difference in the odds ratio. The problem is that the entire treatment effect reported in studies that sampled only the glans [82, 105] can be attributed to sampling bias [26, 28]. Unfortunately, these studies are widely cited. While it could be argued that failure to sample the penile shaft is a fatal flaw, an adjustment for the number of infections missed is a straightforward solution. Doing so for the studies of disease incidence brought these studies in line with other studies that adequately sampled the genitals.

Nondifferential misclassification is a concern, as the correlation between circumcision status based on patient report and physical examination can vary widely depending on the population studied [79, 100, 140–145]. The method of determining circumcision status was a significant factor in the meta-regression of studies of the prevalence of HPV. For some study designs, ascertaining circumcision status by examination is not practical. For example, a number of studies using representative samples of the general population relied on subject report for circumcision status (Table 1).

Reliance on the patient report to document an STI introduces a potential for recall bias and may underestimate the incidence of STIs. This would only introduce bias if a differential ability to recall and report medically diagnosed sexually transmitted disease was linked to circumcision status [146]. There is no reason to believe it is.

Searching for sources of bias also occurs in a meta-analysis, particularly one involving observational studies, when looking at the impact of various factors on between-study heterogeneity. Some consider accounting for contributions to between-study heterogeneity to be an obligation of the investigator and the most important task in performing a meta-analysis [37]. It is particularly important for observational studies, which, compared to randomized clinical and controlled trials, are, on average, likely to overestimate the true odds ratio by 30% [147]. Other methods that look to reduce between-study heterogeneity include the search for and the exclusion of studies that contain appreciable outlier data [148], sensitivity analysis, and meta-regression.

Most of the between-study heterogeneity can likely be attributed to methodological limitations in the source studies and the inherent biases in study design. Many of the studies included in these analyses reported information collected at STD clinics. While these clinics provided concentrated clinical material at one location, their clientele does not reflect the characteristics and risk factors for disease seen in the general population and may introduce a selection bias that unduly influences the results generated [109]. Intact and circumcised men may not use these health facilities with equal frequency for similar indications. For example, in the United States and England, men with higher socioeconomic status are more likely to be circumcised and more likely to have an STI treated by a physician in private practice rather than at an STD clinic. Health-seeking behaviors may be different in circumcised men, who might be more likely to seek care for minor abrasions and thus be placed in the control group more frequently than their intact counterparts [134].

4.16. Shortcomings of Meta-Analysis

Meta-analysis is an inexact tool and best applied to randomized controlled trials. It has inherent weaknesses when applied to observational studies, so guidelines on how to undertake this process have been published [22]. The validity of a meta-analysis of observational studies is related to study quality. The simple inclusion criteria allowed several studies of less than optimal quality to be included; however, more exclusive criteria can be subject to researcher bias and be manipulated to obtain specific results [15]. The simple inclusion criteria may contribute to the between-study heterogeneity.

The analyses presented in this paper used a random-effects model to determine summary effects and confidence intervals. The alternative, a fixed-effects model, assumes a single true effect common to all studies, so that any variation is attributed only to sampling error. Random-effects models allow for a true random component as a source of variation in effect size between studies as well as sampling error [149]. If between-study heterogeneity is low, the random-effects model will give an estimate and confidence intervals similar to a fixed-effects model. In general, random-effects models are preferred because the assumptions for a fixed-effects model to be accurate are rarely satisfied [150].
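The difference between the two models can be made concrete with a short Python sketch. It uses the DerSimonian-Laird moment estimator for the between-study variance, which is a widely used choice but an assumption here, since this section does not name the estimator used in the paper. The log odds ratios and variances are invented for illustration, and the function names are hypothetical.

```python
from math import exp, sqrt

def pool(log_ors, variances, tau2=0.0):
    """Inverse-variance pooled estimate; tau2 = 0 gives the fixed-effects model."""
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, log_ors)) / total
    return estimate, sqrt(1.0 / total)

def dersimonian_laird_tau2(log_ors, variances):
    """Moment estimate of the between-study variance, truncated at zero."""
    w = [1.0 / v for v in variances]
    fixed, _ = pool(log_ors, variances)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))  # Cochran's Q
    df = len(log_ors) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    return max(0.0, (q - df) / c)

# Hypothetical studies (log odds ratios and their variances), not real data.
log_ors = [0.1, 0.5, -0.3, 0.8]
variances = [0.04, 0.09, 0.02, 0.16]

tau2 = dersimonian_laird_tau2(log_ors, variances)
fixed_est, fixed_se = pool(log_ors, variances)
random_est, random_se = pool(log_ors, variances, tau2)

for label, est, se in (("fixed", fixed_est, fixed_se), ("random", random_est, random_se)):
    low, high = exp(est - 1.96 * se), exp(est + 1.96 * se)
    print(f"{label}: OR = {exp(est):.2f} (95% CI {low:.2f} to {high:.2f})")
```

When the estimated between-study variance is zero, the two calls to `pool` return the same result, which is the sense in which the models agree when heterogeneity is low. When it is positive, the random-effects weights become more nearly equal across studies and the confidence interval widens.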

One limitation of this systematic review, or any systematic review, is the inability to find all sources of data using any search strategy. All search strategies have an ascertainment bias: the goal is to diminish this bias by finding as many relevant studies as feasible. So, there may be published and unpublished studies that were not included.

The measures of publication bias are a mathematical attempt to quantify the gestalt of looking at a funnel graph and determining if it looks like an inverted funnel. Each measure of publication bias has its strengths and weaknesses [40], but since there are no comparative analyses of the different methods of identifying publication bias, and the gold standard is our gestalt, all of the measures of publication bias should be used [151]. They are often less than helpful. In the analyses published here, the results between the six different measures were often inconsistent, and funnel graphs that looked asymmetric in several instances did not have positive measures of publication bias and did not generate an intervention using “trim and fill” analysis. The trim portion of the “trim and fill” method is handicapped by being based solely on rank, without consideration of study size. Consequently, adjustments for publication bias should be viewed with caution as asymmetry of the funnel plot may be due to factors other than publication bias, and, likewise, results generated to correct for the asymmetry may not reflect a correction for publication bias [151].
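To show how funnel-plot asymmetry can be reduced to a single number, the Python sketch below computes the intercept of Egger's regression, in which each study's standardized effect (effect divided by its standard error) is regressed on its precision (the reciprocal of the standard error). Egger's regression is offered only as a familiar example; whether it is one of the six measures used in this paper is not stated in this section. The data are constructed artificially and the function name is hypothetical.

```python
def egger_intercept(effects, std_errors):
    """Intercept from ordinary least squares of (effect / SE) on (1 / SE).
    An intercept near zero suggests a symmetric funnel plot."""
    x = [1.0 / se for se in std_errors]
    z = [y / se for y, se in zip(effects, std_errors)]
    n = len(x)
    x_bar, z_bar = sum(x) / n, sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    return z_bar - slope * x_bar

std_errors = [0.1, 0.2, 0.3, 0.4, 0.5]

# Artificial symmetric case: every study estimates the same log odds ratio.
symmetric = [0.2 for _ in std_errors]

# Artificial asymmetric case: smaller studies (larger SE) report larger effects.
asymmetric = [0.2 + 1.5 * se for se in std_errors]

print(egger_intercept(symmetric, std_errors))
print(egger_intercept(asymmetric, std_errors))
```

In the asymmetric case the intercept recovers the small-study trend that was built into the data. A nonzero intercept signals asymmetry only; as noted above, asymmetry can arise from factors other than publication bias.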

5. Summary

The results of these meta-analyses should be taken with caution. The studies they are based on come from a number of sources with a number of different methodologies. Some studies employed exemplary methodology, while others were published in high-profile medical journals, such as the New England Journal of Medicine and The Lancet, and contained serious and possibly fatal methodological flaws. Some forms of differential bias could be identified and adjusted for, but there are likely many forms of bias that cannot be identified. All of the analyses had significant between-study heterogeneity, which undermines the robustness of any of the findings.

Most specific STIs are not impacted significantly by circumcision status. These include chlamydia, gonorrhea, HSV, and HPV. Syphilis showed mixed results, with prevalence studies suggesting intact men were at greater risk and incidence studies suggesting the opposite. Intact men appear to be at greater risk for GUD while at lower risk for GDS, NSU, genital warts, and the overall risk of any STI. It is also clear that any positive impact of circumcision on STIs is not seen in general populations. Consequently, the prevention of STIs cannot be rationally interpreted as a benefit of circumcision, and a policy of circumcision for the general population to prevent STIs is not supported by the evidence currently available in the medical literature.