Vaginal Microbiome Metagenome Inference Accuracy: Differential Measurement Error according to Community Composition

From BugSigDB
Needs review
study design
PMID PubMed identifier for scientific articles.
DOI Digital object identifier for electronic documents.
Carter KA, Fodor AA, Balkus JE, Zhang A, Serrano MG, Buck GA, Engel SM, Wu MC, Sun S
Lactobacillus crispatus, Lactobacillus iners, measurement error, metagenome inference, vaginal microbiome
Several studies have compared metagenome inference performance in different human body sites; however, none specifically reported on the vaginal microbiome. Findings from other body sites cannot easily be generalized to the vaginal microbiome due to unique features of vaginal microbial ecology, and investigators seeking to use metagenome inference in vaginal microbiome research are "flying blind" with respect to potential bias these methods may introduce into analyses. We compared the performance of PICRUSt2 and Tax4Fun2 using paired 16S rRNA gene amplicon sequencing and whole-metagenome sequencing data from vaginal samples from 72 pregnant individuals enrolled in the Pregnancy, Infection, and Nutrition (PIN) cohort. Participants were selected from those with known birth outcomes and adequate 16S rRNA gene amplicon sequencing data in a case-control design. Cases experienced early preterm birth (<32 weeks of gestation), and controls experienced term birth (37 to 41 weeks of gestation). PICRUSt2 and Tax4Fun2 performed modestly overall (median Spearman correlation coefficients between observed and predicted KEGG ortholog [KO] relative abundances of 0.20 and 0.22, respectively). Both methods performed best among Lactobacillus crispatus-dominated vaginal microbiotas (median Spearman correlation coefficients of 0.24 and 0.25, respectively) and worst among Lactobacillus iners-dominated microbiotas (median Spearman correlation coefficients of 0.06 and 0.11, respectively). The same pattern was observed when evaluating correlations between univariable hypothesis test P values generated with observed and predicted metagenome data. Differential metagenome inference performance across vaginal microbiota community types can be considered differential measurement error, which often causes differential misclassification. As such, metagenome inference will introduce hard-to-predict bias (toward or away from the null) in vaginal microbiome research. 
IMPORTANCE Compared to taxonomic composition, the functional potential within a bacterial community is more relevant to establishing mechanistic understandings and causal relationships between the microbiome and health outcomes. Metagenome inference attempts to bridge the gap between 16S rRNA gene amplicon sequencing and whole-metagenome sequencing by predicting a microbiome's gene content based on its taxonomic composition and annotated genome sequences of its members. Metagenome inference methods have been evaluated primarily among gut samples, where they appear to perform fairly well. Here, we show that metagenome inference performance is markedly worse for the vaginal microbiome and that performance varies across common vaginal microbiome community types. Because these community types are associated with sexual and reproductive outcomes, differential metagenome inference performance will bias vaginal microbiome studies, obscuring relationships of interest. Results from such studies should be interpreted with substantial caution and the understanding that they may over- or underestimate associations with metagenome content.
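
The headline metric in the abstract, a median Spearman correlation between observed and predicted KO relative abundances, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes the correlation is computed per KO across samples and the median is then taken across KOs. The function names and the toy abundance values are invented for the example.

```python
import math
from statistics import mean, median


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (var_x * var_y) ** 0.5


def median_ko_correlation(observed, predicted, sample_idx=None):
    """Median per-KO Spearman correlation (assumed design, see lead-in).

    observed / predicted: dict of KO id -> relative abundances, one value per
    sample, in the same sample order. sample_idx optionally restricts the
    calculation to a subset of samples, e.g. one community type.
    """
    rhos = []
    for ko in sorted(observed.keys() & predicted.keys()):
        obs, pred = observed[ko], predicted[ko]
        if sample_idx is not None:
            obs = [obs[i] for i in sample_idx]
            pred = [pred[i] for i in sample_idx]
        rho = spearman(obs, pred)
        if not math.isnan(rho):
            rhos.append(rho)
    return median(rhos) if rhos else float("nan")


# Toy data (invented): three KOs measured in four samples.
observed = {
    "K00001": [0.1, 0.2, 0.3, 0.4],
    "K00002": [0.4, 0.3, 0.2, 0.1],
    "K00003": [0.2, 0.1, 0.4, 0.3],
}
predicted = {ko: [0.1, 0.2, 0.3, 0.4] for ko in observed}

overall = median_ko_correlation(observed, predicted)
```

Passing `sample_idx` with the indices of samples from a single community type (for example, those dominated by Lactobacillus iners) gives the stratified medians that the abstract compares.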

Experiment 1

Needs review

Curated date: 2023/11/04

Curator: Chinelsy

Revision editor(s): Chinelsy


Location of subjects
United States of America
Host species Species from which microbiome was sampled.
Homo sapiens
Body site Anatomical site from which microbial samples were taken, according to the Uber-anatomy ontology (UBERON)
Vagina (synonyms: Distal oviductal region; Distal portion of oviduct; Vaginae)
Condition The experimental condition / phenotype studied according to the Experimental Factor Ontology
premature birth (synonyms: Birth, Premature; Birth, Preterm; Births, Premature; Births, Preterm; Premature Births; Preterm Birth; Preterm Births)
Group 0 name Corresponds to the control (unexposed) group for case-control studies
term birth (control)
Group 1 name Corresponds to the case (exposed) group for case-control studies
preterm birth (PTB)
Group 1 definition Diagnostic criteria applied to define the specific condition / phenotype represented in the case (exposed) group
Cases were participants who experienced early preterm birth at <32 weeks of gestation.
Group 0 sample size Number of subjects in the control (unexposed) group
Group 1 sample size Number of subjects in the case (exposed) group

Lab analysis

Sequencing type
16S variable region One or more hypervariable region(s) of the bacterial 16S rRNA gene
Sequencing platform Manufacturer and experimental platform used for quantifying microbial abundance
Data transformation Data transformation applied to microbial abundance measurements prior to differential abundance testing (if any).
relative abundances
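
As an illustration of the transformation named above, relative abundance is commonly obtained by dividing each feature's count by the sample total (total-sum scaling). The sketch below assumes that definition; the record does not say how the study computed it, and the function name and counts are invented.

```python
def to_relative_abundance(counts):
    """Divide each feature count by the sample total (total-sum scaling)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads; relative abundance is undefined")
    return {feature: n / total for feature, n in counts.items()}


# Invented counts for a single sample.
sample_counts = {"K00001": 30, "K00002": 50, "K00003": 20}
relative = to_relative_abundance(sample_counts)
```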