External comparison report for Gut microbiome of Yakuts without Viliuisk encephalomyelitis compared to other cohorts

Report summary

Comparison of user data with a collection of curated published metagenomes (without analysis of factors).

Created: 25/04/2019
Updated: 09/10/2019
Type: External comparison report
Project: Gut microbiome of Yakuts without Viliuisk encephalomyelitis compared to other cohorts
Uploaded samples: 11

External data info

Below is the description of external datasets (PMID or PMC and title of the article).

userdata306
Klimenko, N., Tyakht, A., Popenko, A., Vasiliev, A., Altukhov, I., Ischenko, D., ... & Musienko, S. (2018). Microbiome responses to an uncontrolled short-term diet intervention in the frame of the citizen science project. Nutrients, 10(5), 576.

Taxonomic composition

Heatmap of taxonomic composition

The interactive heatmap shows the relative abundance of major microbial taxa (columns) in the samples (rows) for each taxonomic level. The last row represents mean values for the external data. White colour corresponds to absent taxa. The taxonomic level of interest can be selected in the “Heatmap settings” drop-down list to the right of the heatmap. To make close values easier to compare, clicking on a cell “freezes” its value on the legend, together with the abundance of the top 10 taxa and the factor values of the corresponding sample (click again, or click the cross near the sample name, to “unfreeze”).

Major taxa

The boxplots represent distribution of relative abundance for 25 most abundant taxa across all samples (for each taxonomic rank). For proper display on log scale, zero values were replaced with a pseudocount not higher than minimum value of relative abundance of major taxa.
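The zero replacement can be illustrated with a minimal sketch. The report only states that the pseudocount is not higher than the minimum relative abundance; taking half of the smallest non-zero value is an assumption made here for illustration, and `add_pseudocount` is a hypothetical helper name.

```python
def add_pseudocount(abundances):
    """Replace zeros so that the values can be drawn on a log scale.

    abundances: list of rows (samples), each a list of relative abundances.
    Returns the adjusted table and the pseudocount that was used.
    """
    positive = [v for row in abundances for v in row if v > 0]
    if not positive:
        raise ValueError("no positive abundances to derive a pseudocount from")
    # Assumption: half of the smallest non-zero value; the report only
    # requires the pseudocount to be no higher than that minimum.
    pseudocount = min(positive) / 2
    adjusted = [[v if v > 0 else pseudocount for v in row] for row in abundances]
    return adjusted, pseudocount
```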

Taxonomic core

The plot represents the proportion of OTUs shared across the varying proportion of samples.
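One way to read this plot is as a prevalence curve: for each proportion of samples, the share of OTUs found in at least that proportion. The sketch below assumes a simple presence/absence input; `core_curve` and its arguments are hypothetical names, not part of the report's software.

```python
def core_curve(presence, thresholds):
    """Share of OTUs present in at least a given proportion of samples.

    presence: dict mapping OTU name -> list of 0/1 flags, one per sample.
    thresholds: iterable of sample proportions in the range 0..1.
    """
    n_otus = len(presence)
    curve = []
    for threshold in thresholds:
        shared = sum(
            1
            for flags in presence.values()
            if sum(flags) / len(flags) >= threshold
        )
        curve.append((threshold, shared / n_otus))
    return curve
```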


Analysis of outliers

User samples with extreme taxonomic composition are filtered automatically, based on the combined analysis of user and external data. For each sample, the median distance to the closest 50% of its neighbours is computed; the distribution of these values is approximated by a normal distribution, and samples in its upper 1% tail are flagged as outliers. List of outliers (user and external data):

V15.R1, V9.R1, pmiduserdata306_6363.7798, pmiduserdata306_3487.1543, pmiduserdata306_0018.0015
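The outlier rule described above can be sketched as follows, assuming a precomputed symmetric distance matrix. How the report's software fits the normal approximation is not stated; estimating it from the mean and standard deviation of the per-sample scores is an assumption, and `find_outliers` is a hypothetical name.

```python
from statistics import NormalDist, mean, median, pstdev


def find_outliers(dist, tail=0.01, neighbour_fraction=0.5):
    """Flag samples whose typical distance to near neighbours is extreme.

    dist: square symmetric matrix (list of lists) of pairwise distances.
    Returns (indices of flagged samples, per-sample scores).
    """
    n = len(dist)
    k = max(1, int((n - 1) * neighbour_fraction))  # size of the "closest 50%"
    scores = []
    for i in range(n):
        others = sorted(dist[i][j] for j in range(n) if j != i)
        scores.append(median(others[:k]))
    spread = pstdev(scores)
    if spread == 0:
        return [], scores  # all samples are equally typical
    # Assumption: normal approximation fitted by the moments of the scores.
    cutoff = NormalDist(mean(scores), spread).inv_cdf(1 - tail)
    return [i for i, s in enumerate(scores) if s > cutoff], scores
```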

PCoA visualization based on taxonomic composition

Distribution of the samples by their taxonomic composition in reduced dimensionality. The closer the samples (points) on the plot, the more similar their composition. Vectors show the directions in which the levels of the respective major taxa increase. Method of dimension reduction: PCoA (Principal Coordinate Analysis); dissimilarity metric: weighted UniFrac. Clicking on a dot “freezes” the detailed information about the sample on the right of the plot (click again or on the cross near sample name to “unfreeze”). Switch between the display modes with or without outliers and with or without vectors showing major microbial “drivers” using the respective controls.
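For readers unfamiliar with PCoA, the following sketch recovers the first principal coordinate from a distance matrix using double-centring and power iteration. It is an illustration only: the distance matrix is taken as given (weighted UniFrac itself requires a phylogenetic tree and is not computed here), and `pcoa_first_axis` is a hypothetical helper, not the report's implementation.

```python
import math


def pcoa_first_axis(dist, iterations=500):
    """First PCoA axis of a symmetric distance matrix (list of lists)."""
    n = len(dist)
    # Gower double-centring of -0.5 * d^2.
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in a]
    grand_mean = sum(row_mean) / n
    b = [
        [a[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]

    def matvec(vector):
        return [sum(b[i][j] * vector[j] for j in range(n)) for i in range(n)]

    # Power iteration towards the dominant eigenvector of the centred matrix.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = matvec(v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("distance matrix carries no variation")
        v = [x / norm for x in w]
    eigenvalue = sum(x * y for x, y in zip(v, matvec(v)))
    if eigenvalue <= 0:
        raise ValueError("dominant eigenvalue is not positive")
    # Coordinates are the eigenvector scaled by the square root of the eigenvalue.
    return [x * math.sqrt(eigenvalue) for x in v], eigenvalue
```

For three samples lying on a line at positions 0, 1 and 2, the returned coordinates reproduce the original spacing up to sign and shift.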

Alpha-diversity

Alpha-diversity describes the within-sample diversity of taxa in each sample. Metric: Shannon index.
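As a worked illustration, the Shannon index is H = -Σ p_i ln p_i over the taxa with non-zero proportion p_i. The report does not state the logarithm base; the natural logarithm used below is an assumption.

```python
import math


def shannon_index(counts):
    """Shannon index of one sample from taxon counts or abundances."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    proportions = [c / total for c in counts if c > 0]
    # Assumption: natural logarithm; other bases only rescale the index.
    return -sum(p * math.log(p) for p in proportions)
```

A sample with four equally abundant taxa gives ln 4, while a sample dominated by a single taxon gives 0.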

Comparison

Wilcoxon rank-sum test is applied to compare the alpha-diversity between the two groups.

Alpha-diversity differs significantly between the groups (p = 0.0).
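The rank-sum comparison can be sketched with the normal approximation, as below. This is a simplified illustration (no tie or continuity correction), not the routine used to produce the p-value above; `rank_sum_test` is a hypothetical name.

```python
import math
from statistics import NormalDist


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (U statistic of the first group, approximate p-value).
    """
    pooled = sorted((value, group) for group, values in enumerate((x, y)) for value in values)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # tied values share their mean rank
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    # Simplification: variance without the correction for ties.
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return u, 2 * (1 - NormalDist().cdf(abs(z)))
```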


Taxa co-occurrence analysis

Co-occurrence graph

Co-occurrence of microbial genera was analyzed based on correlation analysis of their relative abundance using the SPIEC-EASI software. In the graph, vertices represent genera; pairs of highly co-occurring genera are connected with blue lines. The graph shows the members of cooperatives, i.e. groups of highly co-occurring genera corresponding to isolated components (singleton vertices are omitted). Parameters of the SPIEC-EASI algorithm: Meinshausen and Bühlmann neighbourhood selection method (MB), minimum lambda ratio = 0.1, number of lambda iterations = 20, model selection using the StARS algorithm (number of StARS subsamples = 50).
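Once the co-occurrence edges are inferred, extracting cooperatives amounts to finding connected components of the graph. The sketch below assumes the edges are already available as pairs of genus names (SPIEC-EASI itself is not run here); `cooperatives` is a hypothetical helper and the genus labels in any usage are placeholders.

```python
def cooperatives(edges):
    """Group genera into connected components of the co-occurrence graph.

    edges: iterable of (genus_a, genus_b) pairs.
    Genera without any edge never appear, so singletons are omitted.
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        members = []
        while stack:  # depth-first walk over one component
            node = stack.pop()
            members.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(members))
    return components
```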

Statistical analysis

General difference of community structure between two groups

Tests whether there are significant differences in overall community composition between the samples of the two groups. Method: permutational multivariate analysis of variance (PERMANOVA); beta-diversity metric: weighted UniFrac. The result includes the total number of samples, the number of PERMANOVA permutations, the p-value for the null hypothesis that there is no difference between the groups, and information on the equality of group dispersions (obtained using the PERMDISP method with the same number of permutations). If the group variations are not equal, the results should be interpreted with caution. Outlier samples listed in the taxonomic composition section are excluded from this analysis.

parameter | value
sample size | 104
number of permutations | 20000
significance level | 0.05
group variations | equal (p = 0.45)
R-squared | 0.057
p-value | 0.0
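The PERMANOVA quantities reported above (R-squared and the permutation p-value) can be sketched from a distance matrix as follows. This is a minimal one-factor illustration with hypothetical function names, not the report's implementation; the PERMDISP check is not included.

```python
import random


def pseudo_f(dist, labels):
    """Pseudo-F statistic and R-squared for a one-factor design."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for group in groups:
        idx = [i for i, label in enumerate(labels) if label == group]
        ss_within += sum(
            dist[i][j] ** 2 for pos, i in enumerate(idx) for j in idx[pos + 1:]
        ) / len(idx)
    ss_between = ss_total - ss_within
    f = (ss_between / (len(groups) - 1)) / (ss_within / (n - len(groups)))
    return f, ss_between / ss_total


def permanova(dist, labels, permutations=999, seed=0):
    """Permutation p-value: share of label shuffles with F >= observed F."""
    rng = random.Random(seed)
    f_observed, r_squared = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled)[0] >= f_observed:
            hits += 1
    # The observed labelling counts as one permutation, hence the +1 terms.
    return f_observed, r_squared, (hits + 1) / (permutations + 1)
```

With this convention the smallest attainable p-value is 1 / (permutations + 1), so a p-value displayed as 0.0 reflects rounding.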

General difference of metabolic potential structure between two groups

Tests whether there are significant differences in overall metabolic structure between the samples of the two groups. Method: permutational multivariate analysis of variance (PERMANOVA); beta-diversity metric: Bray-Curtis distance. The result includes the total number of samples, the number of PERMANOVA permutations, the p-value for the null hypothesis that there is no difference between the groups, and information on the equality of group dispersions (obtained using the PERMDISP method with the same number of permutations). If the group variations are not equal, the results should be interpreted with caution. Outlier samples listed in the taxonomic composition section are excluded from this analysis.

parameter | value
sample size | 104
number of permutations | 20000
significance level | 0.05
group variations | equal (p = 0.682)
R-squared | 0.029
p-value | 0.017
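For reference, the Bray-Curtis distance between two abundance profiles is the sum of absolute differences divided by the sum of all abundances. A minimal sketch (the handling of two empty profiles is an assumption):

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance profiles."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    # Assumption: two empty profiles are treated as identical.
    return numerator / denominator if denominator else 0.0
```

The value is 0 for identical profiles and 1 for profiles that share no features.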

Taxonomic composition

Individual microbial taxa for which relative abundance is significantly different between user and external datasets are identified.

Wilcoxon test comparison

Method: Wilcoxon rank-sum test. The analysis includes the following steps: filtering of rare taxa (a taxon must be present in at least 10% of the samples at a level of >0.2%), followed by a Wilcoxon rank-sum test applied to each taxon to detect taxa that are differentially abundant between the user and external data. Multiple testing adjustment is performed using the Benjamini–Hochberg procedure. The contribution of each taxon to the inter-group difference is estimated using the LDA method. Outlier samples listed in the taxonomic composition section were excluded from this analysis.
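The multiple testing adjustment can be illustrated with a short sketch of the Benjamini–Hochberg procedure: each p-value is scaled by (number of tests / rank) and the results are made monotone from the largest p-value downwards. `benjamini_hochberg` is a hypothetical helper name.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```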

Differentially abundant taxa

Tables of differentially abundant taxa overrepresented in the groups

Overrepresented in group: external_data

taxon | taxa level | user_data median, % | external_data median, % | p-value | adjusted p-value | lda score
c__Clostridia | class | 69.24 | 77.556 | 0.004 | 0.025 | 4.697
o__Clostridiales | order | 69.24 | 77.646 | 0.004 | 0.026 | 4.716
f__Alcaligenaceae | family | 0.00 | 0.080 | 0.013 | 0.037 | 4.590
f__Rikenellaceae | family | 0.12 | 0.380 | 0.007 | 0.029 | 4.583
f__u(o__Clostridiales) | family | 5.48 | 12.974 | 0.000 | 0.000 | 4.546
f__[Odoribacteraceae] | family | 0.00 | 0.120 | 0.003 | 0.013 | 4.226
f__[Barnesiellaceae] | family | 0.00 | 0.140 | 0.001 | 0.004 | 3.948
f__Bacteroidaceae | family | 0.60 | 6.549 | 0.000 | 0.001 | 3.510
f__Lachnospiraceae | family | 21.38 | 27.579 | 0.008 | 0.029 | 3.501
g__u(o__Clostridiales) | genus | 5.48 | 13.011 | 0.000 | 0.000 | 4.593
g__Bacteroides | genus | 0.60 | 6.565 | 0.000 | 0.001 | 4.431
g__u(f__Lachnospiraceae) | genus | 6.74 | 11.899 | 0.010 | 0.026 | 4.376
g__Phascolarctobacterium | genus | 0.00 | 0.420 | 0.000 | 0.002 | 3.652
g__Oscillospira | genus | 0.54 | 0.864 | 0.005 | 0.020 | 3.613
g__u(f__Erysipelotrichaceae) | genus | 0.02 | 0.401 | 0.002 | 0.008 | 3.521
g__Adlercreutzia | genus | 0.00 | 0.040 | 0.000 | 0.002 | 3.425
g__u(f__Rikenellaceae) | genus | 0.12 | 0.381 | 0.007 | 0.022 | 3.410
g__u(f__[Barnesiellaceae]) | genus | 0.00 | 0.141 | 0.001 | 0.003 | 3.404
g__Lactococcus | genus | 0.00 | 0.040 | 0.006 | 0.020 | 3.380
g__Sutterella | genus | 0.00 | 0.060 | 0.014 | 0.035 | 3.378
g__Anaerostipes | genus | 0.02 | 0.100 | 0.016 | 0.035 | 3.223
s__u(o__Clostridiales) | species | 5.48 | 13.021 | 0.000 | 0.000 | 4.597
s__u(f__Lachnospiraceae) | species | 6.74 | 11.906 | 0.010 | 0.025 | 4.326
s__u(g__Bacteroides) | species | 0.20 | 4.770 | 0.000 | 0.001 | 4.297
g__Blautia s__obeum | species | 0.04 | 0.642 | 0.000 | 0.000 | 3.762
g__Coprococcus s__eutactus | species | 0.00 | 0.121 | 0.020 | 0.044 | 3.759
s__u(g__Adlercreutzia) | species | 0.00 | 0.040 | 0.000 | 0.001 | 3.688
s__u(g__Clostridium) | species | 0.26 | 0.805 | 0.000 | 0.001 | 3.659
s__u(g__Phascolarctobacterium) | species | 0.00 | 0.420 | 0.000 | 0.001 | 3.657
g__Bacteroides s__uniformis | species | 0.04 | 0.501 | 0.000 | 0.001 | 3.629
s__u(g__Oscillospira) | species | 0.54 | 0.866 | 0.005 | 0.018 | 3.507
s__u(f__Erysipelotrichaceae) | species | 0.02 | 0.401 | 0.002 | 0.007 | 3.492
s__u(f__Rikenellaceae) | species | 0.12 | 0.382 | 0.007 | 0.022 | 3.476
s__u(g__Lactococcus) | species | 0.00 | 0.040 | 0.001 | 0.006 | 3.437
s__u(f__[Barnesiellaceae]) | species | 0.00 | 0.141 | 0.001 | 0.003 | 3.420
g__[Ruminococcus] s__torques | species | 0.00 | 0.040 | 0.008 | 0.023 | 3.400
g__Bacteroides s__ovatus | species | 0.00 | 0.140 | 0.000 | 0.001 | 3.390
s__u(g__Anaerostipes) | species | 0.02 | 0.100 | 0.016 | 0.038 | 3.372
s__u(g__Sutterella) | species | 0.00 | 0.061 | 0.014 | 0.035 | 3.345

Overrepresented in group: user_data

taxon | taxa level | user_data median, % | external_data median, % | p-value | adjusted p-value | lda score
c__Erysipelotrichi | class | 7.06 | 1.500 | 0.001 | 0.008 | 4.783
c__Coriobacteriia | class | 3.58 | 1.160 | 0.010 | 0.044 | 4.402
o__Erysipelotrichales | order | 7.06 | 1.501 | 0.001 | 0.008 | 4.737
o__Coriobacteriales | order | 3.58 | 1.160 | 0.010 | 0.048 | 4.492
f__Coriobacteriaceae | family | 3.58 | 1.162 | 0.011 | 0.034 | 4.457
f__Erysipelotrichaceae | family | 7.06 | 1.502 | 0.001 | 0.004 | 4.109
f__Veillonellaceae | family | 3.56 | 1.361 | 0.015 | 0.040 | 3.630
f__Lactobacillaceae | family | 0.26 | 0.000 | 0.001 | 0.004 | 3.250
g__Catenibacterium | genus | 2.16 | 0.000 | 0.000 | 0.001 | 4.385
g__Lactobacillus | genus | 0.26 | 0.000 | 0.000 | 0.003 | 4.201
g__[Eubacterium] | genus | 1.52 | 0.253 | 0.016 | 0.035 | 3.963
g__Dialister | genus | 1.96 | 0.020 | 0.011 | 0.028 | 3.913
g__[Ruminococcus] | genus | 1.86 | 0.341 | 0.000 | 0.002 | 3.814
g__u(f__Coriobacteriaceae) | genus | 1.62 | 0.421 | 0.018 | 0.037 | 3.732
g__Collinsella | genus | 1.62 | 0.441 | 0.003 | 0.013 | 3.707
g__Desulfovibrio | genus | 0.06 | 0.000 | 0.006 | 0.020 | 3.619
g__Peptococcus | genus | 0.00 | 0.000 | 0.003 | 0.013 | 3.526
g__u(f__Clostridiaceae) | genus | 0.82 | 0.321 | 0.003 | 0.013 | 3.493
g__Roseburia | genus | 0.42 | 0.080 | 0.008 | 0.022 | 3.216
s__u(g__Catenibacterium) | species | 2.16 | 0.000 | 0.000 | 0.001 | 4.474
s__u(g__[Ruminococcus]) | species | 1.48 | 0.040 | 0.000 | 0.000 | 4.037
s__u(g__Dialister) | species | 1.96 | 0.020 | 0.011 | 0.028 | 3.935
g__Blautia s__producta | species | 1.12 | 0.020 | 0.000 | 0.000 | 3.879
s__u(f__Coriobacteriaceae) | species | 1.62 | 0.421 | 0.018 | 0.040 | 3.792
g__Collinsella s__aerofaciens | species | 1.48 | 0.441 | 0.004 | 0.015 | 3.780
s__u(g__Coprococcus) | species | 0.90 | 0.641 | 0.006 | 0.019 | 3.604
s__u(g__Peptococcus) | species | 0.00 | 0.000 | 0.003 | 0.012 | 3.575
s__u(f__Clostridiaceae) | species | 0.82 | 0.321 | 0.003 | 0.012 | 3.487
s__u(g__Roseburia) | species | 0.42 | 0.060 | 0.007 | 0.021 | 3.164

Cladogram of differences

Tree-like summary of the taxa differentially abundant between the two groups, constructed using LEfSe.

List of differentially abundant taxa

increased in user_data

denotation feature
b g__Collinsella
c g__u(f__Coriobacteriaceae)
d f__Coriobacteriaceae
e o__Coriobacteriales
f c__Coriobacteriia
n g__Lactobacillus
o f__Lactobacillaceae
q g__u(f__Clostridiaceae)
s g__Roseburia
t g__[Ruminococcus]
w g__Peptococcus
y g__Dialister
a0 f__Veillonellaceae
a5 g__Catenibacterium
a6 g__[Eubacterium]
a8 f__Erysipelotrichaceae
a9 o__Erysipelotrichales
b0 c__Erysipelotrichi
b3 g__Desulfovibrio

increased in external_data

denotation feature
a g__Adlercreutzia
g g__Bacteroides
h f__Bacteroidaceae
i g__u(f__Rikenellaceae)
j f__Rikenellaceae
k g__u(f__[Barnesiellaceae])
l f__[Barnesiellaceae]
m f__[Odoribacteraceae]
p g__Lactococcus
r g__Anaerostipes
u g__u(f__Lachnospiraceae)
v f__Lachnospiraceae
x g__Oscillospira
z g__Phascolarctobacterium
a1 g__u(o__Clostridiales)
a2 f__u(o__Clostridiales)
a3 o__Clostridiales
a4 c__Clostridia
a7 g__u(f__Erysipelotrichaceae)
b1 g__Sutterella
b2 f__Alcaligenaceae

Excluded features

phylum

k__Bacteria;p__Cyanobacteria, k__Bacteria;p__Fusobacteria, k__Bacteria;p__Synergistetes, k__Bacteria;p__TM7, k__Bacteria;p__[Thermi]

class

c__Flavobacteriia, c__4C0d-2, c__Chloroplast, c__Fusobacteriia, c__Synergistia, c__TM7-3, c__Verruco-5, c__Deinococci

order

o__Actinomycetales, o__Flavobacteriales, o__YS2, o__Stramenopiles, o__Streptophyta, o__Gemellales, o__u(c__Clostridia), o__SHA-98, o__Fusobacteriales, o__Caulobacterales, o__RF32, o__Rhizobiales, o__Sphingomonadales, o__Aeromonadales, o__Alteromonadales, o__Chromatiales, o__Oceanospirillales, o__Pasteurellales, o__Pseudomonadales, o__Xanthomonadales, o__Synergistales, o__u(c__TM7-3), o__WCHB1-41, o__Thermales

family

f__Actinomycetaceae, f__Corynebacteriaceae, f__Intrasporangiaceae, f__Microbacteriaceae, f__Micrococcaceae, f__Propionibacteriaceae, f__Streptomycetaceae, f__u(o__Bacteroidales), f__RF16, f__[Weeksellaceae], f__u(o__YS2), f__u(o__Stramenopiles), f__u(o__Streptophyta), f__u(o__Bacillales), f__Bacillaceae, f__Planococcaceae, f__Staphylococcaceae, f__[Exiguobacteraceae], f__Gemellaceae, f__Aerococcaceae, f__Carnobacteriaceae, f__Leuconostocaceae, f__u(c__Clostridia), f__Dehalobacteriaceae, f__Eubacteriaceae, f__Peptostreptococcaceae, f__[Tissierellaceae], f__u(o__SHA-98), f__Fusobacteriaceae, f__Caulobacteraceae, f__u(o__RF32), f__Beijerinckiaceae, f__Bradyrhizobiaceae, f__Brucellaceae, f__Methylobacteriaceae, f__Rhizobiaceae, f__Sphingomonadaceae, f__Comamonadaceae, f__Oxalobacteraceae, f__Succinivibrionaceae, f__Alteromonadaceae, f__Idiomarinaceae, f__[Chromatiaceae], f__Ectothiorhodospiraceae, f__Halomonadaceae, f__Pasteurellaceae, f__Moraxellaceae, f__Pseudomonadaceae, f__Xanthomonadaceae, f__Dethiosulfovibrionaceae, f__Synergistaceae, f__u(c__TM7-3), f__RFP12, f__Thermaceae

genus

g__Methanosphaera, g__u(f__Actinomycetaceae), g__Actinomyces, g__Corynebacterium, g__u(f__Intrasporangiaceae), g__u(f__Microbacteriaceae), g__Mycetocola, g__u(f__Micrococcaceae), g__Nesterenkonia, g__Rothia, g__Propionibacterium, g__u(f__Streptomycetaceae), g__Streptomyces, g__Gardnerella, g__Atopobium, g__Eggerthella, g__u(o__Bacteroidales), g__Porphyromonas, g__u(f__Prevotellaceae), g__u(f__RF16), g__Butyricimonas, g__Odoribacter, g__u(f__[Paraprevotellaceae]), g__[Prevotella], g__Chryseobacterium, g__u(o__YS2), g__u(o__Stramenopiles), g__u(o__Streptophyta), g__u(o__Bacillales), g__u(f__Bacillaceae), g__Bacillus, g__u(f__Planococcaceae), g__Staphylococcus, g__Exiguobacterium, g__u(f__Gemellaceae), g__u(f__Aerococcaceae), g__Granulicatella, g__u(f__Enterococcaceae), g__Vagococcus, g__u(f__Lactobacillaceae), g__u(f__Leuconostocaceae), g__Leuconostoc, g__u(f__Streptococcaceae), g__u(c__Clostridia), g__Christensenella, g__02d06, g__Sarcina, g__Dehalobacterium, g__Pseudoramibacter_Eubacterium, g__Butyrivibrio, g__Epulopiscium, g__Pseudobutyrivibrio, g__Shuttleworthia, g__u(f__Peptococcaceae), g__rc4-4, g__u(f__Peptostreptococcaceae), g__Peptostreptococcus, g__Anaerotruncus, g__u(f__Veillonellaceae), g__Acidaminococcus, g__Megamonas, g__Megasphaera, g__Mitsuokella, g__Veillonella, g__Mogibacterium, g__Anaerococcus, g__Parvimonas, g__Peptoniphilus, g__WAL_1855D, g__ph2, g__u(o__SHA-98), g__Bulleidia, g__Coprobacillus, g__Holdemania, g__cc_115, g__p-75-a5, g__Fusobacterium, g__u(f__Caulobacteraceae), g__Brevundimonas, g__u(o__RF32), g__u(f__Beijerinckiaceae), g__Bradyrhizobium, g__Ochrobactrum, g__Methylobacterium, g__Shinella, g__u(f__Sphingomonadaceae), g__Blastomonas, g__Kaistobacter, g__Sphingomonas, g__u(f__Comamonadaceae), g__Comamonas, g__Delftia, g__u(f__Oxalobacteraceae), g__Ralstonia, g__Bilophila, g__Succinivibrio, g__Marinobacter, g__u(f__Idiomarinaceae), g__Idiomarina, g__Rheinheimera, g__u(f__Ectothiorhodospiraceae), g__Erwinia, g__Serratia, g__Halomonas, 
g__Haemophilus, g__Acinetobacter, g__u(f__Pseudomonadaceae), g__Pseudomonas, g__Stenotrophomonas, g__Pyramidobacter, g__u(c__TM7-3), g__u(f__RFP12), g__Thermus

species

s__u(g__Methanosphaera), s__u(f__Actinomycetaceae), s__u(g__Actinomyces), s__u(g__Corynebacterium), s__u(f__Intrasporangiaceae), s__u(f__Microbacteriaceae), s__u(g__Mycetocola), s__u(f__Micrococcaceae), s__u(g__Nesterenkonia), g__Rothia s__aeria, g__Rothia s__mucilaginosa, g__Propionibacterium s__acnes, s__u(f__Streptomycetaceae), s__u(g__Streptomyces), g__Bifidobacterium s__bifidum, g__Bifidobacterium s__longum, s__u(g__Gardnerella), s__u(g__Atopobium), s__u(g__Collinsella), g__Eggerthella s__lenta, s__u(o__Bacteroidales), g__Bacteroides s__eggerthii, g__Parabacteroides s__distasonis, s__u(g__Porphyromonas), s__u(f__Prevotellaceae), s__u(f__RF16), s__u(g__Butyricimonas), s__u(g__Odoribacter), s__u(f__[Paraprevotellaceae]), s__u(g__[Prevotella]), s__u(g__Chryseobacterium), s__u(o__YS2), s__u(o__Stramenopiles), s__u(o__Streptophyta), s__u(o__Bacillales), s__u(f__Bacillaceae), s__u(g__Bacillus), s__u(f__Planococcaceae), s__u(g__Staphylococcus), g__Staphylococcus s__aureus, g__Staphylococcus s__sciuri, s__u(g__Exiguobacterium), s__u(f__Gemellaceae), s__u(f__Aerococcaceae), s__u(g__Granulicatella), s__u(f__Enterococcaceae), s__u(g__Vagococcus), s__u(f__Lactobacillaceae), s__u(g__Lactobacillus), g__Lactobacillus s__mucosae, g__Lactobacillus s__reuteri, g__Lactobacillus s__ruminis, g__Lactobacillus s__vaginalis, s__u(f__Leuconostocaceae), s__u(g__Leuconostoc), s__u(f__Streptococcaceae), g__Lactococcus s__garvieae, g__Streptococcus s__anginosus, g__Streptococcus s__minor, g__Streptococcus s__sobrinus, s__u(c__Clostridia), s__u(g__Christensenella), s__u(g__02d06), g__Clostridium s__hiranonis, g__Clostridium s__perfringens, s__u(g__Sarcina), s__u(g__Dehalobacterium), s__u(g__Pseudoramibacter_Eubacterium), s__u(g__Butyrivibrio), s__u(g__Epulopiscium), s__u(g__Pseudobutyrivibrio), g__Roseburia s__faecis, s__u(g__Shuttleworthia), s__u(f__Peptococcaceae), s__u(g__rc4-4), s__u(f__Peptostreptococcaceae), g__Peptostreptococcus s__anaerobius, s__u(g__Anaerotruncus), 
s__u(g__Faecalibacterium), g__Ruminococcus s__bromii, g__Ruminococcus s__flavefaciens, s__u(f__Veillonellaceae), s__u(g__Acidaminococcus), s__u(g__Megamonas), s__u(g__Megasphaera), s__u(g__Mitsuokella), g__Mitsuokella s__multacida, s__u(g__Veillonella), g__Veillonella s__dispar, g__Veillonella s__parvula, s__u(g__Mogibacterium), s__u(g__Anaerococcus), s__u(g__Parvimonas), s__u(g__Peptoniphilus), s__u(g__WAL_1855D), s__u(g__ph2), s__u(o__SHA-98), s__u(g__Bulleidia), g__Bulleidia s__moorei, g__Bulleidia s__p-1630-c5, s__u(g__Coprobacillus), g__Coprobacillus s__cateniformis, s__u(g__Holdemania), g__[Eubacterium] s__cylindroides, g__[Eubacterium] s__dolichum, s__u(g__cc_115), s__u(g__p-75-a5), s__u(g__Fusobacterium), s__u(f__Caulobacteraceae), g__Brevundimonas s__diminuta, s__u(o__RF32), s__u(f__Beijerinckiaceae), s__u(g__Bradyrhizobium), s__u(g__Ochrobactrum), s__u(g__Methylobacterium), s__u(g__Shinella), s__u(f__Sphingomonadaceae), s__u(g__Blastomonas), s__u(g__Kaistobacter), s__u(g__Sphingomonas), s__u(f__Comamonadaceae), s__u(g__Comamonas), s__u(g__Delftia), s__u(f__Oxalobacteraceae), s__u(g__Ralstonia), s__u(g__Bilophila), g__Desulfovibrio s__D168, s__u(g__Succinivibrio), s__u(g__Marinobacter), s__u(f__Idiomarinaceae), s__u(g__Idiomarina), s__u(g__Rheinheimera), s__u(f__Ectothiorhodospiraceae), s__u(g__Erwinia), g__Erwinia s__dispersa, s__u(g__Serratia), s__u(g__Halomonas), g__Halomonas s__nitritophilus, s__u(g__Haemophilus), g__Haemophilus s__parainfluenzae, s__u(g__Acinetobacter), g__Acinetobacter s__guillouiae, g__Acinetobacter s__lwoffii, s__u(f__Pseudomonadaceae), s__u(g__Pseudomonas), g__Pseudomonas s__stutzeri, g__Stenotrophomonas s__geniculata, g__Pyramidobacter s__piscolens, s__u(c__TM7-3), s__u(f__RFP12), s__u(g__Thermus)

Generalized linear mixed effect model

A generalized linear model is fitted for each taxon to determine whether it is differentially abundant between the user and external data. The probability distribution is selected heuristically depending on the number of samples: for >100 samples, a zero-inflated negative binomial regression is fitted; otherwise, a negative binomial model is used. Rare taxa are excluded from the analysis (a taxon must be present in at least 10% of the samples at a level of >0.2%). Multiple testing adjustment is performed using the Benjamini–Hochberg procedure. The contribution of each taxon to the inter-group difference is estimated using the LDA method. Information about the distribution family, the terms of the model and the sample size is displayed in the "Model details" section.
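The rare-taxon filter mentioned above can be sketched as a prevalence check. The thresholds come from the text (more than 0.2% abundance in at least 10% of samples); the input layout and the name `filter_rare_taxa` are assumptions made for illustration.

```python
def filter_rare_taxa(table, min_abundance=0.2, min_prevalence=0.10):
    """Keep taxa that exceed min_abundance (in %) in enough samples.

    table: dict mapping taxon name -> list of relative abundances in %,
    one value per sample.
    """
    kept = {}
    for taxon, values in table.items():
        prevalence = sum(1 for v in values if v > min_abundance) / len(values)
        if prevalence >= min_prevalence:
            kept[taxon] = values
    return kept
```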

Differentially abundant taxa

Tables of differentially abundant taxa overrepresented in the groups

Overrepresented in group: external_data

taxon | taxa level | user_data mean, % | user_data sd, % | external_data mean, % | external_data sd, % | p-value | adjusted p-value | lda score | sample size
p__Euryarchaeota | phylum | 0.027 | 0.020 | 0.265 | 0.716 | 0.000 | 0.003 | 4.046 | 104
c__Clostridia | class | 65.387 | 11.444 | 75.710 | 10.553 | 0.003 | 0.011 | 4.718 | 104
c__Methanobacteria | class | 0.027 | 0.020 | 0.264 | 0.716 | 0.000 | 0.002 | 4.251 | 104
o__Clostridiales | order | 65.378 | 11.441 | 75.747 | 10.525 | 0.003 | 0.010 | 4.774 | 104
o__Methanobacteriales | order | 0.027 | 0.020 | 0.264 | 0.716 | 0.000 | 0.002 | 4.199 | 104
f__u(o__Clostridiales) | family | 5.738 | 2.017 | 13.353 | 5.348 | 0.000 | 0.000 | 4.628 | 104
f__Lachnospiraceae | family | 20.389 | 5.092 | 27.477 | 9.895 | 0.009 | 0.037 | 4.586 | 104
f__Bacteroidaceae | family | 1.645 | 3.409 | 7.065 | 5.511 | 0.000 | 0.001 | 4.473 | 104
f__Methanobacteriaceae | family | 0.027 | 0.020 | 0.265 | 0.718 | 0.000 | 0.002 | 3.896 | 104
g__u(o__Clostridiales) | genus | 5.738 | 2.017 | 13.391 | 5.360 | 0.000 | 0.000 | 4.580 | 104
g__Bacteroides | genus | 1.645 | 3.409 | 7.083 | 5.521 | 0.000 | 0.002 | 4.460 | 104
g__u(f__Lachnospiraceae) | genus | 8.051 | 4.020 | 12.777 | 6.537 | 0.004 | 0.019 | 4.353 | 104
g__Adlercreutzia | genus | 0.002 | 0.006 | 0.135 | 0.323 | 0.000 | 0.002 | 3.850 | 104
g__Oscillospira | genus | 0.527 | 0.204 | 0.989 | 0.599 | 0.003 | 0.017 | 3.828 | 104
g__Anaerostipes | genus | 0.073 | 0.093 | 0.222 | 0.362 | 0.007 | 0.031 | 3.717 | 104
g__Lachnobacterium | genus | 0.040 | 0.028 | 0.351 | 0.802 | 0.000 | 0.004 | 3.676 | 104
g__Methanobrevibacter | genus | 0.027 | 0.020 | 0.234 | 0.647 | 0.000 | 0.002 | 3.430 | 104
s__u(o__Clostridiales) | species | 5.738 | 2.017 | 13.429 | 5.370 | 0.000 | 0.000 | 4.586 | 104
s__u(f__Lachnospiraceae) | species | 8.051 | 4.020 | 12.814 | 6.553 | 0.003 | 0.014 | 4.370 | 104
s__u(g__Bacteroides) | species | 1.295 | 3.028 | 5.392 | 4.359 | 0.000 | 0.002 | 4.316 | 104
s__u(g__Adlercreutzia) | species | 0.002 | 0.006 | 0.135 | 0.323 | 0.000 | 0.001 | 3.899 | 104
g__Blautia s__obeum | species | 0.029 | 0.027 | 1.055 | 1.153 | 0.000 | 0.000 | 3.786 | 104
s__u(g__Clostridium) | species | 0.304 | 0.156 | 1.054 | 0.930 | 0.000 | 0.000 | 3.684 | 104
g__Bacteroides s__uniformis | species | 0.129 | 0.225 | 0.905 | 1.234 | 0.000 | 0.002 | 3.655 | 104
g__[Ruminococcus] s__torques | species | 0.009 | 0.013 | 0.260 | 0.497 | 0.000 | 0.000 | 3.545 | 104
g__Bacteroides s__ovatus | species | 0.029 | 0.038 | 0.323 | 0.553 | 0.001 | 0.003 | 3.529 | 104
s__u(g__Oscillospira) | species | 0.527 | 0.204 | 0.992 | 0.602 | 0.003 | 0.012 | 3.519 | 104
s__u(g__Anaerostipes) | species | 0.073 | 0.093 | 0.223 | 0.363 | 0.007 | 0.023 | 3.498 | 104
g__Bacteroides s__caccae | species | 0.002 | 0.006 | 0.076 | 0.179 | 0.005 | 0.019 | 3.492 | 104
s__u(g__Lactococcus) | species | 0.005 | 0.012 | 0.209 | 0.570 | 0.000 | 0.000 | 3.462 | 104
s__u(g__Lachnobacterium) | species | 0.040 | 0.028 | 0.351 | 0.802 | 0.000 | 0.003 | 3.343 | 104
s__u(g__Methanobrevibacter) | species | 0.027 | 0.020 | 0.235 | 0.648 | 0.000 | 0.001 | 3.242 | 104

Overrepresented in group: user_data

taxon | taxa level | user_data mean, % | user_data sd, % | external_data mean, % | external_data sd, % | p-value | adjusted p-value | lda score | sample size
p__Actinobacteria | phylum | 7.824 | 6.401 | 3.452 | 2.906 | 0.001 | 0.005 | 4.470 | 104
c__Erysipelotrichi | class | 9.098 | 7.731 | 2.194 | 2.200 | 0.000 | 0.000 | 4.728 | 104
c__Coriobacteriia | class | 3.756 | 2.529 | 1.652 | 1.441 | 0.001 | 0.006 | 4.415 | 104
o__Erysipelotrichales | order | 9.098 | 7.731 | 2.195 | 2.200 | 0.000 | 0.000 | 4.826 | 104
o__Coriobacteriales | order | 3.756 | 2.529 | 1.653 | 1.442 | 0.001 | 0.006 | 4.424 | 104
f__Erysipelotrichaceae | family | 9.098 | 7.731 | 2.198 | 2.204 | 0.000 | 0.000 | 4.645 | 104
f__Lactobacillaceae | family | 3.013 | 5.801 | 0.066 | 0.274 | 0.000 | 0.000 | 4.545 | 104
f__Coriobacteriaceae | family | 3.756 | 2.529 | 1.654 | 1.443 | 0.001 | 0.007 | 4.283 | 104
g__Catenibacterium | genus | 5.987 | 6.993 | 0.384 | 1.111 | 0.002 | 0.015 | 4.448 | 104
g__Lactobacillus | genus | 3.011 | 5.801 | 0.065 | 0.274 | 0.000 | 0.000 | 4.317 | 104
g__[Ruminococcus] | genus | 1.727 | 0.793 | 0.700 | 0.926 | 0.010 | 0.040 | 3.950 | 104
g__u(f__Coriobacteriaceae) | genus | 1.836 | 1.435 | 0.720 | 0.858 | 0.006 | 0.026 | 3.813 | 104
s__u(g__Catenibacterium) | species | 5.987 | 6.993 | 0.386 | 1.116 | 0.002 | 0.011 | 4.470 | 104
s__u(g__[Ruminococcus]) | species | 1.402 | 0.680 | 0.174 | 0.349 | 0.000 | 0.000 | 4.084 | 104
g__Blautia s__producta | species | 1.282 | 0.810 | 0.227 | 0.450 | 0.007 | 0.022 | 3.991 | 104
s__u(f__Coriobacteriaceae) | species | 1.836 | 1.435 | 0.723 | 0.861 | 0.006 | 0.021 | 3.823 | 104
s__u(g__Coprococcus) | species | 1.240 | 0.635 | 0.753 | 0.485 | 0.009 | 0.027 | 3.709 | 104

Cladogram of differences

Tree-like summary of the taxa differentially abundant between the two groups, constructed using LEfSe.

List of differentially abundant taxa

increased in user_data

denotation feature
f g__u(f__Coriobacteriaceae)
g f__Coriobacteriaceae
h o__Coriobacteriales
i c__Coriobacteriia
l g__Lactobacillus
m f__Lactobacillaceae
p g__[Ruminococcus]
x g__Catenibacterium
y f__Erysipelotrichaceae
z o__Erysipelotrichales
a0 c__Erysipelotrichi

increased in external_data

denotation feature
a g__Methanobrevibacter
b f__Methanobacteriaceae
c o__Methanobacteriales
d c__Methanobacteria
e g__Adlercreutzia
j g__Bacteroides
k f__Bacteroidaceae
n g__Anaerostipes
o g__Lachnobacterium
q g__u(f__Lachnospiraceae)
r f__Lachnospiraceae
s g__Oscillospira
t g__u(o__Clostridiales)
u f__u(o__Clostridiales)
v o__Clostridiales
w c__Clostridia

Excluded features

phylum

p__[Thermi], p__Fusobacteria, p__Synergistetes, p__TM7, p__Cyanobacteria

class

c__Verruco-5, c__Chloroplast, c__Fusobacteriia, c__Flavobacteriia, c__4C0d-2, c__Deinococci, c__Synergistia, c__TM7-3

order

o__Oceanospirillales, o__Stramenopiles, o__Rhizobiales, o__YS2, o__Xanthomonadales, o__u(c__Clostridia), o__Synergistales, o__SHA-98, o__Gemellales, o__Fusobacteriales, o__Flavobacteriales, o__Thermales, o__Sphingomonadales, o__Aeromonadales, o__Alteromonadales, o__Pasteurellales, o__u(c__TM7-3), o__Actinomycetales, o__Pseudomonadales, o__Caulobacterales, o__RF32, o__Streptophyta, o__WCHB1-41, o__Chromatiales, o__Bacillales

family

f__Pseudomonadaceae, f__RFP12, f__Halomonadaceae, f__Oxalobacteraceae, f__Peptostreptococcaceae, f__[Chromatiaceae], f__Synergistaceae, f__u(o__Streptophyta), f__u(o__RF32), f__Dehalobacteriaceae, f__Micrococcaceae, f__RF16, f__Thermaceae, f__Planococcaceae, f__Microbacteriaceae, f__Actinomycetaceae, f__Xanthomonadaceae, f__u(c__TM7-3), f__[Tissierellaceae], f__Gemellaceae, f__Brucellaceae, f__u(o__Bacteroidales), f__Carnobacteriaceae, f__Dethiosulfovibrionaceae, f__[Exiguobacteraceae], f__Succinivibrionaceae, f__Moraxellaceae, f__u(c__Clostridia), f__[Weeksellaceae], f__Ectothiorhodospiraceae, f__Comamonadaceae, f__u(o__YS2), f__Sphingomonadaceae, f__u(o__Stramenopiles), f__Streptomycetaceae, f__Beijerinckiaceae, f__Propionibacteriaceae, f__Caulobacteraceae, f__Methylobacteriaceae, f__Intrasporangiaceae, f__u(o__SHA-98), f__u(o__Bacillales), f__Pasteurellaceae, f__Rhizobiaceae, f__Fusobacteriaceae, f__Idiomarinaceae, f__Bradyrhizobiaceae, f__Corynebacteriaceae, f__Eubacteriaceae, f__Bacillaceae, f__Aerococcaceae, f__Staphylococcaceae, f__Leuconostocaceae, f__Alteromonadaceae, f__Christensenellaceae

genus

g__u(f__Streptomycetaceae), g__u(o__RF32), g__Methylobacterium, g__[Prevotella], g__Acinetobacter, g__Holdemania, g__Blastomonas, g__u(f__Prevotellaceae), g__u(f__Caulobacteraceae), g__Bulleidia, g__Mycetocola, g__u(f__Idiomarinaceae), g__u(f__Microbacteriaceae), g__Propionibacterium, g__u(f__Comamonadaceae), g__u(f__[Paraprevotellaceae]), g__Bacillus, g__Kaistobacter, g__Ochrobactrum, g__Rheinheimera, g__Granulicatella, g__Mitsuokella, g__Veillonella, g__Anaerococcus, g__Brevundimonas, g__Sarcina, g__p-75-a5, g__Butyricimonas, g__Chryseobacterium, g__Idiomarina, g__Erwinia, g__u(f__Peptococcaceae), g__u(f__Beijerinckiaceae), g__u(f__Veillonellaceae), g__WAL_1855D, g__Exiguobacterium, g__Bilophila, g__u(f__Sphingomonadaceae), g__u(f__Peptostreptococcaceae), g__Haemophilus, g__Porphyromonas, g__Peptoniphilus, g__u(f__Bacillaceae), g__Sphingomonas, g__u(f__Streptococcaceae), g__Christensenella, g__Succinivibrio, g__Gardnerella, g__Shinella, g__Methanosphaera, g__u(f__Leuconostocaceae), g__Pyramidobacter, g__cc_115, g__u(f__Intrasporangiaceae), g__Serratia, g__Staphylococcus, g__Corynebacterium, g__u(f__Planococcaceae), g__u(f__Micrococcaceae), g__u(o__YS2), g__Thermus, g__Peptostreptococcus, g__u(f__Ectothiorhodospiraceae), g__Butyrivibrio, g__u(f__Enterococcaceae), g__Nesterenkonia, g__u(c__TM7-3), g__u(f__Aerococcaceae), g__Parvimonas, g__u(f__Gemellaceae), g__Eggerthella, g__u(f__RF16), g__u(o__Streptophyta), g__Dehalobacterium, g__Mogibacterium, g__Megasphaera, g__Vagococcus, g__u(f__Oxalobacteraceae), g__u(o__Stramenopiles), g__Pseudoramibacter_Eubacterium, g__u(f__Actinomycetaceae), g__u(f__Lactobacillaceae), g__Epulopiscium, g__Comamonas, g__Pseudobutyrivibrio, g__Marinobacter, g__u(o__Bacteroidales), g__Pseudomonas, g__Shuttleworthia, g__Halomonas, g__Coprobacillus, g__u(f__Pseudomonadaceae), g__Atopobium, g__Rothia, g__Delftia, g__02d06, g__Leuconostoc, g__Fusobacterium, g__rc4-4, g__Acidaminococcus, g__Stenotrophomonas, g__ph2, g__u(f__RFP12), 
g__u(o__SHA-98), g__Megamonas, g__Odoribacter, g__Bradyrhizobium, g__Streptomyces, g__Ralstonia, g__Anaerotruncus, g__u(c__Clostridia), g__Actinomyces, g__u(o__Bacillales), g__Slackia

species

g__Propionibacterium s__acnes, s__u(g__Serratia), s__u(g__Pseudoramibacter_Eubacterium), g__Lactobacillus s__mucosae, s__u(g__Megasphaera), s__u(f__Sphingomonadaceae), s__u(g__ph2), s__u(g__Blastomonas), s__u(c__Clostridia), s__u(g__Actinomyces), s__u(g__Vagococcus), s__u(g__Idiomarina), g__Haemophilus s__parainfluenzae, s__u(g__Coprobacillus), g__Roseburia s__faecis, s__u(g__Halomonas), g__Eggerthella s__lenta, s__u(g__Atopobium), s__u(f__Comamonadaceae), s__u(f__RFP12), s__u(g__Anaerococcus), s__u(f__Enterococcaceae), s__u(g__Peptoniphilus), s__u(f__Veillonellaceae), s__u(g__Lactobacillus), g__Bifidobacterium s__bifidum, g__Stenotrophomonas s__geniculata, g__Lactobacillus s__vaginalis, s__u(f__Pseudomonadaceae), s__u(g__Odoribacter), g__Bifidobacterium s__longum, s__u(f__Micrococcaceae), s__u(g__02d06), s__u(f__Planococcaceae), s__u(g__Christensenella), g__Coprobacillus s__cateniformis, s__u(g__Marinobacter), s__u(f__Beijerinckiaceae), g__Clostridium s__perfringens, s__u(f__Peptococcaceae), g__Desulfovibrio s__D168, g__Halomonas s__nitritophilus, s__u(o__Streptophyta), s__u(g__Butyrivibrio), s__u(g__Nesterenkonia), s__u(g__Shinella), s__u(f__Actinomycetaceae), g__Bulleidia s__p-1630-c5, s__u(g__Acidaminococcus), g__Brevundimonas s__diminuta, s__u(g__Megamonas), g__[Eubacterium] s__dolichum, s__u(g__Granulicatella), s__u(g__WAL_1855D), s__u(g__Delftia), s__u(g__Holdemania), s__u(g__Ochrobactrum), s__u(g__Kaistobacter), s__u(g__Bilophila), s__u(o__SHA-98), g__Ruminococcus s__bromii, g__Rothia s__aeria, g__Pyramidobacter s__piscolens, s__u(o__Stramenopiles), s__u(g__Succinivibrio), s__u(g__Pseudomonas), s__u(f__Peptostreptococcaceae), s__u(g__Sarcina), s__u(g__Dehalobacterium), s__u(o__YS2), g__Lactobacillus s__ruminis, s__u(o__Bacteroidales), s__u(g__Methanosphaera), s__u(g__Erwinia), s__u(f__Aerococcaceae), g__Bacteroides s__eggerthii, s__u(g__Veillonella), g__Veillonella s__dispar, s__u(f__RF16), s__u(g__Mogibacterium), g__Peptostreptococcus s__anaerobius, 
s__u(g__p-75-a5), s__u(g__Pseudobutyrivibrio), s__u(f__Streptococcaceae), s__u(o__RF32), g__Bulleidia s__moorei, s__u(g__Haemophilus), s__u(g__Acinetobacter), s__u(g__Anaerotruncus), g__Rothia s__mucilaginosa, g__Acinetobacter s__lwoffii, s__u(g__Bacillus), g__Veillonella s__parvula, s__u(g__Streptomyces), s__u(f__Intrasporangiaceae), s__u(g__Butyricimonas), s__u(f__Oxalobacteraceae), g__Streptococcus s__anginosus, s__u(g__Collinsella), s__u(g__Faecalibacterium), g__Staphylococcus s__sciuri, s__u(g__rc4-4), s__u(f__Bacillaceae), s__u(o__Bacillales), s__u(f__[Paraprevotellaceae]), g__Staphylococcus s__aureus, s__u(f__Ectothiorhodospiraceae), s__u(c__TM7-3), s__u(f__Streptomycetaceae), s__u(f__Leuconostocaceae), g__Clostridium s__hiranonis, s__u(g__Mitsuokella), g__Parabacteroides s__distasonis, s__u(g__Porphyromonas), g__Acinetobacter s__guillouiae, s__u(g__Methylobacterium), s__u(f__Gemellaceae), s__u(g__Parvimonas), s__u(f__Microbacteriaceae), s__u(g__Chryseobacterium), g__Streptococcus s__minor, s__u(g__Comamonas), s__u(g__Staphylococcus), s__u(g__Sphingomonas), s__u(g__Exiguobacterium), s__u(g__Shuttleworthia), s__u(f__Prevotellaceae), s__u(f__Lactobacillaceae), s__u(g__Epulopiscium), s__u(g__Leuconostoc), s__u(f__Caulobacteraceae), s__u(g__Corynebacterium), g__Lactococcus s__garvieae, s__u(g__Bradyrhizobium), g__Erwinia s__dispersa, s__u(g__Ralstonia), s__u(g__Rheinheimera), g__Pseudomonas s__stutzeri, s__u(g__Bulleidia), g__Ruminococcus s__flavefaciens, s__u(g__Gardnerella), s__u(g__Thermus), s__u(g__cc_115), g__Mitsuokella s__multacida, s__u(g__Mycetocola), g__[Eubacterium] s__cylindroides, s__u(f__Idiomarinaceae), g__Lactobacillus s__reuteri, g__Streptococcus s__sobrinus, s__u(g__Fusobacterium), s__u(g__Slackia)

All results of the test

Model details

trait state
distribution zero-inflated negative binomial
formula feature_abundance ~ case_control
link function log
number of samples 104

Functional composition

Individual pathways and reactions for which relative abundance is significantly different between user and external data are identified.

Wilcoxon test comparison

Method: Wilcoxon rank-sum test. The analysis includes the following steps: filtering of rare taxa (a taxon must be present in at least 10% of the samples at a level of >0.2%), followed by a Wilcoxon rank-sum test applied to each taxon to detect taxa that are differentially abundant between the user and external data. Multiple testing adjustment is performed using the Benjamini–Hochberg procedure. The contribution of each taxon to the inter-group difference is estimated using the LDA method. Outlier samples listed in the taxonomic composition section were excluded from this analysis.
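
The first two steps described above (rarity filter, then a per-feature rank-sum test) can be sketched as follows. This is a minimal illustration only, not the report's actual implementation: the function names are hypothetical, abundances are assumed to be given in percent, and the p-value uses the normal approximation with a tie correction and no continuity correction.

```python
import math

def is_frequent(abundances, min_share=0.10, min_level=0.2):
    """Keep a feature if at least min_share of samples exceed min_level (in %)."""
    present = sum(1 for a in abundances if a > min_level)
    return present >= min_share * len(abundances)

def rank_sum_p_value(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    # Pool both groups, remembering the group of each value (0 = x, 1 = y).
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the block of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0  # all values identical, no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical abundances (in %) for one feature in the two groups.
user = [1.6, 1.9, 1.4, 2.1, 1.7]
external = [1.1, 1.2, 1.0, 1.3, 1.2]
if is_frequent(user + external):
    p = rank_sum_p_value(user, external)
```

For small groups an exact test would be more appropriate than the normal approximation used here.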

Differentially abundant taxa

Tables of differentially abundant taxa overrepresented in the groups

Overrepresented in group: external_data

pathway | metabolic level | user_data median, % | external_data median, % | p-value | adjusted p-value | LDA score
ko03020 : RNA polymerase | KEGG pathways | 2.117 | 2.250 | 0.000 | 0.005 | 2.984
ko00400 : Phenylalanine, tyrosine and tryptophan biosynthesis | KEGG pathways | 2.121 | 2.210 | 0.001 | 0.010 | 2.763
ko00511 : Other glycan degradation | KEGG pathways | 0.220 | 0.265 | 0.000 | 0.003 | 2.710
ko00600 : Sphingolipid metabolism | KEGG pathways | 0.331 | 0.410 | 0.001 | 0.010 | 2.694
ko00330 : Arginine and proline metabolism | KEGG pathways | 2.316 | 2.402 | 0.006 | 0.039 | 2.665
ko00523 : Polyketide sugar unit biosynthesis | KEGG pathways | 0.429 | 0.451 | 0.009 | 0.045 | 2.578
ko00620 : Pyruvate metabolism | KEGG pathways | 0.809 | 0.876 | 0.006 | 0.039 | 2.504
ko00770 : Pantothenate and CoA biosynthesis | KEGG pathways | 1.748 | 1.786 | 0.007 | 0.039 | 2.472
ko00780 : Biotin metabolism | KEGG pathways | 0.403 | 0.435 | 0.007 | 0.039 | 2.425

Overrepresented in group: user_data

pathway | metabolic level | user_data median, % | external_data median, % | p-value | adjusted p-value | LDA score
ko02060 : Phosphotransferase system (PTS) | KEGG pathways | 1.550 | 1.137 | 0.000 | 0.006 | 3.454
ko00010 : | KEGG pathways | 0.484 | 0.439 | 0.000 | 0.003 | 2.910
ko04070 : Phosphatidylinositol signaling system | KEGG pathways | 0.255 | 0.227 | 0.001 | 0.012 | 2.773
ko00970 : Aminoacyl-tRNA biosynthesis | KEGG pathways | 3.641 | 3.542 | 0.001 | 0.012 | 2.761
ko03030 : DNA replication | KEGG pathways | 0.563 | 0.533 | 0.000 | 0.006 | 2.627
ko00260 : Glycine, serine and threonine metabolism | KEGG pathways | 0.244 | 0.216 | 0.009 | 0.045 | 2.498
ko00250 : Alanine, aspartate and glutamate metabolism | KEGG pathways | 1.042 | 1.016 | 0.010 | 0.048 | 2.441

Cladogram of differences

Tree-like summary of the taxa that are differentially abundant between the two groups, constructed using LEfSe.

List of differentially abundant taxa

increased in user_data

denotation feature
a ko00010
b ko00250
c ko00260
l ko00970
m ko02060
o ko03030
p ko04070

increased in external_data

denotation feature
d ko00330
e ko00400
f ko00511
g ko00523
h ko00600
i ko00620
j ko00770
k ko00780
n ko03020

Excluded features

KEGG pathways

ko00020 : Citrate cycle (TCA cycle), ko00053 : Ascorbate and aldarate metabolism, ko00062 : Fatty acid elongation, ko00071 : Fatty acid degradation, ko00100 : Steroid biosynthesis, ko00120 : Primary bile acid biosynthesis, ko00121 : Secondary bile acid biosynthesis, ko00140 : Steroid hormone biosynthesis, ko00196 : Photosynthesis - antenna proteins, ko00232 : Caffeine metabolism, ko00253 : Tetracycline biosynthesis, ko00280 : Valine, leucine and isoleucine degradation, ko00281 : Geraniol degradation, ko00310 : Lysine degradation, ko00311 : Penicillin and cephalosporin biosynthesis, ko00312 : , ko00331 : Clavulanic acid biosynthesis, ko00350 : Tyrosine metabolism, ko00360 : Phenylalanine metabolism, ko00361 : Chlorocyclohexane and chlorobenzene degradation, ko00362 : Benzoate degradation, ko00363 : Bisphenol degradation, ko00364 : Fluorobenzoate degradation, ko00380 : Tryptophan metabolism, ko00401 : Novobiocin biosynthesis, ko00410 : beta-Alanine metabolism, ko00430 : Taurine and hypotaurine metabolism, ko00440 : Phosphonate and phosphinate metabolism, ko00460 : Cyanoamino acid metabolism, ko00471 : D-Glutamine and D-glutamate metabolism, ko00472 : D-Arginine and D-ornithine metabolism, ko00473 : D-Alanine metabolism, ko00510 : N-Glycan biosynthesis, ko00513 : Various types of N-glycan biosynthesis, ko00514 : Other types of O-glycan biosynthesis, ko00521 : Streptomycin biosynthesis, ko00522 : Biosynthesis of 12-, 14- and 16-membered macrolides, ko00531 : Glycosaminoglycan degradation, ko00532 : Glycosaminoglycan biosynthesis - chondroitin sulfate / dermatan sulfate, ko00534 : Glycosaminoglycan biosynthesis - heparan sulfate / heparin, ko00562 : Inositol phosphate metabolism, ko00563 : Glycosylphosphatidylinositol (GPI)-anchor biosynthesis, ko00565 : Ether lipid metabolism, ko00590 : Arachidonic acid metabolism, ko00591 : Linoleic acid metabolism, ko00592 : alpha-Linolenic acid metabolism, ko00601 : Glycosphingolipid biosynthesis - lacto and neolacto series, ko00604 
: Glycosphingolipid biosynthesis - ganglio series, ko00621 : Dioxin degradation, ko00622 : Xylene degradation, ko00623 : Toluene degradation, ko00625 : Chloroalkane and chloroalkene degradation, ko00626 : Naphthalene degradation, ko00627 : Aminobenzoate degradation, ko00633 : Nitrotoluene degradation, ko00642 : Ethylbenzene degradation, ko00643 : Styrene degradation, ko00785 : Lipoic acid metabolism, ko00791 : Atrazine degradation, ko00830 : Retinol metabolism, ko00901 : Indole alkaloid biosynthesis, ko00905 : Brassinosteroid biosynthesis, ko00906 : Carotenoid biosynthesis, ko00908 : Zeatin biosynthesis, ko00909 : Sesquiterpenoid and triterpenoid biosynthesis, ko00930 : Caprolactam degradation, ko00941 : Flavonoid biosynthesis, ko00943 : Isoflavonoid biosynthesis, ko00945 : Stilbenoid, diarylheptanoid and gingerol biosynthesis, ko00950 : Isoquinoline alkaloid biosynthesis, ko00965 : Betalain biosynthesis, ko00980 : Metabolism of xenobiotics by cytochrome P450, ko00982 : Drug metabolism - cytochrome P450, ko01053 : Biosynthesis of siderophore group nonribosomal peptides, ko01055 : Biosynthesis of vancomycin group antibiotics, ko01056 : Biosynthesis of type II polyketide backbone, ko01057 : Biosynthesis of type II polyketide products, ko03015 : mRNA surveillance pathway, ko03050 : Proteasome, ko03450 : Non-homologous end-joining

Generalized linear mixed effect model

A generalized linear model is fitted for each taxon to identify whether it is differentially abundant between the user and external (context) data. The probability distribution is selected heuristically depending on the number of samples: for >100 samples, a zero-inflated negative binomial regression is fitted; otherwise, a negative binomial model is fitted. Rare taxa are excluded from the analysis (a taxon must be present in at least 10% of the samples at a level of >0.2%). Multiple testing adjustment is performed using the Benjamini–Hochberg procedure. The contribution of each taxon to the inter-group difference is estimated using the LDA method. The distribution family, the terms of the model and the sample size are shown in the "Model details" section.
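
The selection rule stated above can be written as a small function. This is a sketch of the stated rule only: the function name and the `threshold` parameter are hypothetical. Note that the "Model details" of this section list a Gaussian distribution with an arcsin(sqrt) transform, so the pipeline may apply further rules that are not described in the text.

```python
def choose_distribution(n_samples, threshold=100):
    """Return the model family named in the report text for a given sample count."""
    # More than `threshold` samples: zero-inflated negative binomial regression.
    if n_samples > threshold:
        return "zero-inflated negative binomial"
    # Otherwise: plain negative binomial model.
    return "negative binomial"

# The report lists 104 samples in "Model details".
family = choose_distribution(104)
```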

Differentially abundant taxa

Tables of differentially abundant taxa overrepresented in the groups

Overrepresented in group: external_data

pathway | metabolic level | user_data mean, % | user_data sd, % | external_data mean, % | external_data sd, % | p-value | adjusted p-value | LDA score | sample size
ko00600 : Sphingolipid metabolism | KEGG pathways | 0.339 | 0.060 | 0.416 | 0.063 | 0.000 | 0.003 | 3.521 | 104
ko03020 : RNA polymerase | KEGG pathways | 2.082 | 0.128 | 2.256 | 0.124 | 0.000 | 0.001 | 3.488 | 104
ko00511 : Other glycan degradation | KEGG pathways | 0.215 | 0.030 | 0.269 | 0.039 | 0.000 | 0.001 | 3.449 | 104
ko00400 : Phenylalanine, tyrosine and tryptophan biosynthesis | KEGG pathways | 2.114 | 0.082 | 2.215 | 0.087 | 0.000 | 0.005 | 3.251 | 104
ko00523 : Polyketide sugar unit biosynthesis | KEGG pathways | 0.417 | 0.038 | 0.449 | 0.030 | 0.002 | 0.013 | 3.200 | 104
ko00770 : Pantothenate and CoA biosynthesis | KEGG pathways | 1.752 | 0.037 | 1.788 | 0.035 | 0.002 | 0.013 | 3.061 | 104

Overrepresented in group: user_data

pathway | metabolic level | user_data mean, % | user_data sd, % | external_data mean, % | external_data sd, % | p-value | adjusted p-value | LDA score | sample size
ko02060 : Phosphotransferase system (PTS) | KEGG pathways | 1.725 | 0.490 | 1.189 | 0.232 | 0.000 | 0.000 | 4.013 | 104
ko00010 : | KEGG pathways | 0.515 | 0.075 | 0.441 | 0.023 | 0.000 | 0.000 | 3.488 | 104
ko00970 : Aminoacyl-tRNA biosynthesis | KEGG pathways | 3.641 | 0.080 | 3.543 | 0.098 | 0.002 | 0.013 | 3.154 | 104
ko00903 : Limonene and pinene degradation | KEGG pathways | 0.194 | 0.029 | 0.173 | 0.019 | 0.002 | 0.013 | 3.131 | 104
ko03030 : DNA replication | KEGG pathways | 0.559 | 0.017 | 0.533 | 0.025 | 0.001 | 0.013 | 3.120 | 104
ko04070 : Phosphatidylinositol signaling system | KEGG pathways | 0.248 | 0.017 | 0.229 | 0.017 | 0.001 | 0.009 | 3.116 | 104

Cladogram of differences

Tree-like summary of the taxa that are differentially abundant between the two groups, constructed using LEfSe.

List of differentially abundant taxa

increased in user_data

denotation feature
a ko00010
g ko00903
h ko00970
i ko02060
k ko03030
l ko04070

increased in external_data

denotation feature
b ko00400
c ko00511
d ko00523
e ko00600
f ko00770
j ko03020

Excluded features

KEGG pathways

ko00941 : Flavonoid biosynthesis, ko00020 : Citrate cycle (TCA cycle), ko00253 : Tetracycline biosynthesis, ko00901 : Indole alkaloid biosynthesis, ko00350 : Tyrosine metabolism, ko00312 : , ko00361 : Chlorocyclohexane and chlorobenzene degradation, ko00522 : Biosynthesis of 12-, 14- and 16-membered macrolides, ko00514 : Other types of O-glycan biosynthesis, ko01053 : Biosynthesis of siderophore group nonribosomal peptides, ko00430 : Taurine and hypotaurine metabolism, ko00950 : Isoquinoline alkaloid biosynthesis, ko00563 : Glycosylphosphatidylinositol (GPI)-anchor biosynthesis, ko00592 : alpha-Linolenic acid metabolism, ko00071 : Fatty acid degradation, ko00196 : Photosynthesis - antenna proteins, ko00562 : Inositol phosphate metabolism, ko00909 : Sesquiterpenoid and triterpenoid biosynthesis, ko00633 : Nitrotoluene degradation, ko00532 : Glycosaminoglycan biosynthesis - chondroitin sulfate / dermatan sulfate, ko00621 : Dioxin degradation, ko00601 : Glycosphingolipid biosynthesis - lacto and neolacto series, ko00472 : D-Arginine and D-ornithine metabolism, ko00625 : Chloroalkane and chloroalkene degradation, ko00360 : Phenylalanine metabolism, ko00280 : Valine, leucine and isoleucine degradation, ko00982 : Drug metabolism - cytochrome P450, ko00643 : Styrene degradation, ko00460 : Cyanoamino acid metabolism, ko03050 : Proteasome, ko00401 : Novobiocin biosynthesis, ko00906 : Carotenoid biosynthesis, ko00642 : Ethylbenzene degradation, ko00534 : Glycosaminoglycan biosynthesis - heparan sulfate / heparin, ko00310 : Lysine degradation, ko00590 : Arachidonic acid metabolism, ko00471 : D-Glutamine and D-glutamate metabolism, ko00363 : Bisphenol degradation, ko00905 : Brassinosteroid biosynthesis, ko00531 : Glycosaminoglycan degradation, ko03015 : mRNA surveillance pathway, ko00053 : Ascorbate and aldarate metabolism, ko00364 : Fluorobenzoate degradation, ko01056 : Biosynthesis of type II polyketide backbone, ko00410 : beta-Alanine metabolism, ko00591 : Linoleic acid 
metabolism, ko01055 : Biosynthesis of vancomycin group antibiotics, ko00140 : Steroid hormone biosynthesis, ko00626 : Naphthalene degradation, ko00930 : Caprolactam degradation, ko00510 : N-Glycan biosynthesis, ko00565 : Ether lipid metabolism, ko00362 : Benzoate degradation, ko00908 : Zeatin biosynthesis, ko00830 : Retinol metabolism, ko00627 : Aminobenzoate degradation, ko00473 : D-Alanine metabolism, ko00281 : Geraniol degradation, ko00943 : Isoflavonoid biosynthesis, ko00791 : Atrazine degradation, ko00232 : Caffeine metabolism, ko00980 : Metabolism of xenobiotics by cytochrome P450, ko00965 : Betalain biosynthesis, ko00521 : Streptomycin biosynthesis, ko00513 : Various types of N-glycan biosynthesis, ko00785 : Lipoic acid metabolism, ko00120 : Primary bile acid biosynthesis, ko00945 : Stilbenoid, diarylheptanoid and gingerol biosynthesis, ko00380 : Tryptophan metabolism, ko00100 : Steroid biosynthesis, ko01057 : Biosynthesis of type II polyketide products, ko00331 : Clavulanic acid biosynthesis, ko00062 : Fatty acid elongation, ko00311 : Penicillin and cephalosporin biosynthesis, ko03450 : Non-homologous end-joining, ko00440 : Phosphonate and phosphinate metabolism, ko00604 : Glycosphingolipid biosynthesis - ganglio series, ko00623 : Toluene degradation, ko00622 : Xylene degradation, ko00121 : Secondary bile acid biosynthesis

Model details

trait state
distribution gaussian
formula feature_abundance ~ case_control
number of samples 104
transform arcsin(sqrt)
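
The arcsin(sqrt) transform named in the model details can be sketched as below. This is an illustration under assumptions: the function name is hypothetical, and the sketch assumes that abundances reported in percent are first converted to proportions in [0, 1]; the report does not state how the pipeline scales its input.

```python
import math

def arcsin_sqrt(percent):
    """Arcsine square-root transform of a relative abundance given in percent."""
    proportion = percent / 100.0  # assumed conversion from % to a proportion
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("relative abundance must lie between 0 and 100 %")
    return math.asin(math.sqrt(proportion))

# Hypothetical per-sample abundances (in %) of one pathway.
abundances = [1.55, 1.14, 2.02, 0.97]
transformed = [arcsin_sqrt(a) for a in abundances]
```

The transform stretches values near 0 and 1, which is why it is commonly applied to proportions before fitting a Gaussian model.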

Specific pathways

Individual pathways and reactions for which relative abundance is significantly different between user and external data are identified.

Wilcoxon test comparison

Method: Wilcoxon rank-sum test. The analysis includes the following steps: filtering of rare taxa (a taxon must be present in at least 10% of the samples at a level of >0.2%), followed by a Wilcoxon rank-sum test applied to each taxon to detect taxa that are differentially abundant between the user and external data. Multiple testing adjustment is performed using the Benjamini–Hochberg procedure. The contribution of each taxon to the inter-group difference is estimated using the LDA method. Outlier samples listed in the taxonomic composition section were excluded from this analysis.
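
The Benjamini–Hochberg adjustment mentioned above can be sketched in a few lines. This is a generic illustration of the procedure, not the report's own code; the function name and the input values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for four features.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
```

Each adjusted value is the raw p-value scaled by the number of tests divided by its rank, capped at 1 and at the adjusted value of the next larger p-value.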

Differentially abundant taxa

Tables of differentially abundant taxa overrepresented in the groups

Overrepresented in group: external_data

pathway | metabolic level | user_data median, % | external_data median, % | p-value | adjusted p-value | LDA score
B5_a | vitamin | 0.87 | 0.937 | 0.00 | 0.000 | 2.706
B1_a | vitamin | 0.57 | 0.597 | 0.01 | 0.043 | 2.335

Cladogram of differences

Tree-like summary of the taxa that are differentially abundant between the two groups, constructed using LEfSe.

List of differentially abundant taxa

increased in external_data

denotation feature
a B1_a
b B5_a

Excluded features

propionate

Succinate_a, Succinate_b, acrylate_a, propanediol_a

vitamin

B6_a, B6_b

Generalized linear mixed effect model

A generalized linear model is fitted for each taxon to identify whether it is differentially abundant between the user and external (context) data. The probability distribution is selected heuristically depending on the number of samples: for >100 samples, a zero-inflated negative binomial regression is fitted; otherwise, a negative binomial model is fitted. Rare taxa are excluded from the analysis (a taxon must be present in at least 10% of the samples at a level of >0.2%). Multiple testing adjustment is performed using the Benjamini–Hochberg procedure. The contribution of each taxon to the inter-group difference is estimated using the LDA method. The distribution family, the terms of the model and the sample size are shown in the "Model details" section.

Differentially abundant taxa

Tables of differentially abundant taxa overrepresented in the groups

Overrepresented in group: external_data

pathway | metabolic level | user_data mean, % | user_data sd, % | external_data mean, % | external_data sd, % | p-value | adjusted p-value | LDA score | sample size
B5_a | vitamin | 0.867 | 0.043 | 0.932 | 0.040 | 0.00 | 0.000 | 3.381 | 104
B1_a | vitamin | 0.570 | 0.034 | 0.602 | 0.037 | 0.01 | 0.044 | 3.111 | 104

Cladogram of differences

Tree-like summary of the taxa that are differentially abundant between the two groups, constructed using LEfSe.

List of differentially abundant taxa

increased in external_data

denotation feature
a B1_a
b B5_a

Excluded features

vitamin

B6_b, B6_a

Model details

trait state
distribution gaussian
formula feature_abundance ~ case_control
number of samples 104
transform arcsin(sqrt)

Reconstruction of metabolic potential

Predicted functional composition of microbiota.

Vitamins synthesis

Gut microbes are known to produce a number of vitamins. The boxplots show the median, standard deviation and quartiles of the vitamin biosynthesis pathways in the samples.
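
The summary statistics named above (median, quartiles and standard deviation) can be computed for one pathway as in the sketch below. The function name and the input values are hypothetical, and the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption; the report does not say which convention its plots use.

```python
import statistics

def boxplot_summary(values):
    """Median, quartiles and sample standard deviation for one pathway."""
    # n=4 returns the three cut points that split the data into quarters.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "sd": statistics.stdev(values),  # sample standard deviation (n - 1)
    }

# Hypothetical relative abundances (in %) of one pathway across samples.
summary = boxplot_summary([0.87, 0.93, 0.90, 0.85, 0.95])
```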

Plots

Total relative abundance of the genes involved in vitamin biosynthesis, summed across the respective pathways.

Nothing to show

Description of pathways

Nothing to show

Synthesis of short-chain fatty acids (SCFAs)

Gut microbes are known to produce SCFAs. The boxplots show the median, standard deviation and quartiles of the SCFA biosynthesis pathways in the samples.

Synthesis of butyrate

Plots

Total relative abundance of the genes involved in butyrate synthesis summed across the respective pathways.

Nothing to show

Description of pathways

Nothing to show

Synthesis of propionate

Plots

Total relative abundance of the genes involved in propionate synthesis summed across the respective pathways.

Nothing to show

Description of pathways

Nothing to show

datalab: 3.10.0
knb_lib: 4.8.40
knb_interactive: 2.0.2