Report summary
Preprocessing of user data and analysis of the essential taxonomic and functional composition (without analysis of factors, i.e. metadata about the samples).
Created | 23/03/2020 |
---|---|
Updated | 30/04/2020 |
Type | Basic report |
Project | blinded Beer total, 16S v.5 |
Uploaded samples | 20 |
Data quality
Assessment of raw data quality.
Number of reads
Read quantity distribution
Number of reads per sample before and after quality filtering. Quality filtering (using the split_libraries_fastq.py QIIME script) included trimming of low-quality read ends (quality threshold = 20) and discarding of trimmed reads shorter than 75% of the initial length. The vertical line denotes the minimum number of reads (3000 reads).
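The filtering rules described above can be illustrated with a minimal sketch. It is a simplified stand-in, not the internal logic of split_libraries_fastq.py: the function names are hypothetical, reads are assumed to be (sequence, Phred scores) pairs, and trimming is assumed to remove bases from both ends while their score is below the threshold.

```python
def trim_low_quality_ends(seq, quals, threshold=20):
    """Strip bases from both ends while their Phred score is below threshold."""
    start, end = 0, len(seq)
    while start < end and quals[start] < threshold:
        start += 1
    while end > start and quals[end - 1] < threshold:
        end -= 1
    return seq[start:end], quals[start:end]


def filter_reads(reads, threshold=20, min_fraction=0.75):
    """Trim each (sequence, qualities) pair and keep the reads that retain
    at least min_fraction of their initial length."""
    kept = []
    for seq, quals in reads:
        trimmed_seq, trimmed_quals = trim_low_quality_ends(seq, quals, threshold)
        if seq and len(trimmed_seq) >= min_fraction * len(seq):
            kept.append((trimmed_seq, trimmed_quals))
    return kept


def low_coverage_samples(read_counts, min_reads=3000):
    """Names of samples with fewer than min_reads reads left after filtering."""
    return sorted(name for name, n in read_counts.items() if n < min_reads)
```

A read of 8 bases that loses one base at each end keeps exactly 75% of its length and is retained; a sample with 2999 filtered reads would fall below the 3000-read line.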
Samples with low coverage
List of samples that had an insufficient number of high-quality reads after quality filtering (< 3000 reads) and were excluded from further analysis.
All samples passed the filter.
Read classification statistics
The reads were denoised using Deblur (the target read length parameter was set to the most frequent read length across all samples). Taxonomy was assigned using the scikit-learn naive Bayes machine-learning classifier from QIIME2. The classifier was trained on the Greengenes database v. 13.5 at 97% OTU similarity. No subsequent filtering by minimum fraction of mapped reads was performed.
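As an illustration of how the target read length could be chosen, the sketch below picks the most frequent read length across all samples. The function name and the input layout (a mapping from sample name to a list of read sequences) are assumptions made for this example; the tie-breaking rule is arbitrary.

```python
from collections import Counter


def most_frequent_read_length(reads_by_sample):
    """Return the read length that occurs most often across all samples."""
    lengths = Counter()
    for reads in reads_by_sample.values():
        lengths.update(len(read) for read in reads)
    if not lengths:
        raise ValueError("no reads supplied")
    # On ties, prefer the shorter length (an arbitrary choice for this sketch).
    length, _ = min(lengths.items(), key=lambda item: (-item[1], item[0]))
    return length
```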
Proportion of classified reads
Taxonomic composition
Heatmap of taxonomic composition
The interactive heatmap represents the relative abundance of major microbial taxa (columns) in the samples (rows). Use the “Heatmap settings” drop-down list to the right of the heatmap to select the taxonomic rank of interest. To make it easier to compare close values, clicking on a cell “freezes” the value displayed for that cell in the Legend, as well as the displayed abundance of the top 10 taxa of the corresponding sample (click again or on the cross near the sample name to “unfreeze”). Use the Top control to switch the display of major composition between the top features in the selected sample and the top features on average across all samples.
Major taxa
The boxplots represent the distribution of relative abundance for the 25 most abundant taxa across all samples (for each taxonomic rank). For proper display on a log scale, zero values were replaced with a pseudocount no higher than the minimum relative abundance of the major taxa.
phylum
class
order
family
genus
species
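The zero replacement used for the log-scale boxplots can be sketched as follows. The report only states that the pseudocount is no higher than the minimum relative abundance of the major taxa; using exactly the smallest non-zero value is an assumption of this example, as are the function names and the input layout (taxon name mapped to per-sample relative abundances).

```python
import math


def replace_zeros_for_log_scale(abundances):
    """Replace zeros with the smallest non-zero value found in the table."""
    positive = [v for row in abundances.values() for v in row if v > 0]
    if not positive:
        raise ValueError("table contains no non-zero abundances")
    pseudocount = min(positive)
    filled = {
        taxon: [v if v > 0 else pseudocount for v in row]
        for taxon, row in abundances.items()
    }
    return filled, pseudocount


def log10_abundances(filled):
    """Log-transform a zero-free table for plotting."""
    return {taxon: [math.log10(v) for v in row] for taxon, row in filled.items()}
```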
Complete taxonomic composition
The table contains relative abundance of all microbial taxa for each taxonomic rank.
Raw counts
Taxonomic core
The plot represents the proportion of OTUs shared across a varying proportion of samples.
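One way to obtain such a curve is sketched below, assuming an OTU is "shared" by a given proportion of samples when it has a non-zero count in at least that proportion. The function name and the input layout (OTU identifier mapped to per-sample counts) are hypothetical.

```python
def core_curve(otu_table, fractions):
    """For each sample fraction, return the share of OTUs present
    (count > 0) in at least that fraction of samples."""
    n_otus = len(otu_table)
    n_samples = len(next(iter(otu_table.values())))
    prevalence = [
        sum(1 for count in counts if count > 0) / n_samples
        for counts in otu_table.values()
    ]
    return [
        (fraction, sum(1 for p in prevalence if p >= fraction) / n_otus)
        for fraction in fractions
    ]
```

With four OTUs present in 100%, 50%, 25% and 75% of four samples, the curve drops from 1.0 at a 25% sample fraction to 0.25 at 100%.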
Analysis of outliers
Automatic filtering of user samples with extreme taxonomic composition (based on the combined analysis of user and external data). For each sample, the median distance to the closest 50% of its neighbours is computed; the distribution of these values is approximated by a normal distribution, and samples in its upper 1% tail are treated as outliers. List of outliers:
No outliers detected.
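The outlier rule can be sketched in a few lines. This is an illustration under assumptions, not the report's implementation: it works on a single square distance matrix (the report combines user and external data), fits the normal distribution by the sample mean and standard deviation, and uses hypothetical function names.

```python
from statistics import NormalDist, mean, median, stdev


def neighbour_scores(dist):
    """Median distance from each sample to the closest half of the others."""
    n = len(dist)
    scores = []
    for i in range(n):
        others = sorted(dist[i][j] for j in range(n) if j != i)
        k = max(1, len(others) // 2)
        scores.append(median(others[:k]))
    return scores


def find_outliers(dist, tail=0.01):
    """Indices of samples whose score lies in the upper tail of a normal
    distribution fitted to all scores (requires non-identical scores)."""
    scores = neighbour_scores(dist)
    cutoff = NormalDist(mean(scores), stdev(scores)).inv_cdf(1 - tail)
    return [i for i, score in enumerate(scores) if score > cutoff]
```

Note that with few samples no score can exceed the cutoff, because the largest possible z-score in a sample of size n is (n - 1) / sqrt(n).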
PCoA visualization based on taxonomic composition
Distribution of the samples by their taxonomic composition in reduced dimensionality. The closer the samples (points) are on the plot, the more similar their composition. Vectors show the directions in which the levels of the respective major taxa increase. Method of dimension reduction: PCoA (Principal Coordinate Analysis); dissimilarity metric: Bray-Curtis. Clicking on a dot “freezes” the detailed information about the sample to the right of the plot (click again or on the cross near the sample name to “unfreeze”). Use the respective controls to switch between the display modes with or without outliers and with or without vectors showing major microbial “drivers”.
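For reference, the Bray-Curtis dissimilarity that PCoA takes as input can be computed as in the sketch below. The function names are illustrative, and the PCoA step itself (an eigendecomposition of the centred distance matrix) is not shown.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance profiles:
    sum(|u_i - v_i|) / sum(u_i + v_i), ranging from 0 to 1."""
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def dissimilarity_matrix(profiles):
    """Square matrix of pairwise Bray-Curtis dissimilarities."""
    return [[bray_curtis(p, q) for q in profiles] for p in profiles]
```

Two samples with no taxa in common have a dissimilarity of 1, and identical samples have a dissimilarity of 0.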
Enterotypes
Enterotyping (cluster analysis of samples by their composition) was performed according to the original protocol (Arumugam et al., 2011). The optimal number of clusters was chosen as the one with the highest Calinski-Harabasz index. Silhouette width is a measure of clustering quality. For each enterotype, there is a list of its drivers: microbial taxa distinguishing the samples belonging to the cluster from the other samples.
Number of enterotypes
4
Calinski-Harabasz index
40.52
Average silhouette width of the clusters
0.646
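To make the two quality measures reported above concrete, here is a minimal sketch of both. It uses the textbook definitions with squared Euclidean distances for the Calinski-Harabasz index and averages silhouette widths over samples; the distance measure and averaging used in the report may differ, and the function names are hypothetical. Both functions assume at least two clusters, and the index also assumes non-zero within-cluster scatter.

```python
def _group(labels):
    """Map each cluster label to the list of its member indices."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    return clusters


def calinski_harabasz(points, labels):
    """Ratio of between-cluster to within-cluster dispersion,
    each divided by its degrees of freedom (k - 1 and n - k)."""
    n, dims = len(points), len(points[0])
    clusters = _group(labels)
    k = len(clusters)
    overall = [sum(p[d] for p in points) / n for d in range(dims)]
    between = within = 0.0
    for members in clusters.values():
        centroid = [sum(points[i][d] for i in members) / len(members) for d in range(dims)]
        between += len(members) * sum((c - o) ** 2 for c, o in zip(centroid, overall))
        within += sum(
            sum((x - c) ** 2 for x, c in zip(points[i], centroid)) for i in members
        )
    return (between / (k - 1)) / (within / (n - k))


def average_silhouette_width(dist, labels):
    """Mean over samples of (b - a) / max(a, b), where a is the mean distance
    to the sample's own cluster and b the mean distance to the nearest other one."""
    clusters = _group(labels)
    widths = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            widths.append(0.0)  # convention for single-member clusters
            continue
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(widths) / len(widths)
```

Values of the silhouette width close to 1 indicate compact, well-separated clusters; values near 0 indicate overlapping clusters.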
Microbial drivers
Enterotype name: Enterotype 1
Table
taxon | score |
---|---|
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Leuconostocaceae;g__ | 2.37 |
k__Bacteria;p__;c__;o__;f__;g__ | 1.30 |
k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Thermoactinomycetaceae;g__ | 1.30 |
k__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__ | 1.30 |
k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pasteurellales;f__Pasteurellaceae;g__ | 1.30 |
k__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Paracoccus | 1.08 |
k__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Acetobacteraceae;g__Acetobacter | 0.86 |
k__Bacteria;p__Proteobacteria;c__Deltaproteobacteria;o__Desulfurellales;f__Desulfurellaceae;g__Desulfurella | 0.38 |
k__Bacteria;p__Bacteroidetes;c__[Saprospirae];o__[Saprospirales];f__Chitinophagaceae;g__Sediminibacterium | 0.31 |
k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas | 0.11 |
Enterotype name: Enterotype 2
Table
taxon | score |
---|---|
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__Lactobacillus | 0.98 |
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Leuconostocaceae;g__Leuconostoc | 0.30 |
k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Burkholderiales;f__Oxalobacteraceae;g__Ralstonia | 0.24 |
k__Bacteria;p__Dictyoglomi;c__Dictyoglomia;o__Dictyoglomales;f__Dictyoglomaceae;g__Dictyoglomus | 0.23 |
k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Bacillaceae;g__ | 0.23 |
k__Bacteria;p__Proteobacteria;c__Deltaproteobacteria;o__Myxococcales;f__0319-6G20;g__ | 0.23 |
k__Bacteria;p__Chloroflexi;c__Anaerolineae;o__envOPS12;f__;g__ | 0.23 |
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__;g__ | 0.23 |
k__Bacteria;p__Proteobacteria;c__Betaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Burkholderia | 0.23 |
k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Paenibacillaceae;g__Paenibacillus | 0.07 |
Enterotype name: Enterotype 3
Table
taxon | score |
---|---|
k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__ | 1.93 |
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__ACK-M1;g__ | 0.92 |
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Micrococcaceae;g__ | 0.92 |
k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Dorea | 0.92 |
k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__[Ruminococcus] | 0.92 |
k__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Acetobacteraceae;g__Acidocella | 0.92 |
k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Moraxellaceae;g__Psychrobacter | 0.92 |
k__Bacteria;p__Actinobacteria;c__Thermoleophilia;o__Solirubrobacterales;f__Solirubrobacteraceae;g__ | 0.92 |
k__Bacteria;p__Cyanobacteria;c__Chloroplast;o__Chlorophyta;f__;g__ | 0.92 |
k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__[Tissierellaceae];g__Peptoniphilus | 0.92 |
Enterotype name: Enterotype 4
Table
taxon | score |
---|---|
k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacteriales;f__Enterobacteriaceae;g__ | 1.88 |
k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Luteibacter | 1.83 |
k__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rickettsiales;f__mitochondria;g__ | 1.80 |
k__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas | 1.39 |
k__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Acetobacteraceae;g__Gluconobacter | 1.33 |
k__Bacteria;p__Armatimonadetes;c__Armatimonadia;o__Armatimonadales;f__Armatimonadaceae;g__ | 1.30 |
k__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__;g__ | 1.30 |
k__Bacteria;p__Bacteroidetes;c__;o__;f__;g__ | 1.30 |
k__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__;g__ | 1.30 |
k__Bacteria;p__OP9;c__OPB46;o__OPB72;f__;g__ | 1.30 |
Hierarchical clustering
The tree shows clustering of the samples by similarity of their taxonomic composition at varying levels of detail. Dissimilarity metric: Bray-Curtis; linkage: Ward’s method.
Alpha-diversity
Interactive plot
The measure describes the conditional number of taxa in each sample. Metric: Shannon index. Clicking on a dot “freezes” the displayed value on the Y axis and the abundance of the top 10 taxa (click on it again or on the cross near the sample name to “unfreeze”). In addition, the mean and confidence interval appear when the mouse is over the boxplot. Controls at the top and bottom right allow changing the displayed data.
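For reference, the Shannon index of a sample is H = -sum(p_i * log(p_i)) over the taxa with non-zero proportion p_i. The sketch below computes it from counts; the logarithm base used in the report is not stated, so the natural logarithm default here is an assumption.

```python
import math


def shannon_index(counts, base=math.e):
    """Shannon diversity of a vector of taxon counts (zeros are ignored)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts:
        if count > 0:
            p = count / total
            entropy -= p * math.log(p, base)
    return entropy
```

A sample with four equally abundant taxa has a Shannon index of 2 when base 2 is used, and a sample dominated by a single taxon has an index of 0.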
Static plots
Chao1 index
Shannon index
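The Chao1 index estimates total richness from the number of observed taxa and the numbers of singletons (F1) and doubletons (F2). The sketch below uses the bias-corrected form S_obs + F1(F1 - 1) / (2(F2 + 1)); whether the report uses this form or the uncorrected one is not stated, so this is an assumption.

```python
def chao1(counts):
    """Bias-corrected Chao1 richness estimate from taxon counts."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))
```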
Taxa co-occurrence analysis
Co-occurrence graph
Co-occurrence of microbial genera was analyzed based on correlation analysis of their relative abundance using the SPIEC-EASI software. In the graph, vertices represent genera; pairs of highly co-occurring genera are connected with blue lines. The graph shows the members of the cooperatives, i.e. groups of highly co-occurring genera corresponding to isolated components (singleton vertices are omitted). Parameters of the SPIEC-EASI algorithm: Meinshausen and Bühlmann neighbourhood selection method (MB), minimum lambda ratio = 0.1, number of lambda iterations = 20, model selection using the StARS algorithm (number of StARS subsamples = 50).
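Given the edges of an inferred co-occurrence graph, the cooperatives described above are its connected components with more than one vertex. The sketch below illustrates only this last step; network inference with SPIEC-EASI is not reproduced, and the function name and placeholder genus names are hypothetical.

```python
def cooperatives(genera, edges):
    """Connected components of an undirected graph, omitting singletons."""
    adjacency = {genus: set() for genus in genera}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    groups = []
    for genus in genera:
        if genus in seen:
            continue
        # Depth-first traversal collecting everything reachable from genus.
        component = set()
        stack = [genus]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        if len(component) > 1:
            groups.append(sorted(component))
    return groups


# Placeholder vertices and edges, not results from this report.
example = cooperatives(
    ["g1", "g2", "g3", "g4", "g5", "g6"],
    [("g1", "g2"), ("g2", "g3"), ("g4", "g5")],
)
```

When the inferred graph has no edges, every vertex is a singleton and the function returns an empty list, which corresponds to the "No cooperatives detected" outcome.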
Members of the cooperatives
Cooperative content.
No cooperatives detected. Possible reasons: too few or too many co-occurring taxa or insufficient number of samples to perform the analysis.
Abundance of the cooperatives
Relative abundance of each cooperative in the samples.
No cooperatives detected. Possible reasons: too few or too many co-occurring taxa or insufficient number of samples to perform the analysis.
Reconstruction of metabolic potential
Predicted functional composition of microbiota.
Heatmap of functional composition
The interactive heatmap represents the relative abundance of major pathways (columns) in the samples (rows). To switch between the KEGG and MetaCyc nomenclatures, use the drop-down list in “Heatmap settings”. To make it easier to compare close values, clicking on a cell “freezes” the displayed value of the cell and the displayed abundance of the top features of the sample (click again or on the cross near the sample name to “unfreeze”). Use the Top control to switch the display of major composition between the top features in the selected sample and the top features on average across all samples.
Vitamin synthesis
Gut microbes are known to produce a number of vitamins. The boxplots represent the median, standard deviation and quartiles of the relative abundance of vitamin biosynthesis pathways in the samples.
Pathways
Relative abundance of pathways involved in vitamin synthesis.
Gene groups
Relative abundance of KEGG Orthology gene groups involved in vitamin synthesis.
Plots
Total relative abundance of the genes involved in vitamin biosynthesis, summed across the respective pathways.
KEGG pathways
Description of pathways
Complete functional composition
The table contains relative abundance of all functional features.
Synthesis of short-chain fatty acids (SCFAs)
Gut microbes are known to produce SCFAs. The boxplots represent the median, standard deviation and quartiles of the relative abundance of SCFA biosynthesis pathways in the samples.
Synthesis of butyrate
Pathways
Relative abundance of pathways involved in butyrate synthesis.
Gene groups
Relative abundance of KEGG Orthology gene groups involved in butyrate synthesis.
Plots
Total relative abundance of the genes involved in butyrate synthesis summed across the respective pathways.
KEGG pathways
Description of pathways
Synthesis of propionate
Pathways
Relative abundance of pathways involved in propionate synthesis.
Gene groups
Relative abundance of KEGG Orthology gene groups involved in propionate synthesis.
Plots
Total relative abundance of the genes involved in propionate synthesis summed across the respective pathways.
KEGG pathways
Description of pathways
All features tables
All calculated features can be downloaded here.
Alpha-diversity data
The table contains alpha-diversity values of all samples.
Complete taxonomic composition
The table contains relative abundance of all microbial taxa for each taxonomic rank.
Raw counts
Complete functional composition
The table contains relative abundance of all functional features.
Beta-diversity data
Table of Bray-Curtis dissimilarities between samples
knb_interactive: 2.0.2
knb_lib: 4.8.50
datalab: 3.10.0