Factor analysis report for Microbiome of Moscow subway: pilot project

Report summary

Explore the associations between the microbiota composition of the samples and the external factors provided by the user.

Created: 04/01/2020
Updated: 05/01/2020
Type: Factor analysis report
Project: Microbiome of Moscow subway: pilot project
Uploaded samples: 40

Taxonomic composition

Heatmap of taxonomic composition

The interactive heatmap shows the relative abundance of major microbial taxa (columns) in the samples (rows). Use the “Heatmap settings” drop-down list to the right of the heatmap to select the taxonomic rank of interest. To make close values easier to compare, clicking a cell “freezes” its value on the Legend, together with the abundance of the top 10 taxa of the corresponding sample (click again, or click the cross near the sample name, to “unfreeze”). Use the Top control to switch the display of major taxa between the top features in the selected sample and the top features averaged across all samples. The controls at the top and bottom right change the display of rows and columns.

Analysis of outliers

Automatic filtering of user samples with extreme taxonomic composition, based on a combined analysis of the user samples and all available external datasets. A sample is flagged as an outlier if it falls in the upper 1% tail of the distribution of the median distance between each sample and the closest 50% of its neighbours, with that distribution approximated by a normal distribution. List of outliers:

No outliers detected.
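The outlier rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the report's implementation: the function names are hypothetical, the input is any precomputed square distance matrix, and the sketch uses only the samples it is given, whereas the report also draws on external datasets.

```python
from statistics import NormalDist, mean, median, stdev


def median_distance_to_nearest_half(dist):
    """For each sample, the median distance to the closest 50% of the other samples."""
    scores = []
    for i, row in enumerate(dist):
        others = sorted(d for j, d in enumerate(row) if j != i)
        k = max(1, len(others) // 2)  # size of the closest half of the neighbours
        scores.append(median(others[:k]))
    return scores


def find_outliers(dist, tail=0.01):
    """Indices of samples whose score lies in the upper `tail` of a fitted normal."""
    scores = median_distance_to_nearest_half(dist)
    spread = stdev(scores)
    if spread == 0:
        return []  # all scores identical: nothing can be called extreme
    threshold = NormalDist(mean(scores), spread).inv_cdf(1.0 - tail)
    return [i for i, s in enumerate(scores) if s > threshold]


# Toy data (hypothetical): 20 similar samples on a line plus one distant sample.
positions = [0.1 * i for i in range(20)] + [50.0]
toy_dist = [[abs(a - b) for b in positions] for a in positions]
outliers = find_outliers(toy_dist)  # flags only the distant sample (index 20)
```

With very few samples no point can exceed the 99th percentile of the fitted normal, so a rule of this kind only becomes informative once enough samples are pooled.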

PCoA visualization based on taxonomic composition

Distribution of the samples by their taxonomic composition in reduced dimensionality. The closer the samples (points) on the plot, the more similar their composition. Vectors show the directions in which the levels of the respective major taxa increase. Method of dimension reduction: PCoA (Principal Coordinate Analysis); dissimilarity metric: Bray-Curtis. Clicking on a dot “freezes” the detailed information about the sample on the right of the plot (click again or on the cross near sample name to “unfreeze”). Switch between the display modes with or without outliers and with or without vectors showing major microbial “drivers” using the respective controls.
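As a minimal sketch of the dissimilarity metric named above, the Bray-Curtis dissimilarity between two abundance vectors can be computed as shown here. The function names and the toy abundance vectors are assumptions made for illustration; the ordination step of PCoA (an eigendecomposition of the centred distance matrix) is not shown.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0  # convention chosen here for two empty samples
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def distance_matrix(samples):
    """Square matrix of pairwise Bray-Curtis dissimilarities."""
    return [[bray_curtis(a, b) for b in samples] for a in samples]


# Toy relative abundances (hypothetical) for three samples over three taxa.
toy_samples = [
    [0.5, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0],
]
toy_matrix = distance_matrix(toy_samples)
```

The value ranges from 0 (identical composition) to 1 (no shared taxa), which is why nearby points on the plot correspond to similar samples.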

Alpha-diversity

The measure describes the diversity of taxa within each sample, reflecting both the number of taxa and how evenly they are distributed. Metric: Shannon index. Clicking on a dot “freezes” the displayed value on the Y axis, together with the abundance of the top 10 taxa (click it again, or click the cross near the sample name, to “unfreeze”). In addition, the mean and confidence interval appear when the mouse pointer is over the boxplot. The controls at the top and bottom right change the displayed data.
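For reference, a minimal sketch of the Shannon index is given below. The use of the natural logarithm is an assumption (the report does not state the base), and the function name is hypothetical.

```python
import math


def shannon_index(abundances):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero abundance."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("sample has no counts")
    h = 0.0
    for a in abundances:
        if a > 0:
            p = a / total  # proportion of this taxon in the sample
            h -= p * math.log(p)
    return h


even_sample = shannon_index([25, 25, 25, 25])   # four equally abundant taxa
single_taxon = shannon_index([100, 0, 0, 0])    # one taxon only
```

With natural logarithms, exp(H) gives the effective number of equally abundant taxa, so the even sample above corresponds to four effective taxa and the single-taxon sample to one.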

Reconstruction of metabolic potential

Predicted functional composition of microbiota.

Synthesis of vitamins

Plots of relative abundance by factor

Nothing to show

Description of pathways

Nothing to show

Synthesis of short-chain fatty acids (SCFAs)

Gut microbes are known to produce SCFAs. The boxplots show the median, standard deviation and quartiles of the relative abundance of SCFA biosynthesis pathways in the samples.

Synthesis of butyrate

Plots of relative abundance by factor

Nothing to show

Description of pathways

Nothing to show

Synthesis of propionate

Plots of relative abundance by factor

Nothing to show

Description of pathways

Nothing to show

Statistical analysis

General difference of community structure between two groups

Tests whether there are significant differences in overall community composition between the samples of two groups. Method: permutational multivariate analysis of variance (PERMANOVA); beta-diversity metric: weighted UniFrac. The result includes the total number of samples, the number of PERMANOVA permutations, the p-value for the null hypothesis that there is no difference between the groups, and information on the equality of group dispersions (obtained using the PERMDISP method with the same number of permutations). If the group dispersions are not equal, the results should be interpreted with caution. Outlier samples listed in the taxonomic composition section are excluded from this analysis.

The sample size is too small (< 10) to evaluate the test.
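To make the PERMANOVA procedure more concrete, here is a minimal sketch of the pseudo-F statistic and its permutation p-value. It is an illustration under stated assumptions rather than the method used by the report: the function names are hypothetical, the input is any precomputed distance matrix (weighted UniFrac itself needs a phylogenetic tree and is not computed here), and the PERMDISP dispersion check is not shown.

```python
import random
from collections import defaultdict


def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F computed from a square distance matrix and group labels."""
    n = len(labels)
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)
    a = len(groups)
    # Total sum of squares: squared distances over all pairs, divided by N.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares: the same quantity computed inside each group.
    ss_within = 0.0
    for members in groups.values():
        pairs = sum(dist[i][j] ** 2
                    for k, i in enumerate(members) for j in members[k + 1:])
        ss_within += pairs / len(members)
    ss_among = ss_total - ss_within
    # Note: raises ZeroDivisionError if all within-group distances are zero.
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, n_perm=999, seed=0):
    """Observed pseudo-F and permutation p-value (labels shuffled n_perm times)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): two well-separated groups of five samples on a line.
positions = [0.0, 0.1, 0.2, 0.3, 0.4, 5.0, 5.1, 5.2, 5.3, 5.4]
toy_dist = [[abs(p - q) for q in positions] for p in positions]
toy_labels = ["A"] * 5 + ["B"] * 5
f_stat, p_value = permanova(toy_dist, toy_labels)
```

Adding 1 to both the hit count and the number of permutations keeps the p-value above zero, so the smallest reportable value depends on the number of permutations.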

General difference of metabolic potential structure between two groups

Tests whether there are significant differences in overall metabolic structure between the samples of two groups. Method: permutational multivariate analysis of variance (PERMANOVA); beta-diversity metric: Bray-Curtis distance. The result includes the total number of samples, the number of PERMANOVA permutations, the p-value for the null hypothesis that there is no difference between the groups, and information on the equality of group dispersions (obtained using the PERMDISP method with the same number of permutations). If the group dispersions are not equal, the results should be interpreted with caution. Outlier samples listed in the taxonomic composition section are excluded from this analysis.

The sample size is too small (< 10) to evaluate the test.

Taxonomic composition

Individual microbial taxa for which relative abundance is significantly different between two groups are identified.

Generalized linear mixed effect model

A generalized linear mixed-effects model is fitted for each taxon to identify associations with each metadata factor. If there are, on average, more than 50 samples per fixed-effect coefficient, a zero-inflated negative binomial distribution family is used; otherwise, a negative binomial family is used. Rare taxa are excluded from the analysis (a taxon must be present at a relative abundance of >0.2% in at least 10% of the samples). Multiple testing adjustment is performed using the Benjamini–Hochberg procedure. The distribution family, model terms and sample size are shown in the "Model details" section.
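The filtering, family-selection and adjustment rules described above can be sketched as follows. The function names, default arguments and toy values are assumptions made for illustration; fitting the mixed-effects model itself is not shown.

```python
def is_frequent(rel_abundances, min_abundance=0.002, min_prevalence=0.10):
    """True if the taxon exceeds 0.2% abundance in at least 10% of the samples."""
    if not rel_abundances:
        return False
    present = sum(1 for a in rel_abundances if a > min_abundance)
    return present >= min_prevalence * len(rel_abundances)


def choose_family(n_samples, n_fixed_coefficients):
    """Distribution family chosen by the average number of samples per coefficient."""
    if n_samples / n_fixed_coefficients > 50:
        return "zero-inflated negative binomial"
    return "negative binomial"


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy values (hypothetical).
kept = is_frequent([0.01] + [0.0] * 9)        # present in 1 of 10 samples
family = choose_family(n_samples=40, n_fixed_coefficients=2)
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

The running minimum in the adjustment keeps the adjusted values monotone, so a taxon with a smaller raw p-value never receives a larger adjusted one.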

Significant results

The 'coefficient' column contains the value of the linear model coefficient. Its sign shows the direction of association between a microbial taxon and a factor: positive or negative. If a factor is categorical (for example, group), it is first decomposed into several factors, one per value/group. Each of these is treated as a separate factor relative to the first group (in alphabetical order).

Nothing to show
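The decomposition of a categorical factor described above corresponds to ordinary dummy (treatment) coding. The following is a minimal sketch, assuming hypothetical group names and a hypothetical function name; it illustrates the idea only and does not reproduce the report's code.

```python
def dummy_code(values):
    """Split a categorical factor into 0/1 indicator columns.

    The alphabetically first level is the reference group and gets no column,
    so each remaining column is interpreted relative to that reference.
    """
    levels = sorted(set(values))
    reference, others = levels[0], levels[1:]
    columns = {level: [1 if v == level else 0 for v in values] for level in others}
    return reference, columns


# Hypothetical group labels for four samples.
reference, columns = dummy_code(["station", "control", "train", "control"])
```

Here "control" becomes the reference, so a positive coefficient for "station" would mean higher abundance in "station" samples than in "control" samples. Note that Python sorts strings by code point, so uppercase labels sort before lowercase ones.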

All results of the test

All features were discarded during the filtering.

Data filtration summary

Information about filtration of factors and features during the analysis

Metadata after removal of NAs

Metadata after removal of NAs and of factors with a single unique value or all distinct values

Metadata is empty after removing NAs and factors with a single unique value or all distinct values

Excluded features

No features were excluded from analysis

Model details

Nothing to show

Functional composition

Individual microbial taxa for which relative abundance is significantly different between two groups are identified.

Generalized linear mixed effect model

A generalized linear mixed-effects model is fitted for each taxon to identify associations with each metadata factor. If there are, on average, more than 50 samples per fixed-effect coefficient, a zero-inflated negative binomial distribution family is used; otherwise, a negative binomial family is used. Rare taxa are excluded from the analysis (a taxon must be present at a relative abundance of >0.2% in at least 10% of the samples). Multiple testing adjustment is performed using the Benjamini–Hochberg procedure. The distribution family, model terms and sample size are shown in the "Model details" section.

Significant results

The 'coefficient' column contains the value of the linear model coefficient. Its sign shows the direction of association between a microbial taxon and a factor: positive or negative. If a factor is categorical (for example, group), it is first decomposed into several factors, one per value/group. Each of these is treated as a separate factor relative to the first group (in alphabetical order).

Nothing to show

All results of the test

All features were discarded during the filtering.

Data filtration summary

Information about filtration of factors and features during the analysis

Metadata after removal of NAs

Metadata after removal of NAs and of factors with a single unique value or all distinct values

Metadata is empty after removing NAs and factors with a single unique value or all distinct values

Excluded features

No features were excluded from analysis

Model details

Nothing to show

Specific pathways

Individual microbial taxa for which relative abundance is significantly different between two groups are identified.

Generalized linear mixed effect model

A generalized linear mixed-effects model is fitted for each taxon to identify associations with each metadata factor. If there are, on average, more than 50 samples per fixed-effect coefficient, a zero-inflated negative binomial distribution family is used; otherwise, a negative binomial family is used. Rare taxa are excluded from the analysis (a taxon must be present at a relative abundance of >0.2% in at least 10% of the samples). Multiple testing adjustment is performed using the Benjamini–Hochberg procedure. The distribution family, model terms and sample size are shown in the "Model details" section.

Significant results

The 'coefficient' column contains the value of the linear model coefficient. Its sign shows the direction of association between a microbial taxon and a factor: positive or negative. If a factor is categorical (for example, group), it is first decomposed into several factors, one per value/group. Each of these is treated as a separate factor relative to the first group (in alphabetical order).

Nothing to show

All results of the test

All features were discarded during the filtering.

Data filtration summary

Information about filtration of factors and features during the analysis

Metadata after removal of NAs

Metadata after removal of NAs and of factors with a single unique value or all distinct values

Metadata is empty after removing NAs and factors with a single unique value or all distinct values

Excluded features

No features were excluded from analysis

Model details

Nothing to show

Taxa co-occurrence analysis

Individual microbial taxa for which relative abundance is significantly different between two groups are identified.

Generalized linear mixed effect model

A generalized linear mixed-effects model is fitted for each taxon to identify associations with each metadata factor. If there are, on average, more than 50 samples per fixed-effect coefficient, a zero-inflated negative binomial distribution family is used; otherwise, a negative binomial family is used. Rare taxa are excluded from the analysis (a taxon must be present at a relative abundance of >0.2% in at least 10% of the samples). Multiple testing adjustment is performed using the Benjamini–Hochberg procedure. The distribution family, model terms and sample size are shown in the "Model details" section.

Significant results

The 'coefficient' column contains the value of the linear model coefficient. Its sign shows the direction of association between a microbial taxon and a factor: positive or negative. If a factor is categorical (for example, group), it is first decomposed into several factors, one per value/group. Each of these is treated as a separate factor relative to the first group (in alphabetical order).

Nothing to show

All results of the test

All features were discarded during the filtering.

Data filtration summary

Information about filtration of factors and features during the analysis

Metadata after removal of NAs

Metadata after removal of NAs and of factors with a single unique value or all distinct values

Metadata is empty after removing NAs and factors with a single unique value or all distinct values

Excluded features

No features were excluded from analysis

Model details

Nothing to show

Alpha-diversity

A linear mixed-effects model is applied to find associations of alpha-diversity with each metadata factor. Normality of the residuals is tested using the Shapiro-Wilk test; if p < 0.05, the results of the linear mixed-effects model may be unreliable.

Summary

Nothing to show

Model details

Nothing to show

knb_interactive: 2.0.2
knb_lib: 4.8.45
datalab: 3.10.0