ICD withR treatment

Accompaniement of Anaïs Brosse in her analyses

Authors

Affiliation

Olivier Rué

Migale bioinformatics facility

Christelle Hennequet-Antier

Migale bioinformatics facility

Published

July 5, 2023

Modified

February 19, 2025

Show the code

#library(DT)
library(tidyverse)
#library(kableExtra)
library(phyloseq)
library(phyloseq.extended)
library(data.table)
library(InraeThemes)
library(readr) #read_delim
library(ggplot2)      # graphics
library(microbiome) # for boxplot_alpha function
library(mia) # microbiome analysis with SummarizedExperiment and TreeSummarizedExperiment
library(emmeans) #post hoc tests
library(reactable) #table formatting and styling
library(reactablefmtr) #table formatting and styling
library(openxlsx) #save results to excel sheets
library(metacoder) # differential abundance
library(ggtree) # visualization
library(ggtreeExtra) # visualization

Note

This document is a report of the analyses performed. You will find all the code used to analyze these data. The version of the tools (maybe in code chunks) and their references are indicated, for questions of reproducibility.

Aim of the project

Influence of treatment antibiotics in a murine model of infection with Clostridium difficile (ICD). Injection of C. difficile were performed at J00 and J37.

Différencier les groupes ctrl et traités aux antibiotiques (VAN, MTZ, FDX, R = réinfections multiples), et comparer les cinétiques entre les groupes.
Etudier les cinétiques au sein des traitements (CTRL, VAN, MTZ, FDX, R). Est-ce qu’il y a un retour à J-6 ? Clairance de la bactérie (zéro dans les fécès, attention biofilm et sporulation peuvent encore existés)
Eubiose : comparer J-6, J28, J51.

Partners

Olivier Rué - Migale bioinformatics facility - BioInfomics - INRAE
Christelle Hennequet-Antier - Migale bioinformatics facility - BioInfomics - INRAE
Mahendra Mariadassou - Migale bioinformatics facility - BioInfomics - INRAE
Anaïs Brosse - MICALIS - INRAE

Data management

Important

All data is managed by the migale facility for the duration of the project. Once the project is over, the Migale facility does not keep your data. We will provide you with the raw data and associated metadata that will be deposited on public repositories before the results are used. We can guide you in the submission process. We will then decide which files to keep, knowing that this report will also be provided to you and that the analyses can be replayed if needed.

Analyses

Experiment

Phyloseq object creation

The phyloseq package [1] is a tool to import, store, analyze, and graphically display complex metabarcoding data, especially when there is associated sample data, phylogenetic tree, and/or taxonomic assignment of the OTUs or ASVs. Various customs functions written to enhance the base functions of phyloseq are available in the phyloseq-extended package [2].

Show the code

biomfile <- "../data/Galaxy17-[FROGS_Affiliation_OTU__affiliation_abundance.biom].biom1"
frogs <- import_frogs(biomfile, taxMethod = "blast",
                      treefilename = "../data/Galaxy20-[FROGS_Tree__tree.nwk].nhx")

metadata_allsamples <- read_delim("../data/metadata-allsamples.csv", 
                                  delim = ";", escape_double = FALSE, 
                                  col_types = cols(SAMPLE_ID = col_character(), 
                                                   GROUPE = col_factor(levels = c("CTRL", "VAN", "MTZ", "FDX", "R")), 
                                                   REPLICAT = col_factor(levels = c("1", "2", "3", "4")),
                                                   JOUR = col_factor(levels = c("J-6", "J0", "J2", "J5", "J7", "J14", "J18", "J28", "J31", "J37", "J39", "J51"))),
                                  trim_ws = TRUE) %>% 
  column_to_rownames(var="SAMPLE_ID")
#rename variables into english
metadata_allsamples <- dplyr::rename(metadata_allsamples, DAY = JOUR)
metadata_allsamples <- dplyr::rename(metadata_allsamples, GROUP = GROUPE)
#put new label for the JOUR levels to be ordered
#metadata_allsamples$JOUR <- factor(metadata_allsamples$JOUR, levels = c("J-6", "J0", "J2", "J5", "J7", "J14", "J18", "J28", "J31", "J37", "J39", "J51"), labels = c("J-6", "J00", "J02", "J05", "J07", "J14", "J18", "J28", "J31", "J37", "J39", "J51"))
metadata_allsamples$DAY <- factor(metadata_allsamples$DAY, levels = c("J-6", "J0", "J2", "J5", "J7", "J14", "J18", "J28", "J31", "J37", "J39", "J51"), labels = c("D-6", "D00", "D02", "D05", "D07", "D14", "D18", "D28", "D31", "D37", "D39", "D51"))
  
# EUBIOSE <- J-6
# EUBIOSE <- J28
# EUBIOSE <- J51

# Other maner to include phylogenetic tree
# phy_tree(frogs) <- read_tree("../data/Galaxy20-[FROGS_Tree__tree.nwk].nhx")

# Explore sample infos
sample_data(frogs) <- metadata_allsamples
# add new variables to facilitate visualisation and statistical tests
sample_data(frogs) <- metadata_allsamples %>%
  mutate(Label = paste(GROUP, DAY, SOURIS, sep="_")) %>%
  mutate(GROUPJ = factor(paste(GROUP, DAY, sep="_")))
# change name of samples
sample_names(frogs) <- sample_data(frogs) %>%
  as_tibble() %>%
  dplyr::select(Label) %>% 
  t()

Here, we performed statistical analyses on a subset of samples. The study includes or not R treatment (= réinfections multiples) depending on the value withR.

Show the code

# sample_data(frogs)
# info phyloseq object
# "withR" with R treatment (= réinfections multiples)
# "withoutR" without R treatment (= réinfections multiples)
if (params$my.interest == "withR"){
  frogs <- subset_samples(frogs, DAY %in% c("D-6", "D00", "D02", "D14", "D28", "D37", "D39", "D51") & GROUP %in% c("R", "CTRL"))
}
if (params$my.interest == "withoutR"){
  frogs <- subset_samples(frogs, GROUP != "R")
}
microbiome::summarize_phyloseq(frogs)

[[1]]
[1] "1] Min. number of reads = 32299"

[[2]]
[1] "2] Max. number of reads = 141482"

[[3]]
[1] "3] Total number of reads = 3931320"

[[4]]
[1] "4] Average number of reads = 70202.1428571429"

[[5]]
[1] "5] Median number of reads = 69088"

[[6]]
[1] "7] Sparsity = 0.563265306122449"

[[7]]
[1] "6] Any OTU sum to 1 or less? NO"

[[8]]
[1] "8] Number of singletons = 0"

[[9]]
[1] "9] Percent of OTUs that are singletons \n        (i.e. exactly one read detected across all samples)0"

[[10]]
[1] "10] Number of sample variables are: 7"

[[11]]
[1] "GROUP"    "DAY"      "SOURIS"   "POIDS"    "REPLICAT" "Label"    "GROUPJ"

Show the code

# count number of replicate by 
# - GROUP
# - combination of GROUP and DAY
sample_data(frogs) %>% 
  as_tibble() %>% 
  dplyr::count(GROUP)

# A tibble: 2 × 2
  GROUP     n
  <fct> <int>
1 CTRL     32
2 R        24

Show the code

sample_data(frogs) %>% 
  as_tibble() %>% 
  dplyr::add_count(GROUP, DAY) %>%
  dplyr::select(GROUP, DAY, GROUPJ, n) %>%
  distinct() %>%
  ggplot(., aes(x = GROUPJ, y = n, fill = GROUP)) + 
  geom_bar(stat = "identity") +
  labs(title="Number of replicates", y = "n") +
  theme(axis.title.x = element_blank(),
        axis.text.x = element_text(angle = 90, size = 8, hjust = 1))

Show the code

# frequencies within each DAY for the levels of GROUP
microbiome::plot_frequencies(sample_data(frogs), "DAY", "GROUP")

Rarefaction

Samples from J-6 provided less abundances than the others.

Show the code

phyloseq::plot_bar(frogs, fill="Phylum")

For having the same sampling depth for all samples, we rarefy them before exploring the diversity.

Show the code

frogs_rare <- phyloseq::rarefy_even_depth(frogs, rngseed = 123, replace = TRUE)

All depths are now equal to 32299. The dataset contains 1808744 sequences allocated to 280 taxa. Bacteroidota and Proteobateria are dominants.

Show the code

phyloseq::plot_bar(frogs_rare, x="REPLICAT", fill="Phylum") + facet_grid(GROUP~DAY, scales= "free_x")

Taxonomic composition

The taxonomic composition is studied after rarefaction. Let’s have a look at the Phylum level composition. Fermicutes are also present up to 50% depending on the GROUP and the DAY variables. They disappear at J00, J02 J37, except J37 with R treatment.

After rarefaction
Before rarefaction

Show the code

phyloseq.extended::plot_composition(physeq = frogs_rare, taxaRank1 = "Kingdom", taxaSet1 = "Bacteria", taxaRank2 = "Phylum", numberOfTaxa = 10L) +
  facet_grid(~GROUP, scales = "free_x", space = "free") +
  scale_fill_brewer(palette = "Paired") +
  theme(axis.text.x = element_text(size = 6, hjust = 1)) +
  theme(legend.position="top")

Show the code

phyloseq.extended::plot_composition(physeq = frogs, taxaRank1 = "Kingdom", taxaSet1 = "Bacteria", taxaRank2 = "Phylum", numberOfTaxa = 10L) +
  facet_grid(~GROUP, scales = "free_x", space = "free") +
  scale_fill_brewer(palette = "Paired") +
  theme(axis.text.x = element_text(size = 6, hjust = 1)) +
  theme(legend.position="top")

After rarefaction, let’s have a look at the Phylum and genus levels composition with two types of plot.

We explore composition using all samples.

Show the code

phyloseq.extended::plot_composition(frogs_rare, NULL, NULL, "Phylum") +
  facet_grid(~ GROUP + DAY, scales = "free_x", space = "free_x") + 
  scale_fill_brewer(palette = "Paired") +
  theme(axis.text.x = element_text(size = 6, hjust = 1)) +
  theme(legend.position="top")

Show the code

phyloseq.extended::plot_composition(frogs_rare, NULL, NULL, "Phylum", x = "SOURIS") +
  facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + 
  scale_fill_brewer(palette = "Paired") +
  theme(axis.text.x = element_text(size = 6, hjust = 1)) +
  theme(legend.position="top")

Show the code

phyloseq.extended::plot_composition(frogs_rare, NULL, NULL, "Family", x = "SOURIS") +
  facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + 
  scale_fill_brewer(palette = "Paired") +
  theme(axis.text.x = element_text(size = 6, hjust = 1)) +
  theme(legend.position="top")

Show the code

phyloseq.extended::plot_composition(frogs_rare, NULL, NULL, "Genus", x = "SOURIS") +
  facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + 
  scale_fill_brewer(palette = "Paired") +
  theme(axis.text.x = element_text(size = 6, hjust = 1)) +
  theme(legend.position="top")

Problematic taxa
               taxa  Kingdom       Phylum       Class         Order
Cluster_3 Cluster_3 Bacteria Bacteroidota Bacteroidia Bacteroidales
                  Family   Genus rank
Cluster_3 Muribaculaceae Unknown    1

To explore presence of bacteria at genus levels within phylum:

Show the code

phyloseq.extended::plot_composition(frogs_rare, "Phylum", "Bacteroidota", "Genus", fill = "Genus", x = "SOURIS") +
  facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + 
  scale_fill_brewer(palette = "Paired") +
  theme(axis.text.x = element_text(size = 6, hjust = 1)) +
  theme(legend.position="top")

Problematic taxa
               taxa  Kingdom       Phylum       Class         Order
Cluster_3 Cluster_3 Bacteria Bacteroidota Bacteroidia Bacteroidales
                  Family   Genus rank
Cluster_3 Muribaculaceae Unknown    1

Show the code

phyloseq.extended::plot_composition(frogs_rare, "Phylum", "Proteobacteria", "Genus", fill = "Genus", x = "SOURIS") +
  facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + 
  scale_fill_brewer(palette = "Paired") +
  theme(axis.text.x = element_text(size = 6, hjust = 1)) +
  theme(legend.position="top")

Problematic taxa
                 taxa  Kingdom         Phylum               Class
Cluster_51 Cluster_51 Bacteria Proteobacteria Alphaproteobacteria
                      Order  Family   Genus rank
Cluster_51 Rhodospirillales Unknown Unknown    3

Show the code

phyloseq.extended::plot_composition(frogs_rare, "Phylum", "Firmicutes", "Genus", fill = "Genus", x = "SOURIS") +
  facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + 
  scale_fill_brewer(palette = "Paired") +
  theme(axis.text.x = element_text(size = 6, hjust = 1)) +
  theme(legend.position="top")

Problematic taxa
                 taxa  Kingdom     Phylum      Class                      Order
Cluster_40 Cluster_40 Bacteria Firmicutes Clostridia Clostridia vadinBB60 group
Cluster_69 Cluster_69 Bacteria Firmicutes Clostridia            Oscillospirales
Cluster_29 Cluster_29 Bacteria Firmicutes Clostridia             Lachnospirales
Cluster_85 Cluster_85 Bacteria Firmicutes Clostridia             Lachnospirales
                     Family             Genus rank
Cluster_40          Unknown           Unknown    3
Cluster_69 Oscillospiraceae           Unknown    4
Cluster_29  Lachnospiraceae Multi-affiliation    5
Cluster_85  Lachnospiraceae           Unknown    7

Show the code

phyloseq.extended::plot_composition(frogs_rare, "Phylum", "Actinobacteriota", "Genus", fill = "Genus", x = "SOURIS") +
  facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + 
  scale_fill_brewer(palette = "Paired") +
  theme(axis.text.x = element_text(size = 6, hjust = 1)) +
  theme(legend.position="top")

Show the code

phyloseq.extended::plot_composition(frogs_rare, "Phylum", "Verrucomicrobiota", "Genus", fill = "Genus", x = "SOURIS") +
  facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + 
  scale_fill_brewer(palette = "Paired") +
  theme(axis.text.x = element_text(size = 6, hjust = 1)) +
  theme(legend.position="top")

Important

This comment provides from the complete study with all samples.

We observed that the taxonomic composition is mainly determined by DAY rather than GROUP. Protebacteria are dominant in J00, J02, J18 (GROUP R) and J37. Firmicutes are present at J-6 (a little), J05 (CTRL and VAN), J07 (CTRL, VAN, MTZ), J14, J28, J39, J51. At the genus level, similar taxonomic patterns were found in J00 with J37 and in J-6 with J51.

Alpha-Diversity

Boxplot of alpha-diversity facetting by GROUP.

Show the code

# plot alpha diversity
# measures = c("Observed", "Chao1", "Shannon", "Simpson", "InvSimpson")
p <- phyloseq::plot_richness(physeq = frogs_rare, x = "GROUP", color= "DAY", title = "Alpha diversity graphics", measures = c("Observed", "Shannon"), nrow=5) + 
  geom_jitter() +
  scale_fill_brewer(palette = "Paired") +
  theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) +
  theme(legend.position="top")
p

Show the code

# plot alpha diversity 
p.alphadiversity <- microbiome::boxplot_alpha(frogs_rare, 
                           index = "Observed",
                           x_var = "DAY")
p.alphadiversity +
    facet_grid(cols = vars(GROUP), scales = "free_x", space = "free_x") + 
    scale_fill_brewer(palette = "Paired") +
    theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) +
    theme(legend.position="top")

Show the code

# plot alpha diversity 
p.alphadiversity <- microbiome::boxplot_alpha(frogs_rare, 
                           index = "Shannon",
                           x_var = "DAY")
p.alphadiversity +
    facet_grid(cols = vars(GROUP), scales = "free_x", space = "free_x") + 
    scale_fill_brewer(palette = "Paired") +
    theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) +
    theme(legend.position="top")

Boxplot of alpha-diversity facetting by DAY.

Show the code

# plot alpha diversity
# measures = c("Observed", "Chao1", "Shannon", "Simpson", "InvSimpson")
p <- phyloseq::plot_richness(physeq = frogs_rare, x = "DAY", color= "GROUP", title = "Alpha diversity graphics", measures = c("Observed", "Shannon"), nrow=5) + 
  geom_jitter() +
  scale_fill_brewer(palette = "Paired") +
  theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) +
  theme(legend.position="top")
p

Show the code

# plot alpha diversity 
p.alphadiversity <- microbiome::boxplot_alpha(frogs_rare, 
                           index = "Observed",
                           x_var = "GROUP")
p.alphadiversity +
    facet_grid(cols = vars(DAY), scales = "free_x", space = "free_x") + 
    scale_fill_brewer(palette = "Paired") +
    theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) +
    theme(legend.position="top")

Show the code

# plot alpha diversity 
p.alphadiversity <- microbiome::boxplot_alpha(frogs_rare, 
                           index = "Shannon",
                           x_var = "GROUP")
p.alphadiversity +
    facet_grid(cols = vars(DAY), scales = "free_x", space = "free_x") + 
    scale_fill_brewer(palette = "Paired") +
    theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) +
    theme(legend.position="top")

Important

We observed that the alpha-diversity is mainly determined by DAY rather than GROUP. For each sample, the observed index measures the richness in the sample i.e. the total number of species in the community. Richness was highest in samples from J-6 and J51, and also J28 with more variability. It decreased on J00, J18 (GROUP R), and J37 on days of infection, and increased between infection dates. The other indices do not provide any additional information, as the trajectories are consistent with infection dates.

Alpha-diversity - statistical tests

We calculated the alpha-diversity indices and combined them with covariates from experimental design.

Show the code

div_data <- cbind(estimate_richness(frogs_rare),  ## diversity indices
                  sample_data(frogs_rare)         ## covariates
                  )

Model with interaction

Show the code

#model <- aov(Observed ~ DAY*GROUP, data = div_data)
#anova(model)
#coef(model)
# produce parwise comparison test with Tukey's post-hoc correction 
#TukeyHSD(model, which ="DAY:GROUP")

alphadiv.lm <- lm(Observed ~ DAY*GROUP, data = div_data)
anova(alphadiv.lm)

Analysis of Variance Table

Response: Observed
          Df Sum Sq Mean Sq  F value    Pr(>F)    
DAY        7 188276 26896.5 101.9444 < 2.2e-16 ***
GROUP      1   1307  1306.5   4.9520  0.031764 *  
DAY:GROUP  7   6818   974.0   3.6916  0.003615 ** 
Residuals 40  10553   263.8                       
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Show the code

# interaction visualisation
par(mfrow = c(1,2))
emmip(alphadiv.lm, GROUP ~ DAY, CIs = TRUE)

Show the code

emmip(alphadiv.lm, DAY ~ GROUP, CIs = TRUE)

Show the code

par(mfrow = c(1,1))
#tests 
EMM <- emmeans(alphadiv.lm, ~ DAY*GROUP)
# tests correction are performed within the variable
# P value adjustment: tukey method for comparing  a family of 12 estimates
comp1 <- pairs(EMM, simple = "DAY")
# P value adjustment: tukey method for comparing a family of 5 estimates 
comp2 <- pairs(EMM, simple = "GROUP")

Show the code

# P value adjustment: tukey method for comparing a family of 60 estimates 
res_emm <- contrast(EMM, method = "revpairwise", by = NULL, 
    enhance.levels = c("DAY", "GROUP"), adjust = "tukey")

Important

Richness is significantly different by DAY, GROUP and also by the interaction DAY:GROUP. Pairwise tests between DAY, GROUP and also by the interaction DAY:GROUPwere performed with post hoc Tukey’s test to determine which were significant.

Beta-diversity

The dissimilarity between samples based on taxonomic composition is calculated with one of these Beta diversities: Bray-Curtis, Unweighted and weighted Unifrac, Jaccard index. The Jaccard index uses presence/absence information only whereas UniFrac integrates information from the phylogenetic tree.

Show the code

dist.jac <- phyloseq::distance(frogs_rare, method = "cc")
dist.bc <- phyloseq::distance(frogs_rare, method = "bray")
dist.uf <- phyloseq::distance(frogs_rare, method = "unifrac")
dist.wuf <- phyloseq::distance(frogs_rare, method = "wunifrac")
distance_list <- list(
  "Jaccard"     = dist.jac,
  "Bray-Curtis" = dist.bc,
  "UniFrac"     = dist.uf,
  "wUniFrac"    = dist.wuf
)

Show the code

# Mahendra's function
my_plot <- function(variable1, variable2, physeq = mydata_rare) {
  .plot <- function(distance) {
    .distance <- distance_list[[distance]] %>% as.matrix() %>% `[`(sample_names(physeq), sample_names(physeq)) %>% as.dist()
    p <- plot_ordination(physeq,
                    ordinate(physeq, method = "MDS", distance = .distance),
                    color = variable1, shape = variable2) + 
      scale_color_brewer(palette = "Paired") +
      theme(legend.position = "none") +
      ggtitle(distance)
    # print(p)
  }
  plots <- lapply(names(distance_list), .plot)
  legend <- cowplot::get_legend(plots[[1]] + theme(legend.position = "bottom"))
  pgrid <- cowplot::plot_grid(plotlist = plots)
  cowplot::plot_grid(pgrid, legend, ncol = 1, rel_heights = c(1, .1))
}

Show the code

my_plot(variable1 = "DAY", variable2 = "GROUP", physeq = frogs_rare)

Important

The Multidimensional scaling (MDS / PCoA) showed that samples were distributed according to the variable DAY on the first two axes. Samples from J-6, J28 and J51 are homogeneous, close to each other and opposite to samples from J00 on the first axis. The others are distributed in both dimensions.

Beta-diversity - Tests

Show the code

metadata <- sample_data(frogs_rare) %>% as("data.frame")
model <- vegan::adonis2(dist.bc ~ DAY*GROUP, data = metadata, permutations = 9999)
print(model)

Permutation test for adonis under reduced model
Permutation: free
Number of permutations: 9999

vegan::adonis2(formula = dist.bc ~ DAY * GROUP, data = metadata, permutations = 9999)
         Df SumOfSqs      R2     F Pr(>F)    
Model    15  12.8607 0.83786 13.78  1e-04 ***
Residual 40   2.4887 0.16214                 
Total    55  15.3495 1.00000                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Show the code

# test with dist.jac, dist.uf, dist.wuf
# model <- vegan::adonis2(dist.jac ~ DAY*GROUP, data = metadata, permutations = 9999)
# print(model)

Important

The change in microbiota composition measured by the Bray-Curtis Beta diversity is significantly different between DAY, GROUP and also between the levels of the interaction factor DAY:GROUP at 0.05 significance level. The influence of the factors differs between ecosystems.

Show the code

par(mar = c(3, 0, 2, 0), cex = 0.8)
phyloseq.extended::plot_clust(frogs_rare, dist = "bray", method = "ward.D2", color = "DAY",
           title = "Clustering of samples (Bray-Curtis + Ward)")

Important

Hierarchical clustering based on the Bray-Curtis dissimilarity and the Ward aggregation criterion shows that sample composition is not entirely structured by DAY. Groups were formed according to the date of infection and the existing DAY:GROUP interaction .

At Family and Genus taxonomic levels, we created heatmap plot using ordination methods to organize the rows.

Family
Genus

Show the code

#  method = NULL if not ordination
p <- phyloseq::plot_heatmap(tax_glom(frogs_rare, taxrank="Family"), 
                  method = "NMDS",
                  distance = "bray",
                  sample.order= "Label", # not ordination
                 # taxa.order = "Family",
                  taxa.label = "Family") + 
  facet_grid(~ DAY, scales = "free_x", space = "free_x") + 
  scale_fill_gradient2(low = "#1a9850", mid = "#ffffbf", high = "#d73027",
                       na.value = "white", trans = scales::log_trans(4),
                       midpoint = log(100, base = 4))
p

Show the code

p <- phyloseq::plot_heatmap(tax_glom(frogs_rare, taxrank="Genus"), 
                  distance = "bray", 
                  method = "NMDS",
                  sample.order= "Label", # not ordination
                 # taxa.order = "Family",
                  taxa.label="Genus") + 
  facet_grid(~ DAY, scales = "free_x", space = "free_x") + 
  scale_fill_gradient2(low = "#1a9850", mid = "#ffffbf", high = "#d73027",
                       na.value = "white", trans = scales::log_trans(4),
                       midpoint = log(100, base = 4))
p

Important

This comment provides from the complete study with all samples. As expected many bacteria disappeared at J00 due to antiobitics. Clostridoides was absent at J-6, J00, J28 and J51. It was present at J02 and J37, and partially present during the evolution of the infection. Abundance profiles appeared fairly similar similar at J-6, J00, J28 and J51. We observed differences for Bifidobacterium, 28-4, Oscillibacter, and Prevotellaceae UCG-001.

Reproducibility token

Show the code

sessioninfo::session_info(pkgs = "attached")

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.2 (2024-10-31)
 os       Ubuntu 24.04.1 LTS
 system   x86_64, linux-gnu
 ui       X11
 language (EN)
 collate  fr_FR.UTF-8
 ctype    fr_FR.UTF-8
 tz       Europe/Paris
 date     2025-02-18
 pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
 package                  * version date (UTC) lib source
 Biobase                  * 2.64.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 BiocGenerics             * 0.50.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 Biostrings               * 2.72.1  2024-06-02 [1] Bioconductor 3.19 (R 4.4.0)
 data.table               * 1.16.4  2024-12-06 [1] CRAN (R 4.4.2)
 dplyr                    * 1.1.4   2023-11-17 [1] CRAN (R 4.4.0)
 emmeans                  * 1.10.7  2025-01-31 [1] CRAN (R 4.4.2)
 forcats                  * 1.0.0   2023-01-29 [1] CRAN (R 4.4.0)
 GenomeInfoDb             * 1.40.1  2024-05-24 [1] Bioconductor 3.19 (R 4.4.0)
 GenomicRanges            * 1.56.2  2024-10-09 [1] Bioconductor 3.19 (R 4.4.1)
 ggplot2                  * 3.5.1   2024-04-23 [1] CRAN (R 4.4.0)
 ggtree                   * 3.12.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 ggtreeExtra              * 1.14.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 InraeThemes              * 3.0.0   2024-05-21 [1] Github (davidcarayon/InraeThemes@9ee5259)
 IRanges                  * 2.38.1  2024-07-03 [1] Bioconductor 3.19 (R 4.4.1)
 lubridate                * 1.9.4   2024-12-08 [1] CRAN (R 4.4.2)
 MatrixGenerics           * 1.16.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 matrixStats              * 1.5.0   2025-01-07 [1] CRAN (R 4.4.2)
 metacoder                * 0.3.7   2024-10-03 [1] Github (grunwaldlab/metacoder@54ef8ea)
 mia                      * 1.15.6  2025-01-07 [1] Github (microbiome/mia@2dc3430)
 microbiome               * 1.26.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 MultiAssayExperiment     * 1.30.3  2024-07-10 [1] Bioconductor 3.19 (R 4.4.2)
 openxlsx                 * 4.2.8   2025-01-25 [1] CRAN (R 4.4.2)
 phyloseq                 * 1.48.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 phyloseq.extended        * 0.1.4.1 2024-05-21 [1] Github (mahendra-mariadassou/phyloseq-extended@150a871)
 purrr                    * 1.0.2   2023-08-10 [1] CRAN (R 4.4.0)
 reactable                * 0.4.4   2023-03-12 [1] CRAN (R 4.4.0)
 reactablefmtr            * 2.0.0   2022-03-16 [1] CRAN (R 4.4.0)
 readr                    * 2.1.5   2024-01-10 [1] CRAN (R 4.4.0)
 S4Vectors                * 0.42.1  2024-07-03 [1] Bioconductor 3.19 (R 4.4.1)
 SingleCellExperiment     * 1.26.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.2)
 stringr                  * 1.5.1   2023-11-14 [1] CRAN (R 4.4.0)
 SummarizedExperiment     * 1.34.0  2024-05-01 [1] Bioconductor 3.19 (R 4.4.0)
 tibble                   * 3.2.1   2023-03-20 [1] CRAN (R 4.4.0)
 tidyr                    * 1.3.1   2024-01-24 [1] CRAN (R 4.4.0)
 tidyverse                * 2.0.0   2023-02-22 [1] CRAN (R 4.4.0)
 TreeSummarizedExperiment * 2.12.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.2)
 XVector                  * 0.44.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)

 [1] /home/orue/R/x86_64-pc-linux-gnu-library/4.4
 [2] /usr/local/lib/R/site-library
 [3] /usr/lib/R/site-library
 [4] /usr/lib/R/library

──────────────────────────────────────────────────────────────────────────────

References

1. McMurdie PJ, Holmes S. Phyloseq: An r package for reproducible interactive analysis and graphics of microbiome census data. PloS one. 2013;8:e61217.

2. Mariadassou M. Phyloseq-extended: Various customs functions written to enhance the base functions of phyloseq. 2018. https://github.com/mahendra-mariadassou/phyloseq-extended.

Reuse

This document will not be accessible without prior agreement of the partners

A work by Migale Bioinformatics Facility
Université Paris-Saclay, INRAE, MaIAGE, 78350, Jouy-en-Josas, France
Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, 78350, Jouy-en-Josas, France

--- params: my.interest: "withR" title: "ICD `r params$my.interest` treatment" search: false navbar: false subtitle: "Accompaniement of Anaïs Brosse in her analyses" author: - name: Olivier Rué orcid: 0000-0003-1689-0557 email: olivier.rue@inrae.fr affiliations: - name: Migale bioinformatics facility adress: Domaine de Vilvert city: Jouy-en-Josas state: France - name: Christelle Hennequet-Antier orcid: 0000-0001-5836-2803 email: christelle.hennequet-antier@inrae.fr affiliations: - name: Migale bioinformatics facility adress: Domaine de Vilvert city: Jouy-en-Josas state: France date: "2023-07-05" date-modified: today image: "images/preview.jpg" reader-mode: false fig-cap-location: bottom tbl-cap-location: top license: "This document will not be accessible without prior agreement of the partners" format: html: embed-resources: false toc: true code-fold: true code-summary: "Show the code" code-tools: true toc-location: right page-layout: article code-overflow: wrap --- ```{r setup, include=FALSE} knitr::opts_chunk$set(eval=TRUE, echo =TRUE, cache = FALSE, message = FALSE, warning = FALSE, cache.lazy = FALSE, fig.height = 3.5, fig.width = 10) ``` ```{r} #| label: load-packages #library(DT) library(tidyverse) #library(kableExtra) library(phyloseq) library(phyloseq.extended) library(data.table) library(InraeThemes) library(readr) #read_delim library(ggplot2) # graphics library(microbiome) # for boxplot_alpha function library(mia) # microbiome analysis with SummarizedExperiment and TreeSummarizedExperiment library(emmeans) #post hoc tests library(reactable) #table formatting and styling library(reactablefmtr) #table formatting and styling library(openxlsx) #save results to excel sheets library(metacoder) # differential abundance library(ggtree) # visualization library(ggtreeExtra) # visualization ```        {{< include /resources/_includes/_intro_note.qmd >}} # Aim of the project Influence of treatment antibiotics in a murine model of infection with *Clostridium difficile* (ICD). Injection of *C. difficile* were performed at `J00` and `J37`. - Différencier les groupes ctrl et traités aux antibiotiques (VAN, MTZ, FDX, R = réinfections multiples), et comparer les cinétiques entre les groupes. - Etudier les cinétiques au sein des traitements (CTRL, VAN, MTZ, FDX, R). Est-ce qu’il y a un retour à J-6 ? Clairance de la bactérie (zéro dans les fécès, attention biofilm et sporulation peuvent encore existés) - Eubiose : comparer J-6, J28, J51. ## Partners * Olivier Rué - Migale bioinformatics facility - BioInfomics - INRAE * Christelle Hennequet-Antier - Migale bioinformatics facility - BioInfomics - INRAE * Mahendra Mariadassou - Migale bioinformatics facility - BioInfomics - INRAE * Anaïs Brosse - MICALIS - INRAE # Data management {{< include /resources/_includes/_data_management.qmd >}} # Analyses ## Experiment ![Fig1: Experimental protocol](../data/Design_protocole.png){fig-align="center"} ## Phyloseq object creation {{< include /resources/_includes/_phyloseq.qmd >}} ```{r} #| label: phyloseqOject biomfile <- "../data/Galaxy17-[FROGS_Affiliation_OTU__affiliation_abundance.biom].biom1" frogs <- import_frogs(biomfile, taxMethod = "blast", treefilename = "../data/Galaxy20-[FROGS_Tree__tree.nwk].nhx") metadata_allsamples <- read_delim("../data/metadata-allsamples.csv", delim = ";", escape_double = FALSE, col_types = cols(SAMPLE_ID = col_character(), GROUPE = col_factor(levels = c("CTRL", "VAN", "MTZ", "FDX", "R")), REPLICAT = col_factor(levels = c("1", "2", "3", "4")), JOUR = col_factor(levels = c("J-6", "J0", "J2", "J5", "J7", "J14", "J18", "J28", "J31", "J37", "J39", "J51"))), trim_ws = TRUE) %>% column_to_rownames(var="SAMPLE_ID") #rename variables into english metadata_allsamples <- dplyr::rename(metadata_allsamples, DAY = JOUR) metadata_allsamples <- dplyr::rename(metadata_allsamples, GROUP = GROUPE) #put new label for the JOUR levels to be ordered #metadata_allsamples$JOUR <- factor(metadata_allsamples$JOUR, levels = c("J-6", "J0", "J2", "J5", "J7", "J14", "J18", "J28", "J31", "J37", "J39", "J51"), labels = c("J-6", "J00", "J02", "J05", "J07", "J14", "J18", "J28", "J31", "J37", "J39", "J51")) metadata_allsamples$DAY <- factor(metadata_allsamples$DAY, levels = c("J-6", "J0", "J2", "J5", "J7", "J14", "J18", "J28", "J31", "J37", "J39", "J51"), labels = c("D-6", "D00", "D02", "D05", "D07", "D14", "D18", "D28", "D31", "D37", "D39", "D51")) # EUBIOSE <- J-6 # EUBIOSE <- J28 # EUBIOSE <- J51 # Other maner to include phylogenetic tree # phy_tree(frogs) <- read_tree("../data/Galaxy20-[FROGS_Tree__tree.nwk].nhx") # Explore sample infos sample_data(frogs) <- metadata_allsamples # add new variables to facilitate visualisation and statistical tests sample_data(frogs) <- metadata_allsamples %>% mutate(Label = paste(GROUP, DAY, SOURIS, sep="_")) %>% mutate(GROUPJ = factor(paste(GROUP, DAY, sep="_"))) # change name of samples sample_names(frogs) <- sample_data(frogs) %>% as_tibble() %>% dplyr::select(Label) %>% t() ``` Here, we performed statistical analyses on a subset of samples. The study includes or not R treatment (= réinfections multiples) depending on the value `r params$my.interest`. ```{r} #| label: subset # sample_data(frogs) # info phyloseq object # "withR" with R treatment (= réinfections multiples) # "withoutR" without R treatment (= réinfections multiples) if (params$my.interest == "withR"){ frogs <- subset_samples(frogs, DAY %in% c("D-6", "D00", "D02", "D14", "D28", "D37", "D39", "D51") & GROUP %in% c("R", "CTRL")) } if (params$my.interest == "withoutR"){ frogs <- subset_samples(frogs, GROUP != "R") } microbiome::summarize_phyloseq(frogs) ``` ```{r} #| label: checkdesign # count number of replicate by # - GROUP # - combination of GROUP and DAY sample_data(frogs) %>% as_tibble() %>% dplyr::count(GROUP) sample_data(frogs) %>% as_tibble() %>% dplyr::add_count(GROUP, DAY) %>% dplyr::select(GROUP, DAY, GROUPJ, n) %>% distinct() %>% ggplot(., aes(x = GROUPJ, y = n, fill = GROUP)) + geom_bar(stat = "identity") + labs(title="Number of replicates", y = "n") + theme(axis.title.x = element_blank(), axis.text.x = element_text(angle = 90, size = 8, hjust = 1)) # frequencies within each DAY for the levels of GROUP microbiome::plot_frequencies(sample_data(frogs), "DAY", "GROUP") ``` ## Rarefaction Samples from `J-6` provided less abundances than the others. ```{r} #| label: plot_bar phyloseq::plot_bar(frogs, fill="Phylum") ``` For having the same sampling depth for all samples, we rarefy them before exploring the diversity. ```{r} #| label: frogs_rare frogs_rare <- phyloseq::rarefy_even_depth(frogs, rngseed = 123, replace = TRUE) ``` All depths are now equal to `r as.integer(min(phyloseq::sample_sums(frogs)))`. The dataset contains `r as.integer(sum(phyloseq::sample_sums(frogs_rare)))` sequences allocated to `r as.integer(phyloseq::ntaxa(frogs_rare))` taxa. Bacteroidota and Proteobateria are dominants. ```{r} #| label: plot_bar_rare phyloseq::plot_bar(frogs_rare, x="REPLICAT", fill="Phylum") + facet_grid(GROUP~DAY, scales= "free_x") ``` ## Taxonomic composition The taxonomic composition is studied after rarefaction. Let's have a look at the Phylum level composition. Fermicutes are also present up to 50% depending on the `GROUP` and the `DAY` variables. They disappear at J00, J02 J37, except J37 with `R` treatment. ::: {.panel-tabset} ### After rarefaction ```{r} #| label: compos_rare phyloseq.extended::plot_composition(physeq = frogs_rare, taxaRank1 = "Kingdom", taxaSet1 = "Bacteria", taxaRank2 = "Phylum", numberOfTaxa = 10L) + facet_grid(~GROUP, scales = "free_x", space = "free") + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, hjust = 1)) + theme(legend.position="top") ``` ### Before rarefaction ```{r} #| label: compos_rawdata phyloseq.extended::plot_composition(physeq = frogs, taxaRank1 = "Kingdom", taxaSet1 = "Bacteria", taxaRank2 = "Phylum", numberOfTaxa = 10L) + facet_grid(~GROUP, scales = "free_x", space = "free") + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, hjust = 1)) + theme(legend.position="top") ``` ::: After rarefaction, let's have a look at the Phylum and genus levels composition with two types of plot. We explore composition using all samples. ::: {.panel-tabset} ### All By GROUP ```{r} #| label: All_compos_rare_by_GROUP #| fig-width: 20 #| class: superbigimage phyloseq.extended::plot_composition(frogs_rare, NULL, NULL, "Phylum") + facet_grid(~ GROUP + DAY, scales = "free_x", space = "free_x") + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, hjust = 1)) + theme(legend.position="top") ``` ### All By grid at phylum level ```{r} #| label: All_compos_rare_by_GROUP_DAY #| fig-height: 10 phyloseq.extended::plot_composition(frogs_rare, NULL, NULL, "Phylum", x = "SOURIS") + facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, hjust = 1)) + theme(legend.position="top") ``` ### All By grid at family level ```{r} #| label: All_compos_rare_by_GROUP_DAY_family #| fig-height: 10 phyloseq.extended::plot_composition(frogs_rare, NULL, NULL, "Family", x = "SOURIS") + facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, hjust = 1)) + theme(legend.position="top") ``` ### All By grid at genus level ```{r} #| label: All_compos_rare_by_GROUP_DAY_genus #| fig-height: 10 phyloseq.extended::plot_composition(frogs_rare, NULL, NULL, "Genus", x = "SOURIS") + facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, hjust = 1)) + theme(legend.position="top") ``` ::: To explore presence of bacteria at genus levels within phylum: ::: {.panel-tabset} ### Bacteroidota ```{r} #| label: All_compos_rare_Bacteroidota #| fig-height: 10 phyloseq.extended::plot_composition(frogs_rare, "Phylum", "Bacteroidota", "Genus", fill = "Genus", x = "SOURIS") + facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, hjust = 1)) + theme(legend.position="top") ``` ### Proteobacteria ```{r} #| label: All_compos_rare_Proteobacteria #| fig-height: 10 phyloseq.extended::plot_composition(frogs_rare, "Phylum", "Proteobacteria", "Genus", fill = "Genus", x = "SOURIS") + facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, hjust = 1)) + theme(legend.position="top") ``` ### Firmicutes ```{r} #| label: All_compos_rare_Firmicutes #| fig-height: 10 phyloseq.extended::plot_composition(frogs_rare, "Phylum", "Firmicutes", "Genus", fill = "Genus", x = "SOURIS") + facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, hjust = 1)) + theme(legend.position="top") ``` ### Actinobacteriota ```{r} #| label: All_compos_rare_Actinobacteriota #| fig-height: 10 phyloseq.extended::plot_composition(frogs_rare, "Phylum", "Actinobacteriota", "Genus", fill = "Genus", x = "SOURIS") + facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, hjust = 1)) + theme(legend.position="top") ``` ### Verrucomicrobiota ```{r} #| label: All_compos_rare_Verrucomicrobiota #| fig-height: 10 phyloseq.extended::plot_composition(frogs_rare, "Phylum", "Verrucomicrobiota", "Genus", fill = "Genus", x = "SOURIS") + facet_grid(rows = vars(DAY), cols = vars(GROUP), scales = "free_x", space = "free_x") + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, hjust = 1)) + theme(legend.position="top") ``` ::: ::: {.callout-important} This comment provides from the complete study with all samples. We observed that the taxonomic composition is mainly determined by `DAY` rather than `GROUP`. Protebacteria are dominant in `J00`, `J02`, `J18` (`GROUP R`) and `J37`. Firmicutes are present at `J-6` (a little), `J05` (CTRL and VAN), `J07` (CTRL, VAN, MTZ), `J14`, `J28`, `J39`, `J51`. At the genus level, similar taxonomic patterns were found in `J00` with `J37` and in `J-6` with `J51`. ::: ## Alpha-Diversity Boxplot of alpha-diversity facetting by `GROUP`. ::: {.panel-tabset} ### All ```{r} #| label: alpha_diversity_all1 #| fig-height: 10 # plot alpha diversity # measures = c("Observed", "Chao1", "Shannon", "Simpson", "InvSimpson") p <- phyloseq::plot_richness(physeq = frogs_rare, x = "GROUP", color= "DAY", title = "Alpha diversity graphics", measures = c("Observed", "Shannon"), nrow=5) + geom_jitter() + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) + theme(legend.position="top") p ``` ### Observed ```{r} #| label: alpha_diversity_obs1 #| fig-height: 10 # plot alpha diversity p.alphadiversity <- microbiome::boxplot_alpha(frogs_rare, index = "Observed", x_var = "DAY") p.alphadiversity + facet_grid(cols = vars(GROUP), scales = "free_x", space = "free_x") + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) + theme(legend.position="top") ```               ### Shannon ```{r} #| label: alpha_diversity_shan1 #| fig-height: 10 # plot alpha diversity p.alphadiversity <- microbiome::boxplot_alpha(frogs_rare, index = "Shannon", x_var = "DAY") p.alphadiversity + facet_grid(cols = vars(GROUP), scales = "free_x", space = "free_x") + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) + theme(legend.position="top") ```               ::: Boxplot of alpha-diversity facetting by `DAY`. ::: {.panel-tabset} ### All ```{r} #| label: alpha_diversity_all2 #| fig-height: 10 # plot alpha diversity # measures = c("Observed", "Chao1", "Shannon", "Simpson", "InvSimpson") p <- phyloseq::plot_richness(physeq = frogs_rare, x = "DAY", color= "GROUP", title = "Alpha diversity graphics", measures = c("Observed", "Shannon"), nrow=5) + geom_jitter() + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) + theme(legend.position="top") p ``` ### Observed ```{r} #| label: alpha_diversity_obs2 #| fig-height: 10 # plot alpha diversity p.alphadiversity <- microbiome::boxplot_alpha(frogs_rare, index = "Observed", x_var = "GROUP") p.alphadiversity + facet_grid(cols = vars(DAY), scales = "free_x", space = "free_x") + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) + theme(legend.position="top") ```               ### Shannon ```{r} #| label: alpha_diversity_shan2 #| fig-height: 10 # plot alpha diversity p.alphadiversity <- microbiome::boxplot_alpha(frogs_rare, index = "Shannon", x_var = "GROUP") p.alphadiversity + facet_grid(cols = vars(DAY), scales = "free_x", space = "free_x") + scale_fill_brewer(palette = "Paired") + theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) + theme(legend.position="top") ```               ::: ::: {.callout-important} We observed that the alpha-diversity is mainly determined by `DAY` rather than `GROUP`. For each sample, the observed index measures the richness in the sample i.e. the total number of species in the community. Richness was highest in samples from `J-6` and `J51`, and also `J28` with more variability. It decreased on `J00`, `J18 (GROUP R)`, and `J37` on days of infection, and increased between infection dates. The other indices do not provide any additional information, as the trajectories are consistent with infection dates. ::: ## Alpha-diversity - statistical tests We calculated the alpha-diversity indices and combined them with covariates from experimental design. ```{r} #| label: alpha_div_covariates div_data <- cbind(estimate_richness(frogs_rare), ## diversity indices sample_data(frogs_rare) ## covariates ) ``` ### Model with interaction ```{r} #| label: posthoc_alpha_div #model <- aov(Observed ~ DAY*GROUP, data = div_data) #anova(model) #coef(model) # produce parwise comparison test with Tukey's post-hoc correction #TukeyHSD(model, which ="DAY:GROUP") alphadiv.lm <- lm(Observed ~ DAY*GROUP, data = div_data) anova(alphadiv.lm) # interaction visualisation par(mfrow = c(1,2)) emmip(alphadiv.lm, GROUP ~ DAY, CIs = TRUE) emmip(alphadiv.lm, DAY ~ GROUP, CIs = TRUE) par(mfrow = c(1,1)) #tests EMM <- emmeans(alphadiv.lm, ~ DAY*GROUP) # tests correction are performed within the variable # P value adjustment: tukey method for comparing a family of 12 estimates comp1 <- pairs(EMM, simple = "DAY") # P value adjustment: tukey method for comparing a family of 5 estimates comp2 <- pairs(EMM, simple = "GROUP") ``` ```{r} #| label: dataframe_allcontrast # P value adjustment: tukey method for comparing a family of 60 estimates res_emm <- contrast(EMM, method = "revpairwise", by = NULL, enhance.levels = c("DAY", "GROUP"), adjust = "tukey") ``` ```{r} #| label: save_allcontrast #| echo: false # using openxlsx package to export results excelsheets <- list(All = res_emm, compbyGROUP = comp1, compbyDAY = comp2) # write.xlsx(excelsheets, # file = paste0("./html/", "comparisons_alpha_diversity", ".xlsx")) write.xlsx(excelsheets, file = paste0("./comparisons_alpha_diversity_", params$my.interest, ".xlsx")) ``` ::: {.callout-important} Richness is significantly different by `DAY`, `GROUP` and also by the interaction `DAY:GROUP`. Pairwise tests between `DAY`, `GROUP` and also by the interaction `DAY:GROUP`were performed with post hoc Tukey's test to determine which were significant. ::: ```{r} #| label: reactable_comp1 #| eval: true #| echo: false # https://glin.github.io/reactable/index.html # https://glin.github.io/reactable/articles/custom-filtering.html#table-search-methods # TO DO # - filter min p.value # - search contrast # color value pvalue on significance # remplir en gris lorsque p.value tukey est non signif # save # table from results of contrast # visualize estimate and p.value comp_table <- read.xlsx(xlsxFile = paste0("./comparisons_alpha_diversity_", params$my.interest, ".xlsx"), sheet = 2) comp_table <- comp_table %>% na.omit() %>% select(GROUP, contrast, estimate, p.value) %>% mutate(estimate = round(estimate, digits=2)) reactable(comp_table, theme = nytimes(centered = TRUE), compact = TRUE, defaultSortOrder = 'asc', defaultSorted = c('GROUP', 'contrast', 'p.value'), columns = list( GROUP = colDef( name = 'By Family', minWidth = 70, # searchable = TRUE, # searchMethod = ), contrast = colDef( name = 'comparison', minWidth = 70, # searchable = TRUE, # searchMethod = ), estimate = colDef( name = 'estimate', minWidth = 150, align = 'center', cell = data_bars( data = comp_table, text_position = 'outside-end', fill_color = c('#C40233','#127852'), number_fmt = scales::label_number(accuracy = 0.01) ) ), p.value = colDef( name = "Tukey's p.value", minWidth = 70, align = 'center', filterable = TRUE ) ) ) ``` ```{r} #| label: reactable_comp2 #| eval: true #| echo: false # https://glin.github.io/reactable/index.html # https://glin.github.io/reactable/articles/custom-filtering.html#table-search-methods # TO DO # - filter min p.value # - search contrast # color value pvalue on significance # remplir en gris lorsque p.value tukey est non signif # save # table from results of contrast # visualize estimate and p.value comp_table <- read.xlsx(xlsxFile = paste0("./comparisons_alpha_diversity_", params$my.interest, ".xlsx"), sheet = 3) comp_table <- comp_table %>% na.omit() %>% select(DAY, contrast, estimate, p.value) %>% mutate(estimate = round(estimate, digits=2)) reactable(comp_table, theme = nytimes(centered = TRUE), compact = TRUE, defaultSortOrder = 'asc', defaultSorted = c('DAY', 'contrast', 'p.value'), columns = list( DAY = colDef( name = 'By Family', minWidth = 70, # searchable = TRUE, # searchMethod = ), contrast = colDef( name = 'comparison', minWidth = 70, # searchable = TRUE, # searchMethod = ), estimate = colDef( name = 'estimate', minWidth = 150, align = 'center', cell = data_bars( data = comp_table, text_position = 'outside-end', fill_color = c('#C40233','#127852'), number_fmt = scales::label_number(accuracy = 0.01) ) ), p.value = colDef( name = "Tukey's p.value", minWidth = 70, align = 'center', filterable = TRUE ) ) ) ``` ```{r} #| label: reactable_allcontrast #| eval: true #| echo: false # https://glin.github.io/reactable/index.html # https://glin.github.io/reactable/articles/custom-filtering.html#table-search-methods # TO DO # - filter min p.value # - search contrast # color value pvalue on significance # remplir en gris lorsque p.value tukey est non signif # save # table from results of contrast # visualize estimate and p.value comp_table <- read.xlsx(xlsxFile = paste0("./comparisons_alpha_diversity_", params$my.interest, ".xlsx"), sheet = 1) comp_table <- comp_table %>% na.omit() %>% select(contrast, estimate, p.value) %>% mutate(estimate = round(estimate, digits=2)) # comp_table <- comp_table %>% # na.omit() %>% # select(contrast, estimate, p.value) %>% # mutate(estimate = round(estimate, digits=2)) %>% # mutate(log10p.value = -log10(p.value)) %>% # mutate(signif = case_when( # p.value < 0.05 ~ '#C40233', # p.value >= 0.05 ~ 'gray' # ) # ) reactable(comp_table, theme = nytimes(centered = TRUE), compact = TRUE, defaultSortOrder = 'asc', defaultSorted = 'p.value', columns = list( contrast = colDef( name = 'comparison', minWidth = 150, # searchable = TRUE, # searchMethod = ), estimate = colDef( name = 'estimate', minWidth = 150, align = 'center', cell = data_bars( data = comp_table, text_position = 'outside-end', fill_color = c('#C40233','#127852'), number_fmt = scales::label_number(accuracy = 0.01) ) ), p.value = colDef( name = "Tukey's p.value", minWidth = 70, align = 'center', filterable = TRUE ) ) ) ``` ## Beta-diversity The dissimilarity between samples based on taxonomic composition is calculated with one of these Beta diversities: Bray-Curtis, Unweighted and weighted Unifrac, Jaccard index. The Jaccard index uses presence/absence information only whereas UniFrac integrates information from the phylogenetic tree.    ```{r} #| label: distance_betadiv dist.jac <- phyloseq::distance(frogs_rare, method = "cc") dist.bc <- phyloseq::distance(frogs_rare, method = "bray") dist.uf <- phyloseq::distance(frogs_rare, method = "unifrac") dist.wuf <- phyloseq::distance(frogs_rare, method = "wunifrac") distance_list <- list( "Jaccard" = dist.jac, "Bray-Curtis" = dist.bc, "UniFrac" = dist.uf, "wUniFrac" = dist.wuf ) ``` ```{r} #| label: function_cowplot_MDS #| echo: true # Mahendra's function my_plot <- function(variable1, variable2, physeq = mydata_rare) { .plot <- function(distance) { .distance <- distance_list[[distance]] %>% as.matrix() %>% `[`(sample_names(physeq), sample_names(physeq)) %>% as.dist() p <- plot_ordination(physeq, ordinate(physeq, method = "MDS", distance = .distance), color = variable1, shape = variable2) + scale_color_brewer(palette = "Paired") + theme(legend.position = "none") + ggtitle(distance) # print(p) } plots <- lapply(names(distance_list), .plot) legend <- cowplot::get_legend(plots[[1]] + theme(legend.position = "bottom")) pgrid <- cowplot::plot_grid(plotlist = plots) cowplot::plot_grid(pgrid, legend, ncol = 1, rel_heights = c(1, .1)) } ``` ```{r} #| label: MDS_plot #| fig-height: 6 my_plot(variable1 = "DAY", variable2 = "GROUP", physeq = frogs_rare) ``` ::: {.callout-important} The Multidimensional scaling (MDS / PCoA) showed that samples were distributed according to the variable `DAY` on the first two axes. Samples from `J-6`, `J28` and `J51` are homogeneous, close to each other and opposite to samples from `J00` on the first axis. The others are distributed in both dimensions. ::: ## Beta-diversity - Tests ```{r} #| label: adonis #| warning: false metadata <- sample_data(frogs_rare) %>% as("data.frame") model <- vegan::adonis2(dist.bc ~ DAY*GROUP, data = metadata, permutations = 9999) print(model) # test with dist.jac, dist.uf, dist.wuf # model <- vegan::adonis2(dist.jac ~ DAY*GROUP, data = metadata, permutations = 9999) # print(model) ``` ::: {.callout-important} The change in microbiota composition measured by the `Bray-Curtis` Beta diversity is significantly different between `DAY`, `GROUP` and also between the levels of the interaction factor `DAY:GROUP` at 0.05 significance level. The influence of the factors differs between ecosystems.  ::: ```{r} #| label: clustering_samples par(mar = c(3, 0, 2, 0), cex = 0.8) phyloseq.extended::plot_clust(frogs_rare, dist = "bray", method = "ward.D2", color = "DAY", title = "Clustering of samples (Bray-Curtis + Ward)") ``` ::: {.callout-important} Hierarchical clustering based on the `Bray-Curtis` dissimilarity and the `Ward` aggregation criterion shows that sample composition is not entirely structured by `DAY`. Groups were formed according to the date of infection and the existing `DAY:GROUP` interaction . :::           At Family and Genus taxonomic levels, we created heatmap plot using ordination methods to organize the rows. ::: {.panel-tabset} ### Family ```{r} #| label: heatmap_ordination_Family #| fig-height: 10 # method = NULL if not ordination p <- phyloseq::plot_heatmap(tax_glom(frogs_rare, taxrank="Family"), method = "NMDS", distance = "bray", sample.order= "Label", # not ordination # taxa.order = "Family", taxa.label = "Family") + facet_grid(~ DAY, scales = "free_x", space = "free_x") + scale_fill_gradient2(low = "#1a9850", mid = "#ffffbf", high = "#d73027", na.value = "white", trans = scales::log_trans(4), midpoint = log(100, base = 4)) p ``` ### Genus ```{r} #| label: heatmap_ordination_Genus #| fig-height: 10 p <- phyloseq::plot_heatmap(tax_glom(frogs_rare, taxrank="Genus"), distance = "bray", method = "NMDS", sample.order= "Label", # not ordination # taxa.order = "Family", taxa.label="Genus") + facet_grid(~ DAY, scales = "free_x", space = "free_x") + scale_fill_gradient2(low = "#1a9850", mid = "#ffffbf", high = "#d73027", na.value = "white", trans = scales::log_trans(4), midpoint = log(100, base = 4)) p ``` ::: ::: {.callout-important} This comment provides from the complete study with all samples. As expected many bacteria disappeared at `J00` due to antiobitics. *Clostridoides* was absent at `J-6`, `J00`, `J28` and `J51`. It was present at `J02` and `J37`, and partially present during the evolution of the infection. Abundance profiles appeared fairly similar similar at `J-6`, `J00`, `J28` and `J51`. We observed differences for *Bifidobacterium*, *28-4*, *Oscillibacter*, and *Prevotellaceae UCG-001*. ::: ```{r} #| label: heatmap_brouillon #| echo: false #| eval: false # voir une représentation uniquement sur les ASV diff # pensez à la visualisation ggtree # Phylum=="Bacteroidota", Phylum=="Firmicutes" p <- plot_heatmap(subset_taxa(frogs_rare, Phylum=="Firmicutes"), distance = "bray", method = "NMDS", sample.order= "Label", # not ordination taxa.label="Genus" ) + facet_grid(~ DAY, scales = "free_x", space = "free_x") + scale_fill_gradient2(low = "#1a9850", mid = "#ffffbf", high = "#d73027", na.value = "white", trans = scales::log_trans(4), midpoint = log(100, base = 4)) p ``` # Reproducibility token ```{r} sessioninfo::session_info(pkgs = "attached") ``` # References