Application of stable isotope probing to identify temperature responsive bacterial taxa at the level of 3-hydroxy fatty acid synthesis: Time T0, T14, T28

in progress
collaboration
Authors
Affiliation

Olivier Rué

Migale bioinformatics facility

Christelle Hennequet-Antier

Migale bioinformatics facility

Published

November 22, 2023

Modified

February 19, 2025

Note

This document is a report of the analyses performed. You will find all the code used to analyze these data. The version of the tools (maybe in code chunks) and their references are indicated, for questions of reproducibility.

Aim of the project

The aim of the project is to provide a better understanding of how soil bacteria respond to temperature variations at the membrane level. Bacterial lipids can be used in paleoclimatology to reconstruct temperature of past climates in marine and terrestrial archives. This involves the use of proxies and calculations that are based on the relative amounts of these lipids. In soil, only one organic proxy is available, mainly due to the difficulty to work on soil, which is much more heterogeneous in structure than aquatic environments. A drawback of this sungle proxy is the error which is associated with it as well as the lack of confirmation of the result by a different approach. A new proxy based on 3-hydroxy fatty acids has been proposed in 2016. To better understand and validate this proxy, as well as to investigate new more precise proxies, we aim at understanding what soil bacterial species contribute to the proposed proxy by investigating their response te temperature at the taxonomic and lipid levels. We have used a DNA- and lipid-SIP approach during a time-course experiment and are now at the stage of analyzing what species have used the 13C labelled substrate in a temperature dependent manner. The profile of the enriched taxa in the 13C fraction will be compared in the different samples (different temperature, different incubation time) and will be compared with the 13C-enriched lipid profiles in the same samples.

Partners

  • Olivier Rué - Migale bioinformatics facility - BioInfomics - INRAE
  • Christelle Hennequet-Antier - Migale bioinformatics facility - BioInfomics - INRAE
  • Sylvie Collin - Sorbonne Université

Deliverables

Deliverables agreed at the preliminary meeting (Table 1).

Table 1: Deliverables
  Definition
1 HTML report

Data management

Important

All data is managed by the migale facility for the duration of the project. Once the project is over, the Migale facility does not keep your data. We will provide you with the raw data and associated metadata that will be deposited on public repositories before the results are used. We can guide you in the submission process. We will then decide which files to keep, knowing that this report will also be provided to you and that the analyses can be replayed if needed.

Biostatistics

The phyloseq package [1] is a tool to import, store, analyze, and graphically display complex metabarcoding data, especially when there is associated sample data, phylogenetic tree, and/or taxonomic assignment of the OTUs or ASVs. Various customs functions written to enhance the base functions of phyloseq are available in the phyloseq-extended package [2].

The available metadata are:

Show the code
metadata <- read_tsv("../data/metadata.tsv", 
                     col_types = cols(SampleName = col_character(), 
                                      Time = col_factor(levels = c("T0", "T3", "T7", "T14", "T28"), ordered = TRUE), 
                                      Temperature = col_factor(levels = c("5", "15", "25"), ordered = TRUE),
                                      Replicate = col_factor(levels = c("1", "2", "3"), ordered = TRUE)
                     )
) %>%
column_to_rownames(var="SampleName")
 
metadata <- metadata %>% 
  mutate(
    Time_Gluc = case_when(
      Time == "T3" ~ paste(Time, Glucose, sep="_"),
      Time == "T7" ~ paste(Time, Glucose, sep="_"),
      .default = Time
    )) %>%
  mutate(Time_Gluc = factor(Time_Gluc,
                            levels = c("T0", "T3_12C", "T3_13C", "T7_12C", "T7_13C", "T14", "T28"))) %>%
  mutate(Group = paste(Time_Gluc, Temperature, sep="_")) %>%
  mutate(Group = factor(Group, 
    levels = c("T0_5", "T0_15", "T0_25", "T3_12C_5", "T3_13C_5", "T3_12C_15", "T3_13C_15", "T3_12C_25", "T3_13C_25", "T7_12C_5", "T7_13C_5", "T7_12C_15", "T7_13C_15", "T7_12C_25", "T7_13C_25", "T14_5", "T14_15", "T14_25", "T28_5", "T28_15", "T28_25"))
  ) %>% 
  mutate(
    SampleLabel = case_when(
      is.na(FractionOnCsClGradient) == TRUE ~ paste(Group, Replicate, sep="_"),
    .default = paste(Group, FractionOnCsClGradient, Replicate, sep="_"))
)

#spec(metadata)
# metadata %>% 
#   datatable()
Show the code
biomfile <- paste0("../html/","affiliation.biom")
physeq <- import_frogs(biomfile, taxMethod = "blast")
sample_data(physeq) <- metadata
phy_tree(physeq) <- phyloseq::read_tree("../html/tree.nwk")
physeq <- physeq %>% microViz::ps_arrange(Time_Gluc, Temperature, FractionOnCsClGradient)
# visualisation and save
metadata %>% 
  datatable()
Show the code
microbiome::summarize_phyloseq(physeq)
[[1]]
[1] "1] Min. number of reads = 10643"

[[2]]
[1] "2] Max. number of reads = 48037"

[[3]]
[1] "3] Total number of reads = 4010689"

[[4]]
[1] "4] Average number of reads = 29708.8074074074"

[[5]]
[1] "5] Median number of reads = 31149"

[[6]]
[1] "7] Sparsity = 0.218730882381593"

[[7]]
[1] "6] Any OTU sum to 1 or less? NO"

[[8]]
[1] "8] Number of singletons = 0"

[[9]]
[1] "9] Percent of OTUs that are singletons \n        (i.e. exactly one read detected across all samples)0"

[[10]]
[1] "10] Number of sample variables are: 9"

[[11]]
[1] "Time"                   "Temperature"            "SampleID"              
[4] "Replicate"              "FractionOnCsClGradient" "Glucose"               
[7] "Time_Gluc"              "Group"                  "SampleLabel"           
Show the code
saveRDS(object = physeq, file="physeq.rds")
Show the code
# count number of samples per condition
sample_data(physeq) %>% 
  as_tibble() %>% 
  dplyr::count(Time) 
# A tibble: 5 × 2
  Time      n
  <ord> <int>
1 T0        9
2 T3       54
3 T7       54
4 T14       9
5 T28       9
Show the code
sample_data(physeq) %>% 
  as_tibble() %>% 
  dplyr::add_count(Time, Temperature) %>%
  dplyr::select(Time, Temperature, Group, n) %>%
  distinct() %>%
  ggplot(., aes(x = Group, y = n, fill = Time)) + 
  geom_bar(stat = "identity") +
  labs(title="Number of replicates", y = "n") +
  theme(axis.title.x = element_blank(),
        axis.text.x = element_text(angle = 90, size = 8, hjust = 1))

Show the code
# frequencies
#microbiome::plot_frequencies(sample_data(physeq), "Time", "Group")
Note

For analysis purposes, the variable Time_Gluc is created as the combination of the Time and Glucose variables.

Select samples from Time T0, T14 and T28

Show the code
physeq <- physeq %>% microViz::ps_filter(Time %in% c("T0", "T14", "T28")) 

Rarefaction

The common practice for comparing microbiome samples of different library sizes is to reduce the size of the larger samples using rarefaction until they contain the same number of reads as the smallest sample. The rarefaction curves show the species richness (i.e. number of species) against the library size and the vertical grey line indicates the smallest sample’s size. Each curve indicates when the sampling depth is sufficient to achieve an acceptable number of species discoveries and when increasing the depth leads to the discovery of rarer species. Rarefied data are used to estimate and compare alpha and beta diversity across samples.

Show the code
phyloseq::plot_bar(physeq, x = "SampleLabel", fill = "Phylum") + 
  facet_grid(~ Time_Gluc + Temperature, scales = "free_x", space = "free") +
 # facet_grid(~Time + Temperature + Glucose, scales = "free_x", space = "free") +
  theme(axis.text.x = element_text(size = 6, hjust = 1)) +
  theme(legend.position="top")

Show the code
# sample_sums(physeq) %>% sort() %>% head(n=5)
# remove samples with less than nbreads_threshold 
nbreads_threshold  <- 11000
sample_sums(physeq)[!sample_sums(physeq) > nbreads_threshold]
MOU32-2-5-T14X10 
           10881 
Show the code
physeq <- microViz::ps_filter(physeq, sample_sums(physeq) > nbreads_threshold)
Show the code
ggrare(physeq = physeq, step = 500, se = FALSE, color = "Group", plot = FALSE, label = "SampleLabel", verbose = FALSE) + 
  # scale_colour_manual(values = group_pal) +
  geom_vline(xintercept = c(min(sample_sums(physeq))), color = "gray60") + 
  ggtitle("Rarefaction curves")

Show the code
physeq_rare <- phyloseq::rarefy_even_depth(physeq, rngseed = 2002, replace = TRUE)
Show the code
phyloseq::plot_bar(physeq_rare, fill="Phylum") + 
  facet_grid(~ Time_Gluc + Temperature, scales = "free_x", space = "free") +
  theme(axis.text.x = element_text(size = 6, hjust = 1)) +
  theme(legend.position="top")

Results

One sample was removed due to a low number of reads, less than 11000.

After rarefaction, all depths are equal to 12251. The dataset contains 318526 sequences allocated to 1877 taxa and 26 samples.

Taxonomic composition

Show the code
phyloseq.extended::plot_composition(physeq = physeq_rare, taxaRank1 = "Kingdom", taxaSet1 = "Bacteria", taxaRank2 = "Phylum", numberOfTaxa = 10L) +
  facet_grid(~ Time_Gluc + Temperature, scales = "free_x", space = "free") +
  theme(axis.text.x = element_text(size = 6, hjust = 1)) +
  theme(legend.position="top")

Show the code
phyloseq.extended::plot_composition(physeq = physeq_rare, taxaRank1 = "Kingdom", taxaSet1 = "Bacteria", taxaRank2 = "Family", numberOfTaxa = 10L) +
  facet_grid(~ Time_Gluc + Temperature, scales = "free_x", space = "free") +
  theme(axis.text.x = element_text(size = 6, hjust = 1)) +
  theme(legend.position="top")

Results

Actinobacteriota and Proteobacteria are dominants in major samples. Firmicutes appear mainly at Time > T14 and Temperature > 15. Bacteroidota are few abundants in absence of Glucose and are dominated by Flavobacterium genus.

Focus on the genus rank across these major phyla.

Show the code
phyloseq.extended::plot_composition(physeq = physeq_rare, taxaRank1 = "Phylum", taxaSet1 = "Actinobacteriota", taxaRank2 = "Genus") +
  facet_grid( ~ Time_Gluc + Temperature, scales = "free_x", space = "free_x") + 
  labs(title = "Relative composition", subtitle = "Actinobacteriota")
Problematic taxa
                   taxa  Kingdom           Phylum           Class
Cluster_46   Cluster_46 Bacteria Actinobacteriota Thermoleophilia
Cluster_44   Cluster_44 Bacteria Actinobacteriota Thermoleophilia
Cluster_26   Cluster_26 Bacteria Actinobacteriota       MB-A2-108
Cluster_142 Cluster_142 Bacteria Actinobacteriota  Acidimicrobiia
Cluster_264 Cluster_264 Bacteria Actinobacteriota  Acidimicrobiia
                          Order             Family   Genus rank
Cluster_46  Solirubrobacterales              67-14 Unknown    3
Cluster_44           Gaiellales            Unknown Unknown    4
Cluster_26              Unknown            Unknown Unknown    5
Cluster_142           IMCC26256            Unknown Unknown    8
Cluster_264      Microtrichales Ilumatobacteraceae Unknown    9

Show the code
phyloseq.extended::plot_composition(physeq = physeq_rare, taxaRank1 = "Phylum", taxaSet1 = "Proteobacteria", taxaRank2 = "Genus") +
  facet_grid( ~ Time_Gluc + Temperature, scales = "free_x", space = "free_x") + 
  labs(title = "Relative composition", subtitle = "Proteobacteria")
Problematic taxa
                   taxa  Kingdom         Phylum               Class
Cluster_6     Cluster_6 Bacteria Proteobacteria Alphaproteobacteria
Cluster_12   Cluster_12 Bacteria Proteobacteria Alphaproteobacteria
Cluster_16   Cluster_16 Bacteria Proteobacteria Alphaproteobacteria
Cluster_27   Cluster_27 Bacteria Proteobacteria Alphaproteobacteria
Cluster_143 Cluster_143 Bacteria Proteobacteria Gammaproteobacteria
                      Order             Family             Genus rank
Cluster_6       Rhizobiales  Xanthobacteraceae           Unknown    1
Cluster_12      Rhizobiales Methyloligellaceae           Unknown    2
Cluster_16      Rhizobiales  Xanthobacteraceae Multi-affiliation    3
Cluster_27       Elsterales            Unknown           Unknown    7
Cluster_143 Burkholderiales            SC-I-84           Unknown    8

Show the code
phyloseq.extended::plot_composition(physeq = physeq_rare, taxaRank1 = "Phylum", taxaSet1 = "Firmicutes", taxaRank2 = "Genus") +
  facet_grid( ~ Time_Gluc + Temperature, scales = "free_x", space = "free_x") + 
  labs(title = "Relative composition", subtitle = "Firmicutes")

Show the code
phyloseq.extended::plot_composition(physeq = physeq_rare, taxaRank1 = "Phylum", taxaSet1 = "Bacteroidota", taxaRank2 = "Genus") +
  facet_grid( ~ Time_Gluc + Temperature, scales = "free_x", space = "free_x") + 
  labs(title = "Relative composition", subtitle = "Bacteroidota")
Problematic taxa
                   taxa  Kingdom       Phylum       Class           Order
Cluster_209 Cluster_209 Bacteria Bacteroidota Bacteroidia Chitinophagales
Cluster_388 Cluster_388 Bacteria Bacteroidota Bacteroidia    Cytophagales
                      Family   Genus rank
Cluster_209 Chitinophagaceae Unknown    2
Cluster_388  Microscillaceae Unknown    3

Alpha-diversity

Alpha-diversity

Alpha-diversity represents diversity within a sample (i.e. ecosystem). The most commonly used measures of diversity are : Observed, Chao1, Shannon, and Simpson. Richness (i.e. Observed) is simply the number of species present in the sample and Shannon is another measure based on species richness. Chao1 is an estimator of the number of species applied to the presence/absence data for each species. The Simpson index gives more importance to abundant species.

Show the code
# plot alpha diversity
phyloseq::plot_richness(physeq = physeq_rare, x = "Group", color= "Group", title = "Alpha diversity graphics", measures = c("Observed", "Chao1", "Shannon", "Simpson", "InvSimpson"), nrow=5) + 
  geom_jitter() +
  scale_fill_brewer(palette = "Paired") +
  theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) +
  theme(legend.position="top")

Show the code
# plot alpha diversity 
p.alphadiversity <- microbiome::boxplot_alpha(physeq_rare, 
                           index = c("Observed"),
                           x_var = "Group",
                           )
p.alphadiversity +
    facet_grid(cols = vars(Time_Gluc, Temperature), scales = "free_x", space = "free_x") + 
    theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) +
    theme(legend.position="top")

Show the code
# plot alpha diversity 
p.alphadiversity <- microbiome::boxplot_alpha(physeq_rare, 
                           index = c("Chao1"),
                           x_var = "Group",
                           )
p.alphadiversity +
    facet_grid(cols = vars(Time_Gluc, Temperature), scales = "free_x", space = "free_x") + 
    theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) +
    theme(legend.position="top")

Show the code
# plot alpha diversity 
p.alphadiversity <- microbiome::boxplot_alpha(physeq_rare, 
                           index = c("Shannon"),
                           x_var = "Group",
                           )
p.alphadiversity +
    facet_grid(cols = vars(Time_Gluc, Temperature), scales = "free_x", space = "free_x") + 
    theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) +
    theme(legend.position="top")

Show the code
# plot alpha diversity 
p.alphadiversity <- microbiome::boxplot_alpha(physeq_rare, 
                           index = c("Simpson"),
                           x_var = "Group",
                           )
p.alphadiversity +
    facet_grid(cols = vars(Time_Gluc, Temperature), scales = "free_x", space = "free_x") + 
    theme(axis.text.x = element_text(size = 6, angle = 45, hjust = 1)) +
    theme(legend.position="top")

Results

The alpha diversities are quite similar between samples at any time except T14. The observed diversity fall from Temperature 15 and 25 at T14.

Alpha-diversity - statistical tests

We calculated the alpha-diversity indices and combined them with covariates from experimental design. We tested the influence Time and Temperature on Observed alpha-diversity.

Show the code
div_data <- cbind(estimate_richness(physeq_rare),  ## diversity indices
                  sample_data(physeq_rare)         ## covariates
                  )

Model with interaction

We tested the influence Time and Temperature on Observed alpha-diversity.

Show the code
alphadiv.lm <- lm(Observed ~ Time + Temperature, data = div_data)
anova(alphadiv.lm)
Analysis of Variance Table

Response: Observed
            Df Sum Sq Mean Sq F value Pr(>F)
Time         2   6027  3013.5  1.0694 0.3612
Temperature  2   5270  2634.8  0.9350 0.4083
Residuals   21  59177  2818.0               
Show the code
# interaction visualisation
emmip(alphadiv.lm, Time ~ Temperature, CIs = TRUE)

Results

Richness (Observed diversity) is not significantly different by Time and Temperature.

Beta-diversity

The dissimilarity between samples based on taxonomic composition is calculated with one of these Beta diversities: Bray-Curtis, Unweighted and weighted Unifrac, Jaccard index. The Jaccard index uses presence/absence information only whereas UniFrac integrates information from the phylogenetic tree.

Show the code
dist.jac <- phyloseq::distance(physeq_rare, method = "cc")
dist.bc <- phyloseq::distance(physeq_rare, method = "bray")
dist.uf <- phyloseq::distance(physeq_rare, method = "unifrac")
dist.wuf <- phyloseq::distance(physeq_rare, method = "wunifrac")
distance_list <- list(
  "Jaccard"     = dist.jac,
  "Bray-Curtis" = dist.bc,
  "UniFrac"     = dist.uf,
  "wUniFrac"    = dist.wuf
)
Show the code
# Mahendra's function
my_plot <- function(variable1, variable2, physeq = mydata_rare) {
  .plot <- function(distance) {
    .distance <- distance_list[[distance]] %>% as.matrix() %>% `[`(sample_names(physeq), sample_names(physeq)) %>% as.dist()
    p <- plot_ordination(physeq,
                    ordinate(physeq, method = "MDS", distance = .distance),
                    color = variable1, shape = variable2) + 
      scale_color_brewer(palette = "Paired") +
      theme(legend.position = "none") +
      ggtitle(distance)
    # print(p)
  }
  plots <- lapply(names(distance_list), .plot)
  legend <- cowplot::get_legend(plots[[1]] + theme(legend.position = "bottom"))
  pgrid <- cowplot::plot_grid(plotlist = plots)
  cowplot::plot_grid(pgrid, legend, ncol = 1, rel_heights = c(1, .1))
}

# Mahendra's function
# used own color's palette and shape
my_plot2 <- function(variable1, variable2, physeq = mydata_rare, shapeval, colorpal) {
  .plot <- function(distance) {
    .distance <- distance_list[[distance]] %>% as.matrix() %>% `[`(sample_names(physeq), sample_names(physeq)) %>% as.dist()
    p <- plot_ordination(physeq,
                    ordinate(physeq, method = "MDS", distance = .distance),
                    color = variable1, shape = variable2) + 
      scale_shape_manual(values = shapeval) +
      scale_color_manual(name = colorpal$name,
                         labels = colorpal$labels,
                         values = colorpal$values
                     ) +
      theme(legend.position = "none") +
      ggtitle(distance)
    # print(p)
  }
  plots <- lapply(names(distance_list), .plot)
  legend <- cowplot::get_legend(plots[[1]] + theme(legend.position = "bottom"))
  pgrid <- cowplot::plot_grid(plotlist = plots)
  cowplot::plot_grid(pgrid, legend, ncol = 1, rel_heights = c(1, .1))
}

# Mahendra's function
# used own color's palette and shape
my_pcoa_with_arrows2 <- function(physeq = mydata_rare, variable1, variable2, shapeval, colorpal) {
  # select samples of interest where all samples were used to calculate the distances between samples 
  # dist.jac is previously calculated with all samples of the study
  #dist.c <- phyloseq::distance(physeq, "jaccard")
  dist.c <- dist.bc %>% as.matrix() %>% `[`(sample_names(physeq), sample_names(physeq)) %>% as.dist()
  ord.c <- ordinate(physeq, "MDS", dist.bc)
  p <- plot_ordination(physeq, ord.c)
  # ## Build arrow data
  plot_data <- p$data 
  arrow_data <- plot_data %>% 
    group_by(pick({{variable2}}, {{variable1}})) %>%
    arrange({{variable1}}) %>%
    mutate(xend = Axis.1, xstart = lag(xend),
           yend = Axis.2, ystart = lag(yend)
    )
  ## Plot with arrows
  p + aes(shape = .data[[ variable2  ]], color = .data[[ variable1  ]]) +
      scale_shape_manual(values = shapeval) +
      scale_color_manual(name = colorpal$name,
                         labels = colorpal$labels,
                         values = colorpal$values) +
    theme_bw() +
    geom_segment(data = arrow_data,
                 aes(x = xstart, y = ystart, xend = xend, yend = yend),
                 size = 0.8, linetype = "3131",
            #     arrow = arrow(length=unit(0.1,"inches")),
                 show.legend = FALSE) +
    geom_point(size = 3) +
    theme(legend.position = "bottom")
}
Show the code
# MDS plot for samples of interests 
# from distance calculated with all samples of the complete study
# used https://colorbrewer2.org/#type=qualitative&scheme=Paired&n=12

# Time_pal <- list(
#   name = "Time",
#   labels = c("T0",  "T3",  "T7",  "T14", "T28"),
#   values = c("#1f78b4", "#b2df8a", "#984ea3", "#ff7f00", "#a65628")
# )

Time_pal <- list(
  name = "Time",
  labels = c("T0","T14", "T28"),
  values = c("#1f78b4", "#ff7f00", "#a65628")
)

Time_Gluc_pal <- list(
  name = "Time_Gluc",
  labels = c("T0", "T3_12C", "T3_13C", "T7_12C", "T7_13C", "T14", "T28"),
  values = c("#9ecae1", "#b2df8a", "#33a02c", "#cab2d6", "#984ea3", "#4292c6", "#08519c")
)
Temperature_shape <- list(values = c(16,17,15))

my_plot2(variable1 = "Time",
        variable2 = "Temperature",
        physeq = physeq_rare,
        shapeval = Temperature_shape$values,
        colorpal = Time_pal
        )

Show the code
# MDS plot for samples of interests 
# from distance calculated with all samples of the complete study
# used https://colorbrewer2.org/#type=qualitative&scheme=Paired&n=12

p <- my_pcoa_with_arrows2(variable1 = "Time",
        variable2 = "Temperature",
        physeq = physeq_rare,
        shapeval = Temperature_shape$values,
        colorpal = Time_pal)
p

MDS plot

The Multidimensional scaling (MDS / PCoA) based on Bray-Curtis distance showed that samples were distributed according to Time and Temperature variables on the first two axes. The replicates samples T0, T14 and T28 are homogeneous, close to each other.

Beta-diversity - Tests

Show the code
metadata <- sample_data(physeq_rare) %>% as("data.frame")
# assess significance for each term sequentially from first to last
# importance of the order of the introduced factors in the model
model <- vegan::adonis2(dist.bc ~ Time*Temperature, data = metadata, permutations = 9999, by = "terms")
print(model)
Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 9999

vegan::adonis2(formula = dist.bc ~ Time * Temperature, data = metadata, permutations = 9999, by = "terms")
                 Df SumOfSqs      R2      F Pr(>F)    
Time              2  0.35165 0.24857 5.5264 0.0001 ***
Temperature       2  0.24348 0.17211 3.8264 0.0001 ***
Time:Temperature  4  0.27870 0.19701 2.1900 0.0028 ** 
Residual         17  0.54086 0.38232                  
Total            25  1.41469 1.00000                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Results

The change in microbiota composition measured by the Bray-Curtis Beta diversity is significantly different between Time, and Temperature and also between the levels of the interaction factor Time:Temperature at 0.05 significance level. The influence of the factors differs between ecosystems.

Show the code
par(mar = c(3, 0, 2, 0), cex = 0.6)
phyloseq.extended::plot_clust(physeq_rare, dist = "bray", method = "ward.D2", color = "Group",
           title = "Clustering of samples (Bray-Curtis + Ward)")

Results

Hierarchical clustering based on the Bray-Curtis dissimilarity and the Ward aggregation criterion shows that sample composition is not entirely structured by Time, Glucose and Temperature.

At Phylum taxonomic levels, we created heatmap plot using ordination methods to organize the rows.

Family

Show the code
#  method = NULL if not ordination
p <- phyloseq::plot_heatmap(tax_glom(physeq_rare, taxrank="Phylum"), 
                  method = "MDS",
                  distance = "bray",
                  sample.order= "SampleLabel", # not ordination
                 # taxa.order = "Family",
                  taxa.label = "Phylum") + 
  facet_grid(~ Time_Gluc, scales = "free_x", space = "free_x") + 
  scale_fill_gradient2(low = "#1a9850", mid = "#ffffbf", high = "#d73027",
                       na.value = "white", trans = scales::log_trans(4),
                       midpoint = log(100, base = 4))
p

Reproducibility token

Show the code
sessioninfo::session_info(pkgs = "attached")
─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.2 (2024-10-31)
 os       Ubuntu 24.04.1 LTS
 system   x86_64, linux-gnu
 ui       X11
 language (EN)
 collate  fr_FR.UTF-8
 ctype    fr_FR.UTF-8
 tz       Europe/Paris
 date     2025-02-18
 pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
 package                  * version date (UTC) lib source
 Biobase                  * 2.64.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 BiocGenerics             * 0.50.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 Biostrings               * 2.72.1  2024-06-02 [1] Bioconductor 3.19 (R 4.4.0)
 ComplexHeatmap           * 2.20.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 data.table               * 1.16.4  2024-12-06 [1] CRAN (R 4.4.2)
 dplyr                    * 1.1.4   2023-11-17 [1] CRAN (R 4.4.0)
 DT                       * 0.33    2024-04-04 [1] CRAN (R 4.4.0)
 emmeans                  * 1.10.7  2025-01-31 [1] CRAN (R 4.4.2)
 forcats                  * 1.0.0   2023-01-29 [1] CRAN (R 4.4.0)
 GenomeInfoDb             * 1.40.1  2024-05-24 [1] Bioconductor 3.19 (R 4.4.0)
 GenomicRanges            * 1.56.2  2024-10-09 [1] Bioconductor 3.19 (R 4.4.1)
 ggplot2                  * 3.5.1   2024-04-23 [1] CRAN (R 4.4.0)
 ggtree                   * 3.12.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 ggtreeExtra              * 1.14.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 InraeThemes              * 3.0.0   2024-05-21 [1] Github (davidcarayon/InraeThemes@9ee5259)
 IRanges                  * 2.38.1  2024-07-03 [1] Bioconductor 3.19 (R 4.4.1)
 kableExtra               * 1.4.0   2024-01-24 [1] CRAN (R 4.4.0)
 lubridate                * 1.9.4   2024-12-08 [1] CRAN (R 4.4.2)
 MatrixGenerics           * 1.16.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 matrixStats              * 1.5.0   2025-01-07 [1] CRAN (R 4.4.2)
 metacoder                * 0.3.7   2024-10-03 [1] Github (grunwaldlab/metacoder@54ef8ea)
 mia                      * 1.15.6  2025-01-07 [1] Github (microbiome/mia@2dc3430)
 MicrobiotaProcess        * 1.16.1  2024-07-28 [1] Bioconductor 3.19 (R 4.4.1)
 MultiAssayExperiment     * 1.30.3  2024-07-10 [1] Bioconductor 3.19 (R 4.4.2)
 openxlsx                 * 4.2.8   2025-01-25 [1] CRAN (R 4.4.2)
 phyloseq                 * 1.48.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)
 phyloseq.extended        * 0.1.4.1 2024-05-21 [1] Github (mahendra-mariadassou/phyloseq-extended@150a871)
 purrr                    * 1.0.2   2023-08-10 [1] CRAN (R 4.4.0)
 readr                    * 2.1.5   2024-01-10 [1] CRAN (R 4.4.0)
 S4Vectors                * 0.42.1  2024-07-03 [1] Bioconductor 3.19 (R 4.4.1)
 scales                   * 1.3.0   2023-11-28 [1] CRAN (R 4.4.0)
 SingleCellExperiment     * 1.26.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.2)
 stringr                  * 1.5.1   2023-11-14 [1] CRAN (R 4.4.0)
 SummarizedExperiment     * 1.34.0  2024-05-01 [1] Bioconductor 3.19 (R 4.4.0)
 tibble                   * 3.2.1   2023-03-20 [1] CRAN (R 4.4.0)
 tidyr                    * 1.3.1   2024-01-24 [1] CRAN (R 4.4.0)
 tidyverse                * 2.0.0   2023-02-22 [1] CRAN (R 4.4.0)
 TreeSummarizedExperiment * 2.12.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.2)
 viridis                  * 0.6.5   2024-01-29 [1] CRAN (R 4.4.0)
 viridisLite              * 0.4.2   2023-05-02 [1] CRAN (R 4.4.0)
 XVector                  * 0.44.0  2024-04-30 [1] Bioconductor 3.19 (R 4.4.0)

 [1] /home/orue/R/x86_64-pc-linux-gnu-library/4.4
 [2] /usr/local/lib/R/site-library
 [3] /usr/lib/R/site-library
 [4] /usr/lib/R/library

──────────────────────────────────────────────────────────────────────────────

References

1. McMurdie PJ, Holmes S. Phyloseq: An r package for reproducible interactive analysis and graphics of microbiome census data. PloS one. 2013;8:e61217.
2. Mariadassou M. Phyloseq-extended: Various customs functions written to enhance the base functions of phyloseq. 2018. https://github.com/mahendra-mariadassou/phyloseq-extended.

Reuse

This document will not be accessible without prior agreement of the partners

A work by Migale Bioinformatics Facility
Université Paris-Saclay, INRAE, MaIAGE, 78350, Jouy-en-Josas, France
Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, 78350, Jouy-en-Josas, France