cd /home/orue/work/PROJECTS/RESTORBIOME2/
The aim of this project is to study the transcriptional response of bacteria to complex sugars added to the culture medium. The experiments are carried out on pure culture of microorganisms whose genomes are known.
This document reports the analyses performed and includes all the code used to analyze the data. For reproducibility, the versions of the tools (where available, in the code chunks) and their references are indicated.
Deliverables agreed at the preliminary meeting (Table 1).
No. | Definition |
---|---|
1 | HTML report |
2 | Archive containing data to be stored |
All data are managed by the Migale facility for the duration of the project. Once the project is over, the Migale facility does not keep your data. We will provide you with the raw data and associated metadata, which are to be deposited in public repositories before the results are used; we can guide you through the submission process. We will then decide together which files to keep, knowing that this report will also be provided to you and that the analyses can be rerun if needed.
Genomes and mapping data were sent by Paul and deposited on the front server. The genome of E. rectale and its annotation were downloaded from the NCBI website. A copy was sent to the abaca server.
cd /home/orue/work/PROJECTS/RESTORBIOME2/
sed "s/lcl_NC_008618_1_prot_//g" CAZYMES/BaATCC15703.txt > CAZYMES/BaATCC15703_refac.txt
sed -i "s/_1_/\.1_/g" CAZYMES/BaATCC15703_refac.txt
python add_cazymes_to_gff3.py --gff ref/NC_008618_bifido_adolescentis_spikes.gff3 --cazy CAZYMES/BaATCC15703_refac.txt --output NC_008618_bifido_adolescentis_spikes_cazymes.gff3
sed "s/lcl_NZ_AP012325_1_prot_//g" CAZYMES/BcDSM16992.txt > CAZYMES/BcDSM16992_refac.txt
sed -i "s/_1_/\.1_/g" CAZYMES/BcDSM16992_refac.txt
python add_cazymes_to_gff3.py --gff ref/NZ_AP012325_bifido_catenulatum_spikes.gff3 --cazy CAZYMES/BcDSM16992_refac.txt --output NZ_AP012325_bifido_catenulatum_spikes_cazymes.gff3
sed "s/lcl_CP102263_1_prot_//g" CAZYMES/BuATCC8492.txt > CAZYMES/BuATCC8492_refac.txt
sed -i "s/_1_/\.1_/g" CAZYMES/BuATCC8492_refac.txt
python add_cazymes_to_gff3.py --gff ref/CP102263_1_Bacteroides_uniformis_spikes.gff3 --cazy CAZYMES/BuATCC8492_refac_ok.txt --output CP102263_1_Bacteroides_uniformis_spikes_cazymes.gff3 --genome BU
sed "s/gene_//g" CAZYMES/ErATCC33656.txt > CAZYMES/ErATCC33656_refac.txt
#sed -i "s/_1_/\.1_/g" CAZYMES/ErATCC33656_refac.txt
python add_cazymes_to_gff3.py --gff ref/NC_012781_eubacterium_rectale_spikes.gff3 --cazy CAZYMES/ErATCC33656_refac.txt --output NC_012781_eubacterium_rectale_spikes_cazymes.gff3 --genome ER
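The two `sed` substitutions applied to each CAZyme table above can be read as an identifier normalisation step. The following is a minimal Python sketch of that logic, not the code actually used in the project; the function name `normalize_protein_id` and the example identifier are hypothetical and serve only as illustration.

```python
def normalize_protein_id(raw_id: str, prefix: str) -> str:
    """Mirror the two sed substitutions applied to the CAZyme tables."""
    # Step 1: drop the genome-specific prefix (first sed command).
    stripped = raw_id.replace(prefix, "")
    # Step 2: turn "_1_" back into ".1_" (second sed command).
    return stripped.replace("_1_", ".1_")


# Hypothetical identifier, used only for illustration.
example = "lcl_NC_008618_1_prot_WP_011742560_1_1"
normalized = normalize_protein_id(example, "lcl_NC_008618_1_prot_")
print(normalized)
```

Like the `sed` commands, the second replacement is global, so any other `_1_` occurrence in an identifier would also be rewritten.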
conda activate run_dbcan-3.0.2
run_dbcan ~/work/PROJECTS/RESTORBIOME2/ref/NC_008618_bifido_adolescentis_spikes.fasta prok --out_dir DBCAN_BA --db_dir /db/outils/dbcan/ --tools all --hmm_cpu 16 --eCAMI_jobs 16 --cluster NC_008618_bifido_adolescentis_spikes_cazymes.gff3
python ~/save/SCRIPTS/add_dbcan_to_gff3.py --gff NC_008618_bifido_adolescentis_spikes_cazymes.gff3 --overview DBCAN_BA/overview.txt --gff-prodigal DBCAN_BA/prodigal.gff --output NC_008618_bifido_adolescentis_spikes_cazymes_dbcan.gff3 --cgc-file DBCAN_BA/cgc_standard.out > BA_dbcan.log
run_dbcan ~/work/PROJECTS/RESTORBIOME2/ref/CP102263_1_Bacteroides_uniformis_spikes.fasta prok --out_dir DBCAN_BU --db_dir /db/outils/dbcan/ --tools all --hmm_cpu 16 --eCAMI_jobs 16 --cluster CP102263_1_Bacteroides_uniformis_spikes_cazymes.gff3
python ~/save/SCRIPTS/add_dbcan_to_gff3.py --gff CP102263_1_Bacteroides_uniformis_spikes_cazymes.gff3 --overview DBCAN_BU/overview.txt --gff-prodigal DBCAN_BU/prodigal.gff --output CP102263_1_Bacteroides_uniformis_spikes_cazymes_dbcan.gff3 --cgc-file DBCAN_BU/cgc_standard.out > BU_dbcan.log
run_dbcan ~/work/PROJECTS/RESTORBIOME2/ref/NZ_AP012325_bifido_catenulatum_spikes.fasta prok --out_dir DBCAN_BC --db_dir /db/outils/dbcan/ --tools all --hmm_cpu 16 --eCAMI_jobs 16 --cluster NZ_AP012325_bifido_catenulatum_spikes_cazymes.gff3
python ~/save/SCRIPTS/add_dbcan_to_gff3.py --gff NZ_AP012325_bifido_catenulatum_spikes_cazymes.gff3 --overview DBCAN_BC/overview.txt --gff-prodigal DBCAN_BC/prodigal.gff --output NZ_AP012325_bifido_catenulatum_spikes_cazymes_dbcan.gff3 --cgc-file DBCAN_BC/cgc_standard.out > BC_dbcan.log
run_dbcan ~/work/PROJECTS/RESTORBIOME2/ref/NC_012781_eubacterium_rectale_spikes.fasta prok --out_dir DBCAN_ER --db_dir /db/outils/dbcan/ --tools all --hmm_cpu 16 --eCAMI_jobs 16 --cluster NC_012781_eubacterium_rectale_spikes_cazymes.gff3
python ~/save/SCRIPTS/add_dbcan_to_gff3.py --gff NC_012781_eubacterium_rectale_spikes_cazymes.gff3 --overview DBCAN_ER/overview.txt --gff-prodigal DBCAN_ER/prodigal.gff --output NC_012781_eubacterium_rectale_spikes_cazymes_dbcan.gff3 --cgc-file DBCAN_ER/cgc_standard.out > ER_dbcan.log
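The four `run_dbcan` calls above differ only in the genome prefix and the output directory. As a sketch, the commands could be generated from a single mapping, which makes copy-paste errors less likely. The helper `dbcan_command` and the `GENOMES` dictionary are hypothetical names; the options are copied from the commands above, and the code only prints the command lines without running anything.

```python
from pathlib import PurePosixPath

REF_DIR = PurePosixPath("~/work/PROJECTS/RESTORBIOME2/ref")

# Genome short code -> reference file prefix, copied from the commands above.
GENOMES = {
    "BA": "NC_008618_bifido_adolescentis_spikes",
    "BU": "CP102263_1_Bacteroides_uniformis_spikes",
    "BC": "NZ_AP012325_bifido_catenulatum_spikes",
    "ER": "NC_012781_eubacterium_rectale_spikes",
}


def dbcan_command(code: str, prefix: str, threads: int = 16) -> list[str]:
    """Build (but do not run) the run_dbcan command line for one genome."""
    return [
        "run_dbcan", str(REF_DIR / f"{prefix}.fasta"), "prok",
        "--out_dir", f"DBCAN_{code}",
        "--db_dir", "/db/outils/dbcan/",
        "--tools", "all",
        "--hmm_cpu", str(threads),
        "--eCAMI_jobs", str(threads),
        "--cluster", f"{prefix}_cazymes.gff3",
    ]


for code, prefix in GENOMES.items():
    print(" ".join(dbcan_command(code, prefix)))
```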
cd /home/orue/work/PROJECTS/RESTORBIOME2
pip install biopython
pip install gff3
# Add pulPred tag to GFF3 for BU only
python add_pul_to_gff3.py --gff CP102263_1_Bacteroides_uniformis_spikes_cazymes_dbcan.gff3 --pul CAZYMES/BuATCC8492_PULs.txt --output CP102263_1_Bacteroides_uniformis_spikes_cazymes_dbcan_pulpred.gff3 --tag pulPred
# Add CC tag to GFF3 for BU only
python add_pul_to_gff3.py --gff CP102263_1_Bacteroides_uniformis_spikes_cazymes_dbcan_pulpred.gff3 --pul CAZYMES/BuATCC8492_CC.txt --output CP102263_1_Bacteroides_uniformis_spikes_cazymes_dbcan_pulpred_cc.gff3 --tag CC
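The content of `add_pul_to_gff3.py` is not shown in this report. To illustrate what adding a tag such as `pulPred` or `CC` to a GFF3 file involves, here is a minimal sketch that appends a `tag=value` pair to the attributes column (column 9) of features whose `ID` appears in a lookup table. It assumes the PUL file can be reduced to a gene-to-label mapping; the function `add_tag`, the mapping, and the example line are all hypothetical.

```python
def add_tag(gff_line: str, tag_by_id: dict[str, str], tag: str) -> str:
    """Append `tag=value` to column 9 when the feature ID is in `tag_by_id`."""
    line = gff_line.rstrip("\n")
    # Comment, directive and blank lines pass through unchanged.
    if not line or line.startswith("#"):
        return line
    columns = line.split("\t")
    if len(columns) != 9:  # not a regular GFF3 feature line
        return line
    # Column 9 holds semicolon-separated key=value attributes.
    attributes = dict(
        field.split("=", 1) for field in columns[8].split(";") if "=" in field
    )
    value = tag_by_id.get(attributes.get("ID", ""))
    if value is not None:
        columns[8] = columns[8].rstrip(";") + f";{tag}={value}"
    return "\t".join(columns)


# Hypothetical mapping and feature line, for illustration only.
pul_by_gene = {"gene_0001": "PUL1"}
line = "CP102263.1\tGenbank\tgene\t1\t900\t.\t+\t.\tID=gene_0001;Name=susC"
tagged = add_tag(line, pul_by_gene, "pulPred")
print(tagged)
```

Features whose `ID` is absent from the mapping are returned unchanged, so the same function could be applied a second time with another mapping and `tag="CC"`.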
We first set up our environment by loading a few packages.
library(tidyverse)
library(kableExtra) ## table visualisation
#library(rtracklayer) ## for annotation
library(EnhancedVolcano) ## for volcano plot
We performed differential analyses for each experiment on bacteria:
The comparison of interest was the raffinose (R) treatment versus the glucose (G) control treatment. The lists of differentially expressed genes were produced using edgeR.
Reports on normalisation step for each experiment on bacteria
Reports on differential analysis step for each experiment on bacteria
Excel files with 3 sheets (complete, up, down) for each experiment on bacteria
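The three sheets (complete, up, down) correspond to a split of the differential analysis table by direction of change. The sketch below shows one way such a split can be made from `logFC` and `FDR` values, which are standard columns of edgeR result tables. The thresholds (`fdr_max=0.05`, `lfc_min=0.0`), the gene names, and the values are assumptions for illustration and are not taken from this report. It is also assumed that a positive `logFC` means higher expression with raffinose than with glucose.

```python
def classify(log_fc: float, fdr: float,
             fdr_max: float = 0.05, lfc_min: float = 0.0) -> str:
    """Label a gene 'up', 'down' or 'ns' (not significant)."""
    if fdr > fdr_max:
        return "ns"
    if log_fc > lfc_min:
        return "up"
    if log_fc < -lfc_min:
        return "down"
    return "ns"


# Hypothetical rows, for illustration only.
rows = [
    {"gene": "gene_A", "logFC": 2.3, "FDR": 0.001},
    {"gene": "gene_B", "logFC": -1.8, "FDR": 0.010},
    {"gene": "gene_C", "logFC": 0.4, "FDR": 0.600},
]

sheets = {
    "complete": rows,
    "up": [r for r in rows if classify(r["logFC"], r["FDR"]) == "up"],
    "down": [r for r in rows if classify(r["logFC"], r["FDR"]) == "down"],
}

for name, content in sheets.items():
    print(name, [r["gene"] for r in content])
```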
Volcano plots were generated with EnhancedVolcano.
Reports on Volcano visualisation for each experiment on bacteria
Excel files with 3 sheets (complete, pul, cazy) for each experiment on bacteria
A work by Migale Bioinformatics Facility
Université Paris-Saclay, INRAE, MaIAGE, 78350, Jouy-en-Josas, France
Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, 78350, Jouy-en-Josas, France