cd /work_home/cmidoux/BAHIA
RABATGA
The aim of this project is to characterise a Moroccan bacterial strain
This document reports the analyses performed. It contains all the code used to analyze the data. Tool versions (given in the code chunks) and their references are indicated for reproducibility.
Aim of the project
The aim of this project is to characterise a Moroccan bacterial strain.
Partners
- Cédric Midoux - Migale bioinformatics facility - BioinfOmics - INRAE
- Valentin Loux - Migale bioinformatics facility - BioinfOmics - INRAE
- Bahia Rached - CNRST Rabat (Morocco)
- Christel Maillet - MICALIS - INRAE
Deliverables
Deliverables agreed at the preliminary meeting (Table 1).
No. | Definition |
---|---|
1 | HTML report |
2 | FASTA sequences |
Data management
All data are managed by the Migale facility for the duration of the project. Once the project is over, the Migale facility does not keep your data. We will provide you with the raw data and associated metadata, which are to be deposited in public repositories before the results are used. We can guide you through the submission process. We will then decide together which files to keep, knowing that this report will also be provided to you and that the analyses can be replayed if needed.
Raw data
Raw data (Illumina + IonTorrent) were generated by CNRST. Files were sent and deposited on the front server.
The IonTorrent file was incomplete (truncated), so we keep only the complete reads.
conda activate seqkit-2.0.0
seqkit seq -m 1 DATA/B599_iontorrent.fastq -o DATA/B599_iontorrent_complet.fastq
conda deactivate
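A quick way to see whether a FASTQ file was cut mid-record is to check that its line count is a multiple of four. The sketch below is not part of the original analysis: it assumes unwrapped four-line records, and the function name `fastq_completeness` and the toy reads are hypothetical.

```shell
set -euo pipefail

# Count complete 4-line FASTQ records and report leftover lines.
# Assumption: exactly four lines per record (no wrapped sequences).
fastq_completeness() {
    awk 'END { printf "%d complete records, %d trailing lines\n", int(NR / 4), NR % 4 }' "$1"
}

# Toy file (hypothetical reads): two complete records, then one cut after its sequence line.
demo=$(mktemp)
printf '@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nIIII\n@r3\nTTAA\n' > "$demo"

completeness=$(fastq_completeness "$demo")
echo "$completeness"
rm -f "$demo"
```

On the project data this could be called as `fastq_completeness DATA/B599_iontorrent.fastq`; a non-zero number of trailing lines would point to a truncated last record.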
Analyses
Quality control
mkdir 0_FASTP/
conda activate fastp-0.23.4
fastp --in1 DATA/B599_subsampled_R1.fastq --in2 DATA/B599_subsampled_R2.fastq --out1 0_FASTP/B599_subsampled_R1.fastq.gz --out2 0_FASTP/B599_subsampled_R2.fastq.gz --length_required 50 --html 0_FASTP/B599_subsampled_fastp.html --json 0_FASTP/B599_subsampled_fastp.json --thread 4
fastp --in1 DATA/B599_iontorrent_complet.fastq --out1 0_FASTP/B599_iontorrent.fastq.gz --length_required 50 --html 0_FASTP/B599_iontorrent.fastp.html --json 0_FASTP/B599_iontorrent.json --thread 4
conda deactivate
conda activate multiqc-1.21
multiqc --outdir 00_MULTIQC 0_FASTP
conda deactivate
The MultiQC report shows metrics before and after filtering.
Taxonomic affiliation
mkdir 1_KAIJU
qsub -cwd -V -N kaiju_illumina -pe thread 16 -e LOGS -o LOGS -b y "conda activate kaiju-1.9.2 && kaiju -t /db/outils/kaiju-2023-05/refseq/nodes.dmp -f /db/outils/kaiju-2023-05/refseq/kaiju_db_refseq.fmi -i 0_FASTP/B599_subsampled_R1.fastq.gz -j 0_FASTP/B599_subsampled_R2.fastq.gz -o 1_KAIJU/B599_illumina.kaiju -z 16 && kaiju2krona -t /db/outils/kaiju-2023-05/refseq/nodes.dmp -n /db/outils/kaiju-2023-05/refseq/names.dmp -i 1_KAIJU/B599_illumina.kaiju -o 1_KAIJU/B599_illumina.krona -u && conda deactivate"
qsub -cwd -V -N kaiju_iontorrent -pe thread 16 -e LOGS -o LOGS -b y "conda activate kaiju-1.9.2 && kaiju -t /db/outils/kaiju-2023-05/refseq/nodes.dmp -f /db/outils/kaiju-2023-05/refseq/kaiju_db_refseq.fmi -i 0_FASTP/B599_iontorrent.fastq.gz -o 1_KAIJU/B599_iontorrent.kaiju -z 16 && kaiju2krona -t /db/outils/kaiju-2023-05/refseq/nodes.dmp -n /db/outils/kaiju-2023-05/refseq/names.dmp -i 1_KAIJU/B599_iontorrent.kaiju -o 1_KAIJU/B599_iontorrent.krona -u && conda deactivate"
qsub -cwd -V -N kaiju_iontorrent_nreuk -pe thread 16 -e LOGS -o LOGS -b y "conda activate kaiju-1.9.2 && kaiju -t /db/outils/kaiju-2023-05/nr_euk/nodes.dmp -f /db/outils/kaiju-2023-05/nr_euk/kaiju_db_nr_euk.fmi -i 0_FASTP/B599_iontorrent.fastq.gz -o 1_KAIJU/B599_iontorrent_nreuk.kaiju -z 16 && kaiju2krona -t /db/outils/kaiju-2023-05/nr_euk/nodes.dmp -n /db/outils/kaiju-2023-05/nr_euk/names.dmp -i 1_KAIJU/B599_iontorrent_nreuk.kaiju -o 1_KAIJU/B599_iontorrent_nreuk.krona -u && conda deactivate"
conda activate krona-2.8
ktImportText -o 1_KAIJU/B599-krona.html 1_KAIJU/B599_illumina.krona 1_KAIJU/B599_iontorrent.krona 1_KAIJU/B599_iontorrent_nreuk.krona
conda deactivate
The Kaiju report (Krona chart) shows the taxonomic distribution of the reads for each run.
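Besides the Krona chart, the share of classified reads can be read directly from a Kaiju output file, whose first column is `C` (classified) or `U` (unclassified). The sketch below is an illustration only: the function name `kaiju_classified_summary` and the toy records (read names and taxon IDs) are hypothetical.

```shell
set -euo pipefail

# Summarise the first column of a Kaiju output file (C = classified, U = unclassified).
kaiju_classified_summary() {
    awk -F '\t' '
        $1 == "C" { c++ }
        $1 == "U" { u++ }
        END {
            total = c + u
            if (total > 0)
                printf "classified=%d unclassified=%d (%.1f%% classified)\n", c, u, 100 * c / total
        }' "$1"
}

# Toy Kaiju output (hypothetical): status, read name, taxon ID.
demo=$(mktemp)
printf 'C\tread1\t1280\nC\tread2\t1280\nU\tread3\t0\nC\tread4\t562\n' > "$demo"

kaiju_summary=$(kaiju_classified_summary "$demo")
echo "$kaiju_summary"
rm -f "$demo"
```

For this project the function could be applied to `1_KAIJU/B599_illumina.kaiju`, `1_KAIJU/B599_iontorrent.kaiju` and `1_KAIJU/B599_iontorrent_nreuk.kaiju` to compare the two databases.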
Assembly
qsub -cwd -V -N spades -q maiage.q -pe thread 16 -e LOGS -o LOGS -b y "conda activate spades-3.15.3 && spades.py --isolate -t 16 -m 500 --tmp-dir /projet/tmp/ -1 0_FASTP/B599_subsampled_R1.fastq.gz -2 0_FASTP/B599_subsampled_R2.fastq.gz -s 0_FASTP/B599_iontorrent.fastq.gz -o 2_SPADES && conda deactivate"
We also ran Unicycler on three input combinations: Illumina + IonTorrent, Illumina only, and IonTorrent only.
mkdir 3_UNICYCLER
qsub -cwd -V -N unicycler -q maiage.q -pe thread 16 -e LOGS -o LOGS -b y "conda activate unicycler-0.5.0 && unicycler -1 0_FASTP/B599_subsampled_R1.fastq.gz -2 0_FASTP/B599_subsampled_R2.fastq.gz -s 0_FASTP/B599_iontorrent.fastq.gz -o 3_UNICYCLER -t 16 && conda deactivate"
qsub -cwd -V -N unicycler -q maiage.q -pe thread 16 -e LOGS -o LOGS -b y "conda activate unicycler-0.5.0 && unicycler -1 0_FASTP/B599_subsampled_R1.fastq.gz -2 0_FASTP/B599_subsampled_R2.fastq.gz -o 3_UNICYCLER_illumina -t 16 && conda deactivate"
qsub -cwd -V -N unicycler -q maiage.q -pe thread 16 -e LOGS -o LOGS -b y "conda activate unicycler-0.5.0 && unicycler -s 0_FASTP/B599_iontorrent.fastq.gz -o 3_UNICYCLER_iontorrent -t 16 && conda deactivate"
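The `qsub` commands in this report send their logs to `LOGS`, and several steps call `mkdir` without `-p`, which fails if the directory already exists when the analysis is replayed. A minimal sketch, assuming these directories are not created elsewhere; the function name `prepare_dirs` is hypothetical, and the demonstration runs in a scratch directory rather than in the project folder.

```shell
set -euo pipefail

# Create the log and output directories if they are missing.
# -p makes the call idempotent, so a replay does not fail on existing directories.
prepare_dirs() {
    mkdir -p LOGS 0_FASTP 1_KAIJU 3_UNICYCLER 4_QUAST
}

# Demonstration in a scratch directory so nothing is created in the project folder.
scratch=$(mktemp -d)
cd "$scratch"
prepare_dirs
prepare_dirs   # second call changes nothing and does not fail
created=$(ls -d */ | wc -l | tr -d ' ')
echo "$created directories ready"
cd "$OLDPWD"
rm -rf "$scratch"
```

Running `prepare_dirs` from the project directory before submitting jobs would make sure the scheduler has a place to write its log files.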
QUAST
mkdir 4_QUAST
conda activate quast-5.2.0
quast --gene-finding -o 4_QUAST -1 0_FASTP/B599_subsampled_R1.fastq.gz -2 0_FASTP/B599_subsampled_R2.fastq.gz -l "spades, unicycler, unicycler_illumina, unicycler_iontorrent" --threads 4 2_SPADES/contigs.fasta 3_UNICYCLER/assembly.fasta 3_UNICYCLER_illumina/assembly.fasta 3_UNICYCLER_iontorrent/assembly.fasta
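As a sanity check alongside the QUAST report, the contig count and total length of each assembly can be computed directly from the FASTA files. This sketch is not part of the original analysis; the function name `fasta_summary` and the toy contigs are hypothetical.

```shell
set -euo pipefail

# Print contig count and total length (bp) of a FASTA file; wrapped sequences are handled.
fasta_summary() {
    awk '
        /^>/ { n++; next }
        { total += length($0) }
        END { printf "%d contigs, %d bp\n", n, total }' "$1"
}

# Toy assembly (hypothetical): one contig wrapped over two lines, one on a single line.
demo=$(mktemp)
printf '>contig_1\nACGTACGT\nACGT\n>contig_2\nGGCC\n' > "$demo"

assembly_summary=$(fasta_summary "$demo")
echo "$assembly_summary"
rm -f "$demo"
```

The same function could be looped over `2_SPADES/contigs.fasta`, `3_UNICYCLER/assembly.fasta`, `3_UNICYCLER_illumina/assembly.fasta` and `3_UNICYCLER_iontorrent/assembly.fasta` to cross-check the values shown by QUAST.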