Name         Version   Resource
SRAtoolkit   2.3       Tusker
SRAtoolkit   2.8       Tusker
SRAtoolkit   2.8       Crane


SRA (Sequence Read Archive) (http://www.ncbi.nlm.nih.gov/sra) is an NCBI-defined format for NGS data. All NGS data submitted to NCBI needs to be in SRA format. The SRA Toolkit provides tools for converting data from other formats into SRA format and, conversely, for extracting SRA data into other formats.

The SRA Toolkit can convert data from the SRA format to the following formats: ABI SOLiD native, fasta, fastq, sff, sam, and Illumina native. It can also convert data from the fasta, fastq, AB SOLiD-SRF, AB SOLiD-native, Illumina SRF, Illumina native, sff, and bam formats into the SRA format.

The SRA Toolkit contains multiple "format"-dump commands, where "format" is the file format the SRA data is converted to: abi-dump, fastq-dump, illumina-dump, sam-dump, sff-dump, and vdb-dump.

One of the most commonly used commands is fastq-dump:

General Fastq-Dump Usage
fastq-dump [options] input_reads.sra 
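
As an illustration of the [options] placeholder, the sketch below assembles a fastq-dump command line with three commonly used options: --split-files (write each read of a pair to its own file), --gzip (compress the output), and --outdir (directory for the output files). The helper name fastq_dump_cmd and the directory name fastq_out are hypothetical; the helper only prints the command so it can be inspected before being placed in a job script.

```shell
# fastq_dump_cmd INPUT [OUTDIR]
# Hypothetical helper: prints (does not run) a fastq-dump command line.
# OUTDIR defaults to the current directory.
fastq_dump_cmd() {
    input=$1
    outdir=${2:-.}
    printf 'fastq-dump --split-files --gzip --outdir %s %s\n' "$outdir" "$input"
}
```

For example, fastq_dump_cmd input_reads.sra fastq_out prints the command that would write compressed, split FASTQ files into fastq_out.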

An example of running fastq-dump on Tusker to convert an SRA file containing paired-end reads is:

sratoolkit.submit

#!/bin/sh
#SBATCH --job-name=SRAtoolkit
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=168:00:00
#SBATCH --mem=50gb
#SBATCH --output=SRAtoolkit.%J.out
#SBATCH --error=SRAtoolkit.%J.err

 

module load SRAtoolkit/2.8

fastq-dump --split-files input_reads.sra

This script outputs two fastq files, one for each read of the pair: input_reads_1.fastq and input_reads_2.fastq.
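
A quick sanity check after the conversion is to confirm that both files hold the same number of reads. The following is a minimal sketch, assuming the usual four-lines-per-record fastq layout; the helpers count_reads and check_pairs are hypothetical and not part of the SRA Toolkit.

```shell
# count_reads FILE
# Print the number of reads in FILE, assuming four lines per fastq record.
count_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# check_pairs PREFIX
# Compare the read counts of PREFIX_1.fastq and PREFIX_2.fastq.
# Returns non-zero (and reports on stderr) if the counts differ.
check_pairs() {
    n1=$(count_reads "${1}_1.fastq")
    n2=$(count_reads "${1}_2.fastq")
    if [ "$n1" -eq "$n2" ]; then
        echo "OK: $n1 read pairs"
    else
        echo "MISMATCH: $n1 vs $n2 reads" >&2
        return 1
    fi
}
```

Running check_pairs input_reads after the job finishes compares input_reads_1.fastq with input_reads_2.fastq.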

All SRAtoolkit commands are single-threaded, and therefore both #SBATCH --nodes and #SBATCH --ntasks-per-node in the SLURM script are set to 1.
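
Because each command uses a single core, several SRA files are more naturally converted as independent jobs than inside one larger job. One possible approach, sketched here under assumptions, is a SLURM job array in which each task converts one file. The list file sra_list.txt (one .sra path per line) and the helpers pick_input and run_task are hypothetical; SLURM_ARRAY_TASK_ID is the variable SLURM sets for each array task.

```shell
# pick_input LIST N
# Print line N of LIST, a text file with one .sra path per line.
pick_input() {
    sed -n "${2}p" "$1"
}

# run_task LIST
# Convert the input that belongs to the current array task.
# Falls back to task 1 when run outside of a job array.
run_task() {
    input=$(pick_input "$1" "${SLURM_ARRAY_TASK_ID:-1}")
    if [ -z "$input" ]; then
        echo "no input listed for this task" >&2
        return 1
    fi
    fastq-dump --split-files "$input"
}
```

In the submit script this would be used by adding a line such as #SBATCH --array=1-3 (for a list of three files) and replacing the fastq-dump call with run_task sra_list.txt.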

The SRA Toolkit contains multiple "format"-load commands, where "format" is the file format of the data that is uploaded to NCBI: srf-load, sff-load, refseq-load, pacbio-load, illumina-load, helicos-load, fastq-load, cg-load, bam-load, and abi-load. An example of converting the bam file input_alignments.bam into SRA format for upload to NCBI is shown below:

General Bam-Load Usage
bam-load -o input_reads.sra input_alignments.bam

 

Other frequently used SRAtoolkit tools are:
