Name       Version  Resource
cufflinks  2.0.2    Tusker
cufflinks  2.1.1    Tusker
cufflinks  2.2      Tusker
cufflinks  2.1      Crane
cufflinks  2.2      Crane

 

Cufflinks (http://cufflinks.cbcb.umd.edu/) is a transcript assembly program that includes a number of tools for analyzing RNA-Seq data. These tools assemble aligned RNA-Seq reads into transcripts, estimate their abundances, test for differential expression and regulation transcriptome-wide, and provide transcript quantification. Some of the tools in the Cufflinks package can be run individually, while others are part of a larger workflow.

The basic usage of Cufflinks is: 

General Cufflinks Usage
cufflinks [options] input_alignments.[sam|bam]

where input_alignments.[sam|bam] is a sorted input file of RNA-Seq read alignments in SAM/BAM format. The RNA-Seq read mappers TopHat and TopHat2 produce output in this format and are recommended for use with Cufflinks, although SAM/BAM alignments produced by any aligner are accepted. More advanced Cufflinks options can be found in the manual, http://cufflinks.cbcb.umd.edu/manual.html, or by typing:

Additional Cufflinks Options
[<username>@login.tusker ~]$ cufflinks -h
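Because Cufflinks expects sorted alignments, a SAM file from an aligner that does not sort its output has to be sorted first. The following is a minimal sketch, assuming a tab-separated SAM file whose header lines start with "@"; the function name sort_sam and the file names are hypothetical. It orders alignments by reference name (field 3) and then by numeric start position (field 4), and leaves the header lines untouched at the top.

```shell
# sort_sam IN OUT: write a coordinate-sorted copy of a SAM file.
# Header lines (starting with "@") are copied first, unchanged;
# alignment lines are sorted by reference name (field 3), then by
# numeric start position (field 4).
sort_sam() {
    in_sam=$1
    out_sam=$2
    tab=$(printf '\t')
    {
        # grep exits non-zero when there is no header; that is not an error here.
        grep '^@' "$in_sam" || true
        grep -v '^@' "$in_sam" | LC_ALL=C sort -t "$tab" -k 3,3 -k 4,4n
    } > "$out_sam"
}
```

For example, sort_sam hits.sam hits.sorted.sam would write the sorted copy to hits.sorted.sam. For BAM input, a dedicated tool such as samtools sort serves the same purpose.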

An example of how to run Cufflinks on Tusker with an alignment file in SAM format, the output directory cufflinks_output/, and 8 CPUs is shown below:

cufflinks.submit

#!/bin/sh
#SBATCH --job-name=Cufflinks
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=168:00:00
#SBATCH --mem=50gb
#SBATCH --output=Cufflinks.%J.out
#SBATCH --error=Cufflinks.%J.err

 

module load cufflinks/2.2

cufflinks -o cufflinks_output/ -p $SLURM_NTASKS_PER_NODE input_alignments.sam

 

The cufflinks program writes a number of files to the specified output directory cufflinks_output/. Some of the generated files are:

  • transcripts.gtf: The GTF file contains Cufflinks' assembled isoforms where there is one GTF record per row, and each record represents either a transcript or an exon within a transcript
  • isoforms.fpkm_tracking: This file contains the estimated isoform-level expression values in the generic FPKM Tracking Format
  • genes.fpkm_tracking: This file contains the estimated gene-level expression values in the generic FPKM Tracking Format
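The output files listed above are tab-separated text, so they can be inspected with standard shell tools. The sketch below makes two assumptions: the FPKM tracking files start with a header row that contains a column named FPKM, with the tracking identifier in the first column, and the sort command supports the -g (general numeric) key option, as GNU and BSD sort do. The function names top_fpkm and count_gtf_features are hypothetical helpers, not part of Cufflinks.

```shell
# top_fpkm FILE [N]: print the N rows (default 10) with the highest FPKM
# from an FPKM tracking file, as "tracking_id<TAB>FPKM".
# The FPKM column is located by its header name, not by position.
top_fpkm() {
    tracking_file=$1
    n=${2:-10}
    tab=$(printf '\t')
    awk -F '\t' '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "FPKM") col = i; next }
        col     { print $1 "\t" $col }
    ' "$tracking_file" | sort -t "$tab" -k 2,2gr | head -n "$n"
}

# count_gtf_features FILE: count GTF records per feature type (column 3),
# printed as "feature<TAB>count". Comment lines starting with "#" are skipped.
count_gtf_features() {
    grep -v '^#' "$1" | cut -f 3 | sort | uniq -c | awk '{ print $2 "\t" $1 }'
}
```

For example, top_fpkm cufflinks_output/genes.fpkm_tracking 20 lists the 20 most highly expressed genes, and count_gtf_features cufflinks_output/transcripts.gtf reports how many transcript and exon records were assembled.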

 

Besides cufflinks, the Cufflinks package includes the following programs:

  • Cuffcompare

Cuffcompare takes Cufflinks' GTF output as input and compares the assembled transcripts to a reference annotation. An example of comparing the existing annotation known_annotation.gtf with the new annotation new_annotation.gtf follows:

General Cuffcompare Usage
cuffcompare -r known_annotation.gtf new_annotation.gtf

This tool reports various statistics about the transcripts, as well as a GTF file containing all transfrags in each sample.

  • Cuffmerge

This program allows merging of multiple Cufflinks GTF files. An example of merging multiple GTF files with full paths defined in the file list_GTF.txt and 8 CPUs is shown below:

General Cuffmerge Usage
cuffmerge -p 8 list_GTF.txt

The cuffmerge output is a single unified transcript file.
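The list file given to cuffmerge contains one full GTF path per line. A minimal sketch of building it is shown here, assuming each sample was assembled into its own Cufflinks output directory containing transcripts.gtf; the function name make_gtf_list and the directory names in the usage note are hypothetical.

```shell
# make_gtf_list OUT DIR...: write the absolute path of DIR/transcripts.gtf,
# one per line, for every DIR that contains that file.
# Directories without a transcripts.gtf are reported on stderr and skipped.
make_gtf_list() {
    list_file=$1
    shift
    : > "$list_file"    # start with an empty list file
    for dir in "$@"; do
        if [ -f "$dir/transcripts.gtf" ]; then
            # Resolve the directory to an absolute path in a subshell.
            printf '%s/transcripts.gtf\n' "$(CDPATH= cd -- "$dir" && pwd)" >> "$list_file"
        else
            printf 'skipping %s: no transcripts.gtf\n' "$dir" >&2
        fi
    done
}
```

For example, make_gtf_list list_GTF.txt sample_1_output sample_2_output sample_3_output writes the three paths to list_GTF.txt, which can then be passed to cuffmerge.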

  • Cuffdiff

Cuffdiff is used to identify differentially expressed transcripts. An example of cuffdiff for the annotated transcripts for the new genome, new_annotations.gtf, with 3 SAM alignment files generated from TopHat and 8 CPUs follows:

General Cuffdiff Usage
cuffdiff -p 8 new_annotations.gtf sample_1.sam sample_2.sam sample_3.sam

Cuffdiff writes multiple output files, including FPKM tracking files, count tracking files, read group tracking files, differential expression tests, differential splicing tests, differential coding output, differential promoter use, read group info, and run info.
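The differential expression test files are tab-separated tables and can be filtered with standard tools. The sketch below assumes the gene-level test file is named gene_exp.diff and has a header column named significant holding yes or no, as in Cuffdiff 2.x output; the function name significant_genes is hypothetical.

```shell
# significant_genes FILE: print the header row plus every row whose
# "significant" column is "yes". The column is located by its header name.
significant_genes() {
    awk -F '\t' '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "significant") col = i; print; next }
        col && $col == "yes"
    ' "$1"
}
```

For example, significant_genes gene_exp.diff > significant_genes.tsv keeps only the genes that Cuffdiff marked as significantly differentially expressed.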
