Knowledge-based Reconstruction of mRNA Transcripts with Short Sequencing Reads for Transcriptome Research

INTRODUCTION

The knowledge-based transcript reconstruction algorithm was developed to aid the design and analysis of next-generation transcriptome arrays, using RNA-Seq data in NCBI SRA (1) and annotation databases of RNA transcripts, such as RefSeq, Ensembl and UCSC known genes. The algorithm combines information in the RNA-Seq data and in the reference annotations to define a set of candidate exons, junctions and transcripts for array design and analysis.

USAGE

SpliceMap: please refer to the SpliceMap publication (2) for details.

ExonMap: Rscript ExonMap.R NewJuc.bed KnownJuc.bed KnownExon.bed [MaxExonLen]

where NewJuc.bed is the output BED file of SpliceMap, and KnownJuc.bed and KnownExon.bed are the reference junctions and exons derived from a known transcript annotation. A simple way to obtain them is to use the UCSC Table Browser. The bracketed MaxExonLen argument is optional. ExonMap requires Rscript, which is installed as part of R; the most recent release of R can be downloaded from the R project website.
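As an illustration of how reference exons and junctions can be derived from a known transcript annotation, the following minimal Python sketch splits one BED12 transcript record (the 12-column BED layout offered by the UCSC Table Browser) into exon and junction intervals. The function name and the five-field output tuples are hypothetical; the exact column layout that ExonMap expects in KnownJuc.bed and KnownExon.bed is not stated here, so treat this as a sketch rather than the authors' procedure.

```python
def bed12_to_exons_and_junctions(line):
    """Split one BED12 transcript record into exon and junction intervals.

    Returns two lists of (chrom, start, end, name, strand) tuples using
    0-based, half-open BED coordinates. The output layout is an assumption
    made for illustration, not a format documented for ExonMap.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, tx_start, name, strand = fields[0], int(fields[1]), fields[3], fields[5]
    # blockSizes and blockStarts are comma-separated and may end with a comma.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]

    # Each block is one exon; block starts are relative to chromStart.
    exons = [
        (chrom, tx_start + s, tx_start + s + n, name, strand)
        for s, n in zip(starts, sizes)
    ]
    # A junction spans the intron between two consecutive exons.
    junctions = [
        (chrom, left[2], right[1], name, strand)
        for left, right in zip(exons, exons[1:])
    ]
    return exons, junctions


# A made-up three-exon transcript, used only to exercise the function.
example = "\t".join([
    "chr1", "1000", "5000", "tx1", "0", "+",
    "1000", "5000", "0", "3", "100,200,300,", "0,1500,3700,",
])
exons, junctions = bed12_to_exons_and_junctions(example)
for record in exons + junctions:
    print("\t".join(str(value) for value in record))
```

Because several transcripts of a gene often share exons and junctions, the resulting intervals would normally be de-duplicated before being written to the reference files.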

JunctionWalk: discover_txs [nreads(20) max_exon_length(1000)] new.junction.bed gene.annot.txt > new.annot.txt

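new.junction.bed is the output of SpliceMap, and gene.annot.txt is the reference transcript annotation in BED format. The bracketed arguments nreads and max_exon_length are optional; the values in parentheses (20 and 1000) are their defaults. The reconstructed transcripts are written to standard output, which the command above redirects to new.annot.txt.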
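For scripted runs, the JunctionWalk command line above can be assembled programmatically. The sketch below is a minimal Python wrapper, assuming that the two bracketed values are optional positional arguments that are supplied together and that discover_txs is on the PATH; the helper names build_discover_txs_command and run_discover_txs are hypothetical and are not part of JunctionWalk.

```python
import subprocess


def build_discover_txs_command(junction_bed, annotation,
                               nreads=None, max_exon_length=None,
                               program="discover_txs"):
    """Return the argument list for one discover_txs run.

    Assumption: nreads and max_exon_length are given together or not at all,
    and they precede the two input files, as in the usage line.
    """
    if (nreads is None) != (max_exon_length is None):
        raise ValueError("give both nreads and max_exon_length, or neither")
    command = [program]
    if nreads is not None:
        command += [str(nreads), str(max_exon_length)]
    command += [junction_bed, annotation]
    return command


def run_discover_txs(command, output_path):
    """Run the command and write its standard output to output_path."""
    with open(output_path, "w") as out:
        subprocess.run(command, stdout=out, check=True)


# Building the command does not require JunctionWalk to be installed.
print(build_discover_txs_command("new.junction.bed", "gene.annot.txt", 20, 1000))
```

Calling run_discover_txs(command, "new.annot.txt") would then mirror the shell redirection shown in the usage line.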

Download

SpliceMap

ExonMap

JunctionWalk

Data Availability

The RNA-Seq data were previously published in (3) and deposited in NCBI SRA under accession number GSE26109. For more information, please contact wxiao1@parters.org.

Reference

1. Leinonen R, Sugawara H, Shumway M. The sequence read archive. Nucleic Acids Res. 2011 Jan;39(Database issue):D19-21.

2. Au KF, Jiang H, Lin L, Xing Y, Wong WH. Detection of splice junctions from paired-end RNA-seq data by SpliceMap. Nucleic Acids Res. 2010 Apr 5.

3. Xu W, Seok J, Mindrinos MN, Schweitzer AC, Jiang H, Wilhelmy J, et al. Human transcriptome array for high-throughput clinical studies. Proc Natl Acad Sci U S A. 2011 Mar 1;108(9):3707-12.

INNOVATION CENTER OF COMPUTATIONAL HEALTH