INTRODUCTION
The knowledge-based transcript reconstruction algorithm was developed to aid the design and analysis of next-generation transcriptome arrays, using RNA-Seq data in NCBI SRA (1) and annotation databases of RNA transcripts, such as RefSeq, Ensembl and UCSC known genes. The algorithm combines information in the RNA-Seq data and in the reference annotations to define a set of candidate exons, junctions and transcripts for array design and analysis.
USAGE
SpliceMap: please refer to the SpliceMap publication (2) for details.
ExonMap: Rscript ExonMap.R NewJuc.bed KnownJuc.bed KnownExon.bed [MaxExonLen]
where NewJuc.bed is the output BED file of SpliceMap. KnownJuc.bed and KnownExon.bed are the reference junctions and exons derived from known transcript annotations. A simple way to obtain them is to use the UCSC Table Browser. ExonMap requires Rscript, which is installed as part of R. The most recent version of R can be downloaded from the R project website.
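As an illustration of how reference exons and junctions can be derived from a known transcript annotation, the following Python sketch splits BED12 transcript records (the 12-column BED layout that the UCSC Table Browser can export) into unique exon and junction intervals. The function names and the 6-column output layout are assumptions made for this example. The exact columns that ExonMap.R expects are not stated here, so the output should be checked against the program before use.

```python
from typing import Iterable, List, Set, Tuple

# (chrom, start, end, strand) in 0-based, half-open BED coordinates
Interval = Tuple[str, int, int, str]


def bed12_to_exons_and_junctions(
    lines: Iterable[str],
) -> Tuple[List[Interval], List[Interval]]:
    """Split BED12 transcript records into unique exons and junctions.

    A junction is reported as the intron interval between two
    consecutive exons of the same transcript.
    """
    exons: Set[Interval] = set()
    junctions: Set[Interval] = set()
    for line in lines:
        # Skip blank lines and UCSC header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            raise ValueError("expected a 12-column BED record: " + line.strip())
        chrom, tx_start, strand = fields[0], int(fields[1]), fields[5]
        block_count = int(fields[9])
        # blockSizes and blockStarts are comma-separated, often with a trailing comma.
        sizes = [int(x) for x in fields[10].rstrip(",").split(",")][:block_count]
        starts = [int(x) for x in fields[11].rstrip(",").split(",")][:block_count]
        # Block starts are relative to the transcript start.
        blocks = [(tx_start + s, tx_start + s + n) for s, n in zip(starts, sizes)]
        for start, end in blocks:
            exons.add((chrom, start, end, strand))
        for (_, left_end), (right_start, _) in zip(blocks, blocks[1:]):
            junctions.add((chrom, left_end, right_start, strand))
    return sorted(exons), sorted(junctions)


def to_bed6(interval: Interval) -> str:
    """Format an interval as a BED6 line (assumed layout, placeholder name and score)."""
    chrom, start, end, strand = interval
    return f"{chrom}\t{start}\t{end}\t.\t0\t{strand}"
```

Exons and junctions shared by several transcripts are written only once, because both collections are sets. Writing the two lists with to_bed6 gives candidate KnownExon.bed and KnownJuc.bed files under the assumptions above.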
JunctionWalk: discover_txs [nreads(20) max_exon_length(1000)] new.junction.bed gene.annot.txt > new.annot.txt
where new.junction.bed is the output of SpliceMap and gene.annot.txt is the reference transcript annotation in BED format.
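The two usage lines above can be assembled programmatically, which helps when the same SpliceMap output is passed to both ExonMap and JunctionWalk. The Python sketch below only builds the argument lists from the usage strings given here. It assumes that the bracketed arguments are optional and positional, and that nreads and max_exon_length are supplied together in the order shown. The helper names are hypothetical and are not part of either program.

```python
import shlex
import subprocess
from typing import List, Optional, Tuple


def exonmap_command(
    new_juc: str,
    known_juc: str,
    known_exon: str,
    max_exon_len: Optional[int] = None,
) -> List[str]:
    """Build: Rscript ExonMap.R NewJuc.bed KnownJuc.bed KnownExon.bed [MaxExonLen]"""
    cmd = ["Rscript", "ExonMap.R", new_juc, known_juc, known_exon]
    if max_exon_len is not None:
        cmd.append(str(max_exon_len))
    return cmd


def junctionwalk_command(
    new_junction_bed: str,
    gene_annot: str,
    options: Optional[Tuple[int, int]] = None,
) -> List[str]:
    """Build: discover_txs [nreads max_exon_length] new.junction.bed gene.annot.txt

    Assumption: the two optional values are positional and given together.
    """
    cmd = ["discover_txs"]
    if options is not None:
        nreads, max_exon_length = options
        cmd += [str(nreads), str(max_exon_length)]
    cmd += [new_junction_bed, gene_annot]
    return cmd


def run_junctionwalk(cmd: List[str], output_path: str) -> None:
    """Run discover_txs and write its standard output to output_path."""
    with open(output_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)


if __name__ == "__main__":
    # Print the commands without running them.
    print(shlex.join(exonmap_command("NewJuc.bed", "KnownJuc.bed", "KnownExon.bed")))
    print(shlex.join(junctionwalk_command("new.junction.bed", "gene.annot.txt", (20, 1000))))
```

Calling run_junctionwalk(cmd, "new.annot.txt") corresponds to the "> new.annot.txt" redirection in the usage line, provided discover_txs is on the PATH.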
Download
Data Availability
The RNA-Seq data were previously published in (3) and deposited in NCBI SRA under accession number GSE26109. For more information, please contact wxiao1@parters.org.
References
1. Leinonen R, Sugawara H, Shumway M. The sequence read archive. Nucleic Acids Res. 2011 Jan;39(Database issue):D19-21.
2. Au KF, Jiang H, Lin L, Xing Y, Wong WH. Detection of splice junctions from paired-end RNA-seq data by SpliceMap. Nucleic Acids Res. 2010 Apr 5.
3. Xu W, Seok J, Mindrinos MN, Schweitzer AC, Jiang H, Wilhelmy J, et al. Human transcriptome array for high-throughput clinical studies. Proc Natl Acad Sci U S A. 2011 Mar 1;108(9):3707-12.