Sigurgeirsson B, Emanuelsson O, Lundeberg J
BMC Genomics 15 (-) 631 [2014-07-28; online 2014-07-28]
Strand specific RNA sequencing is rapidly replacing conventional cDNA sequencing as an approach for assessing information about the transcriptome. Alongside improved laboratory protocols the development of bioinformatical tools is steadily progressing. In the current procedure the Illumina TruSeq library preparation kit is used, along with additional reagents, to make stranded libraries in an automated fashion which are then sequenced on Illumina HiSeq 2000. By the use of freely available bioinformatical tools we show, through quality metrics, that the protocol is robust and reproducible. We further highlight the practicality of strand specific libraries by comparing expression of strand specific libraries to non-stranded libraries, by looking at known antisense transcription of pseudogenes and by identifying novel transcription. Furthermore, two ribosomal depletion kits, RiboMinus and RiboZero, are compared and two sequence aligners, Tophat2 and STAR, are also compared. The, non-stranded, Illumina TruSeq kit can be adapted to generate strand specific libraries and can be used to access detailed information on the transcriptome. The RiboZero kit is very effective in removing ribosomal RNA from total RNA and the STAR aligner produces high mapping yield in a short time. Strand specific data gives more detailed and correct results than does non-stranded data as we show when estimating expression values and in assembling transcripts. Even well annotated genomes need improvements and corrections which can be achieved using strand specific data. Researchers in the field should strive to use strand specific data; it allows for more confidence in the data analysis and is less likely to lead to false conclusions. If faced with analysing non-stranded data, researchers should be well aware of the caveats of that approach.
QC bibliography QC xrefs