Hierarchical molecular tagging to resolve long continuous sequences by massively parallel sequencing.

Lundin S, Gruselius J, Nystedt B, Lexow P, Käller M, Lundeberg J

Sci Rep 3 1186 [2013-03-09; online 2013-03-09]

Here we demonstrate how short-read massively parallel sequencing systems can, in effect, achieve longer read lengths through hierarchical molecular tagging. We show how indexed and PCR-amplified targeted libraries are degraded, sub-sampled and arrested at timed intervals to obtain pools of differing average length, each of which is indexed with a new tag. Through this process, indices of sample origin, molecular origin, and degree of degradation are incorporated to form a nested hierarchical structure, which is later used in data processing to order the reads over a longer distance than the sequencing system originally allows. With this protocol we show how continuous regions beyond 3000 bp can be decoded by an Illumina sequencing system, and we illustrate potential applications by calling variants of the lambda genome, analysing TP53 in cancer cell lines, and targeting a variable canine mitochondrial region.
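
The nested index structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the `TaggedRead` type, its field names, the integer encoding of the degradation pool, and the assumption that a higher pool number corresponds to a read lying further along the original molecule are all hypothetical choices made for illustration. The sketch only shows the general idea of grouping reads by sample and molecule tags and then ordering them by the degradation tag.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class TaggedRead(NamedTuple):
    """A short read with its three (hypothetical) hierarchical tags."""
    sample_tag: str    # index of sample origin
    molecule_tag: str  # index of molecular origin within the sample
    pool_tag: int      # index of the timed degradation pool (assumed ordinal)
    sequence: str


def order_reads(reads: Iterable[TaggedRead]) -> Dict[Tuple[str, str], List[str]]:
    """Group reads by (sample, molecule) and order each group by pool tag.

    Assumption: the pool tag increases monotonically with the distance of
    the read from the fixed end of the original molecule.
    """
    groups: Dict[Tuple[str, str], List[TaggedRead]] = defaultdict(list)
    for read in reads:
        groups[(read.sample_tag, read.molecule_tag)].append(read)

    # sorted() is stable, so reads from the same pool keep their input order.
    return {
        key: [r.sequence for r in sorted(group, key=lambda r: r.pool_tag)]
        for key, group in groups.items()
    }


if __name__ == "__main__":
    example = [
        TaggedRead("S1", "M1", 2, "CCC"),
        TaggedRead("S1", "M1", 0, "AAA"),
        TaggedRead("S2", "M1", 1, "GGG"),
        TaggedRead("S1", "M1", 1, "TTT"),
    ]
    for key, seqs in order_reads(example).items():
        print(key, seqs)
```

Keying on the pair of sample and molecule tags keeps identical molecule tags from different samples apart, which is the point of the nesting. A real analysis would also need to handle tag sequencing errors and assemble overlapping reads, both of which this sketch leaves out.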

NGI Stockholm (Genomics Applications)

NGI Stockholm (Genomics Production)

National Genomics Infrastructure

PubMed 23470464

DOI 10.1038/srep01186

Crossref 10.1038/srep01186

pii: srep01186
pmc: PMC3592332