Linked-read sequencing enables haplotype-resolved resequencing at population scale.

Lutgen D, Ritter R, Olsen RA, Schielzeth H, Gruselius J, Ewels P, García JT, Shirihai H, Schweizer M, Suh A, Burri R

Mol Ecol Resour 20 (5) 1311-1322 [2020-09-00; online 2020-06-29]

The ability to sequence the entire genome of virtually any organism provides unprecedented insights into the evolutionary history of populations and species. Nevertheless, many population genomic inferences - including the quantification and dating of admixture, introgression and demographic events, and the inference of selective sweeps - are still limited by the lack of high-quality haplotype information. The newest generation of sequencing technology now promises significant progress. To establish the feasibility of haplotype-resolved genome resequencing at population scale, we investigated the properties of linked-read sequencing data from songbirds of the genus Oenanthe across a range of sequencing depths. Our results, based on the comparison of downsampled (25×, 20×, 15×, 10×, 7×, and 5×) with high-coverage data (46-68×) of seven bird genomes mapped to a reference, suggest that phasing contiguities and accuracies adequate for most population genomic analyses can already be reached with moderate sequencing effort. At 15× coverage, phased haplotypes span about 90% of the genome assembly, with 50% and 90% of phased sequences located in phase blocks longer than 1.25-4.6 Mb (N50) and 0.27-0.72 Mb (N90), respectively. Phasing accuracy exceeds 99% from 15× coverage upward. Higher coverages yielded higher contiguities (up to about 7 Mb/1 Mb [N50/N90] at 25× coverage), but only marginally improved phasing accuracy. Phase block contiguity improved with input DNA molecule length; thus, higher-quality DNA may help keep sequencing costs down. In conclusion, even for organisms with gigabase-sized genomes like birds, linked-read sequencing at moderate depth opens an affordable avenue towards haplotype-resolved genome resequencing at population scale.
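
The abstract reports phase block contiguity as N50 and N90, that is, the block length at or above which 50% or 90% of the phased sequence lies. The following is a minimal sketch of how such a statistic can be computed from a list of phase block lengths. The function name `nx` and the example lengths are illustrative assumptions and are not taken from the study's data or pipeline.

```python
def nx(block_lengths, fraction):
    """Return the Nx statistic for a collection of phase block lengths.

    The result is the length L such that blocks of length >= L together
    contain at least `fraction` of the total phased sequence
    (fraction=0.5 gives N50, fraction=0.9 gives N90).
    """
    if not block_lengths:
        raise ValueError("block_lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Walk through blocks from longest to shortest, accumulating length
    # until the requested share of the total has been covered.
    ordered = sorted(block_lengths, reverse=True)
    target = fraction * sum(ordered)
    covered = 0
    for length in ordered:
        covered += length
        if covered >= target:
            return length
    return ordered[-1]


# Hypothetical phase block lengths in base pairs (not real data).
blocks = [4_000_000, 3_000_000, 1_500_000, 1_000_000, 500_000]
n50 = nx(blocks, 0.5)
n90 = nx(blocks, 0.9)
print(f"N50 = {n50 / 1e6:.2f} Mb, N90 = {n90 / 1e6:.2f} Mb")
```

Because N90 requires covering more of the total, it is never larger than N50 for the same set of blocks, which matches the ordering of the ranges given in the abstract.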

Bioinformatics Support for Computational Resources [Service]

NGI Stockholm (Genomics Applications) [Collaborative]

NGI Stockholm (Genomics Production) [Collaborative]

National Genomics Infrastructure [Collaborative]

PubMed 32419391

DOI 10.1111/1755-0998.13192

Crossref 10.1111/1755-0998.13192

