Long-read sequencing and optical mapping generates near T2T assemblies that resolves a centromeric translocation.

Ten Berk de Boer E, Ameur A, Bunikis I, Ek M, Stattin EL, Feuk L, Eisfeldt J, Lindstrand A

Sci Rep 14 (1) 9000 [2024-04-18; online 2024-04-18]

Long-read genome sequencing (lrGS) is a promising method in genetic diagnostics. Here we investigate the potential of lrGS to detect a disease-associated chromosomal translocation between 17p13 and the chromosome 19 centromere. We constructed two sets of phased and non-phased de novo assemblies: (i) based on lrGS only and (ii) hybrid assemblies combining lrGS with optical mapping, using lrGS reads with a median coverage of 34X. Variant calling detected both structural variants (SVs) and small variants, and the accuracy of small variant calling was compared with that of short-read genome sequencing (srGS). The de novo and hybrid assemblies had high quality and contiguity with an N50 of 62.85 Mb, enabling a near telomere-to-telomere assembly with fewer than 100 contigs per haplotype. Notably, we successfully identified the centromeric breakpoint of the translocation. A concordance of 92% was observed when comparing small variant calling between srGS and lrGS. In summary, our findings underscore the remarkable potential of lrGS as a comprehensive and accurate solution for the analysis of SVs and small variants. Thus, lrGS could replace a large battery of genetic tests that were used for the diagnosis of a single symptomatic translocation carrier, highlighting the potential of lrGS in the realm of digital karyotyping.

NGI Short read [Service]

NGI Stockholm (Genomics Production) [Service]

National Genomics Infrastructure [Service]

PubMed 38637641

DOI 10.1038/s41598-024-59683-3

Crossref 10.1038/s41598-024-59683-3

pmc: PMC11026446
pii: 10.1038/s41598-024-59683-3
