Visualization and analysis of medically relevant tandem repeats in nanopore sequencing of control cohorts with pathSTR.

De Coster W, Hoijer I, Bruggeman I, D'Hert S, Melin M, Ameur A, Rademakers R

Genome Res. - (-) - [2024-08-15; online 2024-08-15]

The lack of population-scale databases hampers research and diagnostics for medically relevant tandem repeats and repeat expansions. We attempt to fill this gap with our pathSTR web tool, which leverages long-read sequencing of large cohorts to determine repeat length and sequence composition in a healthy population. The current version includes 1040 individuals of the 1000 Genomes Project cohort sequenced on the Oxford Nanopore Technologies PromethION. A comprehensive set of medically relevant tandem repeats was genotyped using STRdust and LongTR to determine the tandem repeat length and sequence composition. PathSTR provides rich visualizations of this dataset and lets users upload their own data for comparison against the control cohort. We demonstrate this application using data from targeted nanopore sequencing of a patient with myotonic dystrophy type 1. This resource will give the genetics community a more complete overview of normal variation in tandem repeat length and sequence composition and thereby enable a better assessment of rare tandem repeat alleles observed in patients.

NGI Long read [Technology development]

NGI Uppsala (Uppsala Genome Center) [Technology development]

National Genomics Infrastructure [Technology development]

PubMed 39147583

DOI 10.1101/gr.279265.124

Crossref 10.1101/gr.279265.124

pii: gr.279265.124

