A novel canine reference genome resolves genomic architecture and uncovers transcript complexity.

Wang C, Wallerman O, Arendt ML, Sundström E, Karlsson Å, Nordin J, Mäkeläinen S, Pielberg GR, Hanson J, Ohlsson Å, Saellström S, Rönnberg H, Ljungvall I, Häggström J, Bergström TF, Hedhammar Å, Meadows JRS, Lindblad-Toh K

Commun Biol 4 (1) 185 [2021-02-10; online 2021-02-10]

We present GSD_1.0, a high-quality domestic dog reference genome with chromosome-length scaffolds and contiguity increased 55-fold over CanFam3.1. Annotation with newly generated and existing long- and short-read RNA-seq, miRNA-seq and ATAC-seq data revealed that 32.1% of lifted-over CanFam3.1 gaps harboured previously hidden functional elements in GSD_1.0, including promoters, genes and miRNAs. A catalogue of canine "dark" regions was made to facilitate mapping rescue. Alignment in these regions is difficult, but we demonstrate that they harbour trait-associated variation. Key genomic regions were completed, including the Dog Leucocyte Antigen (DLA), T Cell Receptor (TCR) and 366 COSMIC cancer genes. 10x linked-read sequencing of 27 dogs (19 breeds) uncovered 22.1 million SNPs, indels and larger structural variants. Subsequent intersection with protein-coding genes showed that 1.4% of these variants could directly influence gene products, and so provide a source of normal or aberrant phenotypic modifications.

Bioinformatics Support for Computational Resources [Service]

NGI Uppsala (SNP&SEQ Technology Platform) [Service]

NGI Uppsala (Uppsala Genome Center) [Service]

National Genomics Infrastructure [Service]

PubMed 33568770

DOI 10.1038/s42003-021-01698-x

Crossref 10.1038/s42003-021-01698-x

pii: 10.1038/s42003-021-01698-x
pmc: PMC7875987
