Mol Ecol Resour 21 (3) 880-896 [2021-04-00; online 2020-12-02]
Norway spruce (Picea abies (L.) Karst.) is one of the most important forest tree species in Europe, with significant economic and ecological impact. For decades, genomic and genetic studies of Norway spruce have been challenging because of its large, repetitive genome (19.6 Gb, of which more than 70% is repetitive). To accelerate genomic studies in Norway spruce and related species, including population genetics, genome-wide association studies (GWAS) and genomic selection (GS), we report here on the design and performance of a 50K single nucleotide polymorphism (SNP) genotyping array for Norway spruce. The array was developed from whole-genome resequencing (WGS) data, making it the first WGS-based SNP array for any conifer species. After identifying SNPs using genome resequencing data from 29 trees collected in northern Europe, we adopted a two-step approach to design the array. First, we built a 450K screening array and used it to genotype a population of 480 trees sampled from both natural and breeding populations across the Norway spruce distribution range. The genotypes from these samples were then used to select high-confidence probes for the final 50K array. The selected SNPs are distributed over 45,552 scaffolds from the P. abies version 1.0 genome assembly and target 19,954 unique gene models, with even coverage of the 12 linkage groups in Norway spruce. We show that the array has a 99.5% probe specificity, >98% Mendelian allelic inheritance concordance, an average sample call rate of 96.30% and an SNP call rate of 98.90% in family trios and haploid tissues. We also observed that 23,797 probes (50%) could be identified with high confidence in three other spruce species (white spruce [Picea glauca], black spruce [P. mariana] and Sitka spruce [P. sitchensis]). This high-quality genotyping array will be a valuable resource for genetic and genomic studies in Norway spruce as well as in other conifer species of the same genus.
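The quality metrics quoted above (sample call rate, SNP call rate and Mendelian inheritance concordance in trios) can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes biallelic SNPs coded as the count of alternate alleles (0, 1 or 2) with `None` for a missing call, and the function names and toy genotypes are hypothetical.

```python
from typing import List, Optional, Sequence

# Assumed coding: alternate-allele count 0, 1 or 2; None means "no call".
Genotype = Optional[int]

# Alleles (0 = reference, 1 = alternate) that a diploid parent with a
# given alternate-allele count can transmit to its offspring.
GAMETES = {0: {0}, 1: {0, 1}, 2: {1}}


def call_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of non-missing calls in one sample (row) or one SNP (column)."""
    return sum(g is not None for g in calls) / len(calls)


def snp_call_rates(matrix: Sequence[Sequence[Genotype]]) -> List[float]:
    """Per-SNP call rates for a samples-by-SNPs genotype matrix."""
    return [call_rate(column) for column in zip(*matrix)]


def mendelian_concordance(father: Sequence[Genotype],
                          mother: Sequence[Genotype],
                          child: Sequence[Genotype]) -> float:
    """Share of fully called trio sites where the child's genotype can be
    formed from one allele transmitted by each parent."""
    checked = 0
    consistent = 0
    for f, m, c in zip(father, mother, child):
        if f is None or m is None or c is None:
            continue  # skip sites with any missing call in the trio
        checked += 1
        possible = {a + b for a in GAMETES[f] for b in GAMETES[m]}
        if c in possible:
            consistent += 1
    return consistent / checked if checked else float("nan")


# Toy trio with five SNPs (made-up values, for illustration only).
father = [0, 1, 2, 2, None]
mother = [0, 1, 0, 2, 1]
child = [0, 2, 1, 0, 1]

sample_rates = [call_rate(s) for s in (father, mother, child)]
site_rates = snp_call_rates([father, mother, child])
concordance = mendelian_concordance(father, mother, child)
```

In this toy trio the fourth SNP is a Mendelian error (both parents are homozygous alternate but the child is homozygous reference), and the fifth is excluded from the concordance calculation because the father's call is missing.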