Data quality of whole genome bisulfite sequencing on Illumina platforms.

Raine A, Liljedahl U, Nordlund J

PLoS ONE 13 (4) e0195972 [2018-04-18; online 2018-04-18]

The powerful HiSeq X sequencers with their patterned flowcell technology and fast turnaround times are instrumental for many large-scale genomic and epigenomic studies. However, assessment of DNA methylation by sodium bisulfite treatment results in sequencing libraries of low diversity, which may impact data quality and yield. In this report we assess the quality of WGBS data generated on the HiSeq X system in comparison with data generated on the HiSeq 2500 system and the newly released NovaSeq system. We report a systematic issue with low basecall quality scores assigned to guanines in the second read of WGBS when using certain Real Time Analysis (RTA) software versions on the HiSeq X sequencer, reminiscent of an issue that was previously reported with certain HiSeq 2500 software versions. However, with the HD.3.4.0/RTA 2.7.7 software upgrade for the HiSeq X system, we observed an overall improved quality and yield of the WGBS data generated, which in turn empowers cost-effective and high-quality DNA methylation studies.
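
The guanine-specific quality drop described in the abstract is the kind of pattern that shows up when basecall quality is summarised per nucleotide. The following is a minimal sketch of such a summary, not the authors' analysis: the function name `mean_quality_by_base` and the toy record are invented for illustration, and it assumes a four-line-per-record FASTQ with Phred+33 encoded qualities.

```python
from collections import defaultdict


def mean_quality_by_base(fastq_text, phred_offset=33):
    """Return {base: mean Phred score} over all reads in a FASTQ string.

    Hypothetical helper for illustration; assumes four lines per record
    and Phred+33 quality encoding.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    lines = fastq_text.strip().splitlines()
    # Each record is: header, sequence, '+' separator, quality string.
    for i in range(0, len(lines) - 3, 4):
        seq = lines[i + 1].strip()
        qual = lines[i + 3].strip()
        for base, q in zip(seq, qual):
            totals[base] += ord(q) - phred_offset
            counts[base] += 1
    return {b: totals[b] / counts[b] for b in counts}


# Toy read 2 record (invented data): every G carries a low score ('#' = Q2),
# every other base a high score ('I' = Q40).
example = "@read1/2\nACGTGGA\n+\nII#I##I\n"
result = mean_quality_by_base(example)
print(result)
```

Run on read 2 of a WGBS library, a markedly lower mean for `G` than for the other bases would be consistent with the issue reported here; a real assessment would also look at the distribution by cycle rather than a single mean.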

NGI Uppsala (SNP&SEQ Technology Platform) [Technology development]

PubMed 29668744

DOI 10.1371/journal.pone.0195972

Crossref 10.1371/journal.pone.0195972

PONE-D-17-37428

pmc PMC5905984

BioProject PRJNA407440 [raw sequence reads]

GEO GSE89213 [Splinted Ligation Adapter Tagging, a novel library preparation for whole genome bisulphite sequencing]