Nordin A, Pagella P, Zambanini G, Cantù C
Nucleic Acids Res. 52 (7) e40 [2024-04-24; online 2024-03-19]
Genome-wide binding assays aspire to map the complete binding pattern of gene regulators. Common practice relies on replication (duplicates or triplicates) and high-stringency statistics to favor false negatives over false positives. Here we show that duplicates and triplicates of CUT&RUN are not sufficient to discover the entire activity of transcriptional regulators. We introduce ICEBERG (Increased Capture of Enrichment By Exhaustive Replicate aGgregation), a pipeline that harnesses large numbers of CUT&RUN replicates to discover the full set of binding events and chart the line between false positives and false negatives. We employed ICEBERG to map the full set of H3K4me3-marked regions, the targets of the co-factor β-catenin, and those of the transcription factor TBX3, in human colorectal cancer cells. The ICEBERG datasets allow benchmarking of individual replicates and comparison of the performance of peak calling and replication approaches, and they expose the arbitrary nature of strategies to identify reproducible peaks. Instead of a static view of genomic targets, ICEBERG establishes a spectrum of detection probabilities across the genome for a given factor, underlying the intrinsic dynamicity of its mechanism of action, and making it possible to distinguish frequent from rare regulation events. Finally, ICEBERG discovered instances, undetectable with other approaches, that underlie novel mechanisms of colorectal cancer progression.
Clinical Genomics Linköping [Service]
PubMed 38499482
DOI 10.1093/nar/gkae180
Crossref 10.1093/nar/gkae180
pmc: PMC11040144
pii: 7631395