Reinshagen J, Seashore-Ludlow B, Gadiya Y, Gustavsson AL, Tanoli Z, Aittokallio T, Huchting J, Jenmalm-Jensen A, Gribbon P, Zaliani A, Ballante F
Database (Oxford) 2025 (-) - [2025-01-18; online 2025-12-09]
In the rapidly advancing landscape of drug discovery and repurposing, efficient access to and integration of chemical and bioactivity data from public repositories have become essential. To address this need, we developed two complementary annotation pipelines (KNIME- and Python-based) that automate the extraction and integration of curated chemical and bioactivity data from public repositories. These pipelines support any user-provided compound library, enabling reproducible workflows that integrate data from heterogeneous sources such as ChEMBL and PubChem. As part of the REMEDi4ALL project, which aims to establish a European platform for drug repurposing, we validated our framework using a harmonized subset of the Specs repurposing collection, which includes >5000 compounds available at the partner institutes. We also developed two interactive dashboards that support multilayered analyses and visualization by integrating chemical properties, bioactivity profiles, and relational data. Our results demonstrate that this framework streamlines the collection of harmonized data and facilitates analyses that are critical for drug repurposing efforts, while remaining versatile for broader applications in drug discovery. Moreover, analysis of the annotations reveals that the Specs subset includes chemical scaffolds representative of a significant portion of approved drugs and compounds undergoing clinical evaluation, underscoring its potential as a rich source of drug repurposing candidates. Both pipeline protocols are publicly available online, and the dashboards are open access.
Chemical Biology Consortium Sweden [Technology development]
PubMed 41364066
DOI 10.1093/database/baaf081
Crossref 10.1093/database/baaf081
pmc: PMC12687465
pii: 8374761