An Interpretable Machine Learning Framework for Adverse Drug Reaction Prediction from Drug-target Interactions
Joseph Roberts-Nuttall, Alan M. Jones, Marco Castellani, Duc Pham
Abstract
Adverse drug reactions (ADRs) present challenges to patient safety and healthcare systems. Current pharmacovigilance methods, such as the Yellow Card Scheme (YCS), provide valuable post-marketing data, but the mechanistic causes of these ADRs are not fully understood. Leveraging drug-target interaction data with interpretable machine learning offers a promising approach to anticipate ADRs and understand their underlying mechanisms.
Introduction
Adverse drug reactions (ADRs) are unwanted or harmful reactions that occur when a drug is administered correctly, at the recommended dose, to the appropriate patient, and for its intended purpose [1]. ADRs pose a significant challenge to healthcare: they harm patient health and quality of life (QoL) and place a substantial economic strain on health systems. In the UK, ADRs account for ~16.5% of all in-patient hospital admissions and cost the NHS ~£2.2 billion annually. ADRs are commonly classified under the Rawlins-Thompson system as Type A or Type B.
Materials and Methods
Drug-target interaction data were obtained from STITCH v5.0 (accessed 16 January 2025). Data files for human interactions and chemical identifiers were downloaded, providing the interactions between drugs and human targets. Each interaction has a confidence score ranging from 0 (no confidence) to 1 (high confidence); interactions with missing data or no supporting evidence were conservatively assigned a score of 0.
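The scoring rule described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `build_score_matrix`, the record layout, and the drug and target labels are hypothetical, and keeping the highest score for duplicate records is an assumption made only for this example.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# One record per interaction: (drug, target, score).
# score is None when no supporting evidence is recorded (hypothetical layout).
Interaction = Tuple[str, str, Optional[float]]


def build_score_matrix(
    interactions: Iterable[Interaction],
    drugs: List[str],
    targets: List[str],
) -> Dict[str, Dict[str, float]]:
    """Return a drug -> target -> confidence mapping with 0.0 as the default."""
    # Every drug-target pair starts at 0, so absent pairs stay at 0.
    matrix = {drug: {target: 0.0 for target in targets} for drug in drugs}
    for drug, target, score in interactions:
        if drug not in matrix or target not in matrix[drug]:
            continue  # outside the chosen drug/target lists
        if score is None:
            continue  # missing evidence keeps the conservative default of 0
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range for {drug}/{target}: {score}")
        # Assumption for this sketch: duplicate records keep the highest score.
        matrix[drug][target] = max(matrix[drug][target], score)
    return matrix


# Hypothetical example records, not taken from STITCH.
example = [
    ("drug_a", "TARGET_1", 0.92),
    ("drug_a", "TARGET_2", None),
    ("drug_b", "TARGET_1", 0.40),
    ("drug_b", "TARGET_1", 0.75),
]
scores = build_score_matrix(example, ["drug_a", "drug_b"], ["TARGET_1", "TARGET_2"])
```

Here `scores["drug_a"]["TARGET_2"]` and `scores["drug_b"]["TARGET_2"]` both remain 0.0, one because its record carries no score and the other because no record exists at all.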
Discussion
The comparative analysis of the YCS and SIDER databases reveals a striking divergence in ADR signals. Only 3.3% of significant ADRs appear in both sources, compared with 7.0% and 8.3% found uniquely in the YCS and SIDER, respectively. The low Jaccard index (17.6%) further reinforces the minimal overlap between clinical trial and real-world datasets. This shows that the two sources capture different ADR profiles, consistent with prior studies’ findings. These differences stem from methodologies, populations, and data collection contexts, which are explored below.
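For readers unfamiliar with the Jaccard index quoted above, it is the size of the intersection of two sets divided by the size of their union. The Python sketch below shows the calculation under stated assumptions: the function names and the example ADR terms are hypothetical, and treating the three reported percentages as shares of one common pool of drug-ADR pairs is an assumption, not something the text confirms.

```python
from typing import Hashable, Set


def jaccard_index(a: Set[Hashable], b: Set[Hashable]) -> float:
    """Return |a & b| / |a | b|, or 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def jaccard_from_shares(shared: float, only_a: float, only_b: float) -> float:
    """Jaccard index from the shared share and the two source-specific shares."""
    total = shared + only_a + only_b
    if total <= 0:
        raise ValueError("shares must sum to a positive value")
    return shared / total


# Hypothetical ADR term sets, for illustration only.
ycs = {"nausea", "rash", "headache"}
sider = {"rash", "dizziness"}
overlap = jaccard_index(ycs, sider)  # 1 shared term out of 4 distinct terms

# Rounded percentages from the text: shared, YCS-only, SIDER-only.
approx = jaccard_from_shares(3.3, 7.0, 8.3)
```

With the rounded percentages, `approx` comes out at about 0.177, close to the reported 17.6%. The small gap presumably reflects rounding of the inputs.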
Acknowledgments
The research described in this paper was carried out with assistance from the Research Software Group, part of Advanced Research Computing at the University of Birmingham. See https://www.birmingham.ac.uk/bear-software for more details.
Citation: Roberts-Nuttall J, Jones AM, Castellani M, Pham D (2026) An interpretable machine learning framework for adverse drug reaction prediction from drug-target interactions. PLoS One 21(1): e0340900. https://doi.org/10.1371/journal.pone.0340900
Editor: Ali Awadallah Saeed, National University, SUDAN
Received: September 3, 2025; Accepted: December 28, 2025; Published: January 30, 2026.
Copyright: © 2026 Roberts-Nuttall et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All data and code used in this study are publicly available on GitHub at: https://github.com/Joeroberts1601/Random_Forest_ADR_Prediction. All relevant data are within the paper and its Supporting Information files.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.