Exploring the Potential of Artificial Intelligence (AI) in Pharma for Drug Discovery and Development
Kasturi Pawar, Director, Formulation Development, Exelixis, Inc
Artificial intelligence (AI) is transforming pharmaceutical research by accelerating drug discovery, improving development efficiency, and enabling precision medicine. Applications span target identification, de novo molecular design, ADMET prediction, clinical trial optimization, manufacturing control, and post-market surveillance. Despite remarkable potential, challenges remain in data quality, transparency, and regulatory integration. This article reviews key AI-driven strategies, case studies, and future perspectives in pharmaceutical innovation.

Drug discovery and development are complex, costly, and time-intensive processes, with average timelines exceeding a decade and costs running into the billions of dollars. High attrition rates, particularly during clinical trials, underscore the need for innovative approaches. Artificial intelligence (AI), leveraging machine learning (ML), natural language processing (NLP), and deep learning, has emerged as a disruptive tool in biopharma. By integrating vast biomedical, chemical, and clinical datasets, AI can enhance efficiency, reduce failures, and support personalized medicine.
AI in Early-Stage Discovery
The early stages of drug discovery are critical yet often represent a major bottleneck in pharmaceutical research. Traditional approaches to target identification, compound screening, and drug repurposing are labor-intensive, costly, and prone to high failure rates. AI has emerged as a powerful enabler, offering data-driven methods that systematically integrate vast biomedical datasets and generate meaningful insights with greater efficiency. By leveraging machine learning and generative algorithms, AI accelerates the identification of therapeutic targets, enables the design of novel compounds with optimized properties, and uncovers new uses for existing drugs. Collectively, these capabilities strengthen early-stage discovery by reducing timelines, lowering costs, and improving the probability of success in downstream development.
Target Identification and Validation: AI algorithms mine genomics, proteomics, and transcriptomics data to identify novel disease targets. Network-based ML models predict disease-associated genes and protein–protein interactions.
De Novo Molecule Design: Generative AI (GANs, diffusion models) designs novel chemical structures with optimized binding affinity, solubility, and toxicity profiles. Example: DeepMind’s AlphaFold enabled accurate protein structure prediction, revolutionizing rational drug design.
Drug Repurposing: AI platforms analyze clinical and molecular data to find new indications for approved drugs, accelerating entry to market.
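
As an illustration of the network-based target identification described above, the following is a minimal sketch of a random walk with restart over a protein–protein interaction graph, one common way to rank genes by their proximity to known disease genes. The network, the gene names, and the parameter values are hypothetical placeholders chosen for illustration; they do not come from any real dataset or from the platforms discussed in this article.

```python
from collections import defaultdict


def random_walk_with_restart(edges, seeds, restart=0.3, iterations=50):
    """Score every node by its network proximity to the seed (known disease) genes."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = sorted(neighbors)

    # Restart distribution: uniform over the seed genes, zero elsewhere.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(iterations):
        # With probability `restart` the walker jumps back to a seed gene...
        nxt = {n: restart * p0[n] for n in nodes}
        # ...otherwise it moves to a uniformly chosen interaction partner.
        for n in nodes:
            share = (1.0 - restart) * p[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        p = nxt
    return p


# Hypothetical toy network; gene names are placeholders, not real biology.
edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_B", "GENE_C"),
    ("GENE_C", "GENE_D"),
    ("GENE_D", "GENE_E"),
    ("GENE_A", "GENE_C"),
]
seeds = {"GENE_A"}  # assumed known disease gene

scores = random_walk_with_restart(edges, seeds)
candidates = sorted((g for g in scores if g not in seeds), key=scores.get, reverse=True)
print(candidates)
```

Genes that sit close to the seed in the interaction graph receive higher scores and would be reviewed first. Production systems typically combine such proximity scores with omics evidence and learned models rather than relying on topology alone.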

AI in Preclinical Development
Preclinical development serves as a critical bridge between early discovery and clinical evaluation, aiming to establish the safety, efficacy, and pharmacokinetic profiles of candidate compounds. However, conventional preclinical methods are resource-intensive, time-consuming, and often limited in their predictive value for human outcomes. AI is increasingly applied to overcome these challenges by integrating large-scale experimental and computational data. Machine learning models can predict absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties, thereby reducing the reliance on extensive in vivo testing. In addition, AI facilitates biomarker discovery and enhances high-content imaging analysis, enabling deeper mechanistic insights and improved candidate prioritization. Together, these approaches streamline preclinical workflows, reduce attrition rates, and increase the likelihood of successful clinical translation.
ADMET Prediction: Machine learning predicts absorption, distribution, metabolism, excretion, and toxicity, reducing animal experiments. In silico ADMET modeling helps prioritize lead candidates.
Biomarker Discovery: AI integrates multi-omics datasets to identify predictive biomarkers for efficacy and toxicity. Supports patient selection strategies in oncology and rare diseases.
High-Content Imaging: Deep learning automates cellular imaging analysis for phenotypic screening. Example: Cell Painting assays integrated with AI identify subtle morphological changes in drug-treated cells.
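
To make the idea of in silico candidate prioritization concrete, here is a minimal sketch that screens compounds with Lipinski's rule of five, a classic rule-based heuristic for oral absorption. It is not a trained machine learning model and is far simpler than the ADMET predictors described above; it only illustrates how computed descriptors can be used to rank leads before any animal work. The compound names and descriptor values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mol_weight: float       # molecular weight, g/mol
    log_p: float            # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(c: Compound) -> int:
    """Count how many of the four rule-of-five criteria the compound breaks."""
    return sum([
        c.mol_weight > 500,
        c.log_p > 5,
        c.h_bond_donors > 5,
        c.h_bond_acceptors > 10,
    ])


def prioritize(compounds, max_violations=1):
    """Keep compounds with at most `max_violations`, fewest violations first."""
    kept = [c for c in compounds if lipinski_violations(c) <= max_violations]
    return sorted(kept, key=lipinski_violations)


# Hypothetical candidates with made-up descriptor values.
library = [
    Compound("CPD-1", 350.0, 2.1, 2, 5),
    Compound("CPD-2", 620.0, 5.8, 3, 9),
    Compound("CPD-3", 510.0, 3.0, 1, 6),
]

shortlist = [c.name for c in prioritize(library)]
print(shortlist)
```

In practice, a learned model trained on experimental ADMET data would replace or supplement the fixed thresholds, but the workflow of scoring, filtering, and ranking candidates stays the same.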

AI in Clinical Development
Clinical development represents the most resource-intensive phase of drug research, with high costs, long timelines, and substantial risk of attrition. Traditional trial designs are often constrained by inefficient patient recruitment, limited adaptability, and delayed detection of safety signals, contributing to failures in late-stage programs. AI is increasingly applied to address these limitations by leveraging real-world data, electronic health records, and predictive modeling. It enhances patient stratification and recruitment through data-driven identification of eligible participants, supports adaptive trial design by enabling real-time protocol modifications, and improves safety monitoring through automated extraction of adverse events from unstructured clinical data. Together, these applications increase trial efficiency, reduce costs, and improve the likelihood of regulatory success.
Patient Stratification and Recruitment: Predictive models analyze electronic health records (EHRs) and genomic profiles to identify eligible participants. Reduces recruitment bottlenecks and improves trial diversity.
Adaptive Trial Design: AI-driven simulations enable real-time protocol adjustments (dose modifications, patient subgroup enrichment).
Safety Signal Detection: NLP extracts adverse events from unstructured clinical notes, reports, and patient forums. Enables earlier safety interventions.
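
The adverse event extraction step described above can be sketched, in a deliberately simplified form, as lexicon matching with basic negation handling. Real systems rely on trained NLP models and standardized medical vocabularies; the small lexicon, the negation cues, and the example note below are all hypothetical and serve only to show the shape of the task.

```python
import re

# Hypothetical mini-lexicon mapping surface phrases to a normalized term.
AE_LEXICON = {
    "nausea": "nausea",
    "vomiting": "vomiting",
    "rash": "rash",
    "headache": "headache",
    "shortness of breath": "dyspnea",
}
# Assumed negation cues; a real system would use a far richer set.
NEGATION_CUES = ("no", "denies", "without", "negative for")


def extract_adverse_events(note: str) -> list[str]:
    """Return normalized adverse event terms that are mentioned and not negated."""
    found = []
    # Split on sentence punctuation and contrastive "but" so negation stays local.
    for clause in re.split(r"[.;!?]|\bbut\b", note.lower()):
        for phrase, term in AE_LEXICON.items():
            match = re.search(r"\b" + re.escape(phrase) + r"\b", clause)
            if not match:
                continue
            preceding = clause[:match.start()]
            negated = any(
                re.search(r"\b" + re.escape(cue) + r"\b", preceding)
                for cue in NEGATION_CUES
            )
            if not negated and term not in found:
                found.append(term)
    return found


note = "Patient reports nausea and a mild rash. Denies headache but notes shortness of breath."
print(extract_adverse_events(note))
```

Even this toy version shows why negation matters: a note that says the patient "denies headache" should not generate a headache signal.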

AI in Manufacturing and Quality Control
In manufacturing, AI enhances process development and scale-up by predicting critical process parameters, optimizing batch operations, and supporting the transition toward continuous manufacturing. Digital twins and predictive models enable real-time simulation and control, reducing variability and ensuring compliance with Good Manufacturing Practices (GMP). By integrating AI into Process Analytical Technology (PAT) frameworks, manufacturers can achieve robust process monitoring, rapid fault detection, and automated process adjustments.
In quality control, AI provides capabilities far beyond traditional testing approaches. Machine vision systems can identify subtle surface defects in tablets and capsules with precision surpassing human inspection, while deep learning algorithms classify spectra from spectroscopic and chromatographic methods to detect impurities or deviations rapidly. Predictive analytics strengthen quality assurance by forecasting deviations before they occur, reducing batch failures and accelerating release testing. Importantly, AI-driven systems support regulatory requirements for data integrity, reproducibility, and traceability, aligning with the principles of Quality by Design (QbD).
Process Optimization: AI supports continuous manufacturing by predicting optimal process parameters.
Real-Time Quality Control: Sensor-based AI models detect batch variability, reducing failures.
Predictive Maintenance: AI forecasts equipment malfunctions, minimizing downtime.
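
As a simple point of reference for the sensor-based monitoring described above, the sketch below applies a classical statistical process control rule: readings outside the mean plus or minus three standard deviations of an in-control baseline are flagged. This is a conventional baseline technique rather than an AI model, and the tablet weights are hypothetical values chosen for illustration.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Derive lower and upper control limits from an in-control reference run."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - k * spread, center + k * spread


def flag_excursions(readings, limits):
    """Return the indices of readings that fall outside the control limits."""
    lower, upper = limits
    return [i for i, x in enumerate(readings) if x < lower or x > upper]


# Hypothetical tablet weights (mg) from a reference run assumed to be in control.
baseline = [250.1, 249.8, 250.3, 250.0, 249.9, 250.2, 249.7, 250.0]
limits = control_limits(baseline)

# Hypothetical in-line measurements from a new batch.
new_batch = [250.1, 249.9, 251.5, 250.0, 248.2]
print(flag_excursions(new_batch, limits))
```

AI-based monitoring extends this idea to many correlated sensors at once and to patterns that fixed univariate limits cannot capture.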

AI in Post-Market Surveillance
Post-market surveillance is essential to ensure the long-term safety, efficacy, and optimal use of approved drugs once they reach broader patient populations. Traditional pharmacovigilance systems rely on spontaneous reporting, which is often incomplete and delayed, limiting timely detection of adverse events. AI leverages natural language processing (NLP), machine learning, and real-world data analytics to monitor drug performance and to detect adverse drug reactions from electronic health records, scientific literature, and social media, providing earlier safety signals. Moreover, AI-driven integration of real-world evidence (RWE) supports regulatory decision-making, label updates, and post-approval studies. Additionally, predictive analytics enable personalized treatment pathways, helping to optimize therapies for individual patient needs. Collectively, these applications strengthen pharmacovigilance, reduce risks, and enhance patient outcomes.
Pharmacovigilance: NLP identifies adverse drug reactions from scientific literature, regulatory databases, and social media.
Real-World Evidence (RWE): AI integrates data from insurance claims, registries, and wearable devices for post-approval monitoring.
Personalized Treatment Pathways: Predictive analytics support adaptive dosing and therapy personalization in oncology, metabolic, and neurodegenerative disorders.
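
Once adverse event mentions have been extracted and counted, a common next step in pharmacovigilance is disproportionality analysis. The sketch below computes the proportional reporting ratio (PRR), a standard screening statistic that compares how often an event is reported for one drug versus all other drugs. The report counts and the screening cutoffs are assumptions for illustration; actual thresholds vary by organization and database.

```python
def proportional_reporting_ratio(a: int, b: int, c: int, d: int) -> float:
    """
    PRR from a 2x2 table of spontaneous reports:
      a: drug of interest, event of interest
      b: drug of interest, all other events
      c: all other drugs,  event of interest
      d: all other drugs,  all other events
    """
    if a + b == 0 or c == 0:
        raise ValueError("PRR is undefined for these counts")
    return (a / (a + b)) / (c / (c + d))


def is_potential_signal(a, b, c, d, min_prr=2.0, min_cases=3) -> bool:
    """Flag a drug-event pair for human review (cutoffs are assumed defaults)."""
    return a >= min_cases and proportional_reporting_ratio(a, b, c, d) >= min_prr


# Hypothetical report counts, not taken from any real safety database.
prr = proportional_reporting_ratio(a=20, b=980, c=100, d=98900)
print(round(prr, 2), is_potential_signal(20, 980, 100, 98900))
```

A flagged pair is a prompt for clinical review, not proof of causation, since spontaneous reports are subject to under-reporting and reporting bias.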
Case Studies
The following case studies illustrate the potential of AI in drug discovery, preclinical development, and clinical development:
1. Exscientia: The company integrates patient-derived data, such as tissue samples, into its AI platforms to enhance target selection, molecule design, experimentation, and clinical assessment. Its AI-driven small molecule design is leading to clinical candidates in oncology.
2. BenevolentAI: BenevolentAI is a UK-based biotechnology company that combines AI with biomedical science to accelerate drug discovery and development. It applies proprietary machine learning, natural language processing, and knowledge graphs to analyze biomedical data. Its AI-enabled drug repurposing technology identified baricitinib as a treatment candidate for COVID-19.
3. Insilico Medicine: Insilico Medicine is a global AI-driven drug discovery and development company recognized for its end-to-end generative AI platform that spans target discovery, molecule design, and clinical development. The company integrates deep learning, reinforcement learning, and generative chemistry to accelerate the design of novel therapeutic candidates. A landmark achievement was the advancement of a generative AI-discovered molecule from target discovery to preclinical candidate nomination in under 18 months, followed by entry into human clinical trials, a process that traditionally takes several years.
Challenges and Limitations
Despite its transformative potential, the integration of artificial intelligence (AI) in pharmaceutical manufacturing and quality control faces several challenges that limit its widespread adoption:
1. Data Quality and Bias:
AI models rely heavily on large, high-quality datasets. Poorly annotated, incomplete, or biased datasets can lead to inaccurate predictions, misclassification of product quality, or flawed process optimization. Ensuring robust data curation, harmonization, and annotation remains a critical prerequisite.
2. Interpretability:
Many AI algorithms, particularly deep learning models, function as “black boxes,” making it difficult to understand the rationale behind predictions. Limited interpretability reduces trust among regulators and quality assurance teams, impeding acceptance in highly regulated pharmaceutical environments.
3. Integration Hurdles:
Pharmaceutical manufacturing pipelines often lack standardized frameworks for AI adoption. Integrating AI with existing Process Analytical Technology (PAT), Manufacturing Execution Systems (MES), and Good Manufacturing Practices (GMP) requires substantial infrastructure adaptation and cross-disciplinary expertise.
4. Ethical and Legal Issues:
AI applications in pharmaceuticals also raise ethical and legal concerns. Patient privacy must be safeguarded when using real-world or clinical datasets. Algorithmic fairness is essential to avoid biased outcomes, while intellectual property (IP) rights regarding AI-generated molecules or processes remain a subject of ongoing debate.
Future Perspectives
While current challenges constrain the widespread adoption of AI, several emerging developments are poised to accelerate its impact on pharmaceutical manufacturing and quality control:
1. Explainable AI (XAI)
Explainable AI approaches are being developed to enhance transparency and interpretability of complex models. By providing human-understandable rationales for predictions, XAI can improve trust among regulators, facilitate validation, and support decision-making in highly regulated environments.
2. Federated Learning
Federated learning offers a promising solution for privacy-preserving, multi-institutional collaborations. By enabling AI models to train across decentralized datasets without sharing sensitive patient or proprietary data, it fosters collaborative innovation while safeguarding data integrity and compliance.
3. Regulatory Frameworks
Regulatory bodies such as the U.S. Food and Drug Administration (FDA) and the European Medicines Agency (EMA) are actively developing AI-specific guidelines. These evolving frameworks aim to standardize model validation, data integrity, and lifecycle management, paving the way for broader regulatory acceptance of AI-enabled systems.
4. Integration with Advanced Technologies
The convergence of AI with multi-omics datasets (genomics, proteomics, metabolomics), quantum computing, and digital twins has the potential to redefine drug R&D. Multi-omics integration will enable holistic disease modeling, quantum computing may unlock solutions to complex molecular design problems, and digital twins of manufacturing systems could allow real-time simulation, optimization, and predictive control of processes.
Together, these advancements position AI not just as a supportive tool, but as a transformative force driving the next era of pharmaceutical innovation.
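
The federated learning approach described above can be illustrated with a minimal sketch of federated averaging: each site trains on its own data and shares only model parameters, which a coordinator averages in proportion to each site's sample count. The single-parameter linear model, the learning rate, and the two sites' data are hypothetical simplifications chosen so that the example stays short.

```python
def local_update(w, data, lr=0.01, steps=5):
    """Run a few gradient-descent steps on one site's private (x, y) pairs."""
    for _ in range(steps):
        # Gradient of mean squared error for the model y = w * x.
        grad = sum(2 * x * (w * x - y) for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(site_data, rounds=20):
    """Coordinate training: only weights leave each site, never raw records."""
    w = 0.0
    total = sum(len(d) for d in site_data)
    for _ in range(rounds):
        local_weights = [local_update(w, d) for d in site_data]
        # Weight each site's contribution by its number of samples.
        w = sum(len(d) * lw for d, lw in zip(site_data, local_weights)) / total
    return w


# Hypothetical private datasets at two institutions (both follow y = 2x).
site_a = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
site_b = [(4.0, 8.0), (5.0, 10.0)]

w = federated_average([site_a, site_b])
print(round(w, 4))
```

The shared model recovers the underlying relationship even though neither site sees the other's records. Real deployments usually add safeguards such as secure aggregation, because model updates themselves can leak information.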
Conclusion
AI is reshaping the pharmaceutical landscape, offering transformative opportunities across discovery, development, and delivery. While regulatory, ethical, and data challenges persist, the integration of AI promises faster, safer, and more cost-effective therapeutics. Cross-disciplinary collaborations between computational scientists, clinicians, and regulators will be essential to realize AI’s full potential.
References
1. Abramson J., Adler J., Dunger J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. 2024;630:493–500. doi:10.1038/s41586-024-07487-w.
2. Corso G., Stärk H., Jing B., Barzilay R., Jaakkola T. DiffDock: Diffusion steps, twists, and turns for molecular docking. arXiv preprint. 2022; arXiv:2210.01776.
3. Chandrasekaran S.N., Cimini B.A., Goodale A. et al. Three million images and morphological profiles of cells treated with matched chemical and genetic perturbations. Nat Methods. 2024;21:1114–1121. doi:10.1038/s41592-024-02241-6.
4. Cross R. Exscientia. C&EN Global Enterprise. 2017;95. doi:10.1021/cen-09544-cover6.
5. Stebbing J., Krishnan V., de Bono S. et al. Mechanism of baricitinib supports artificial intelligence–predicted testing in COVID-19 patients. EMBO Mol Med. 2020;12(8):e12697. doi:10.15252/emmm.202012697.
6. Xu Z., Ren F., Wang P. et al. A generative AI-discovered TNIK inhibitor for idiopathic pulmonary fibrosis: a randomized phase 2a trial. Nat Med. 2025;31:2602–2610. doi:10.1038/s41591-025-03743-2.