GAN-enhanced Machine Learning and Metabolic Modeling Identify Reprogramming in Pancreatic Cancer

Tahereh Razmpour, Masoud Tabibian, Arman Roohi, Rajib Saha    

Abstract

Pancreatic ductal adenocarcinoma is one of the deadliest forms of cancer, presenting significant clinical challenges due to poor prognosis and limited treatment options. Understanding the metabolic reprogramming that drives this disease is crucial for identifying new therapeutic targets and improving patient outcomes. We developed a novel computational framework integrating genome-scale metabolic modeling with machine learning to identify metabolic signatures and therapeutic vulnerabilities in pancreatic cancer.

Introduction

Pancreatic ductal adenocarcinoma (PDAC) is a highly aggressive cancer with a poor prognosis, largely due to late-stage diagnosis and limited treatment options. The majority of PDAC patients (80–85%) present with locally advanced or metastatic disease at diagnosis, when curative surgical resection is no longer feasible. This delayed detection sharply reduces survival: 5-year survival rates fall from approximately 32% for localized disease to only 3% for metastatic disease.

Materials and Methods

We obtained gene expression data for pancreatic ductal adenocarcinoma (PDAC) from The Cancer Genome Atlas (TCGA) database. To ensure specificity, we reviewed the annotations and pathology reports for 183 cases and selected only those explicitly classified as ductal adenocarcinoma. This process yielded 144 PDAC cases and 4 non-neoplastic pancreatic tissue samples, which served as our control group. Importantly, these 4 control samples are adjacent non-neoplastic tissue from patients who also provided cancer samples, allowing patient-matched comparisons that control for individual biological variation (age, sex, genetic background, race, and environmental factors).
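To make the scale of the class imbalance in this cohort concrete, the short Python sketch below computes it from the two sample counts reported above. Only the counts (144 and 4) come from the text; the label strings and variable names are illustrative assumptions.

```python
from collections import Counter

# Sample counts reported in the text: 144 PDAC cases, 4 non-neoplastic controls.
# The label strings "PDAC" and "control" are illustrative, not taken from TCGA.
labels = ["PDAC"] * 144 + ["control"] * 4

counts = Counter(labels)

# Number of tumor samples per control sample.
imbalance_ratio = counts["PDAC"] / counts["control"]

# Share of the whole cohort made up by controls.
control_fraction = counts["control"] / len(labels)

print(f"PDAC: {counts['PDAC']}, control: {counts['control']}")
print(f"tumor-to-control ratio: {imbalance_ratio:.0f}:1")
print(f"control fraction: {control_fraction:.1%}")
```

With roughly 36 tumor samples for every control, a classifier trained on the raw data could reach high accuracy by predicting the majority class alone, which is why some form of rebalancing is needed before model training.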

Discussion

Our study presents a novel integrated approach combining genome-scale metabolic modeling with machine learning techniques to investigate metabolic reprogramming in pancreatic ductal adenocarcinoma (PDAC). This comprehensive analysis has revealed several key insights into PDAC metabolism and highlighted potential therapeutic targets.

The development of our GAN-based synthetic data generation method represents a significant advancement in addressing the persistent challenge of data imbalance in cancer research. Unlike previous studies that relied on traditional oversampling techniques or limited datasets, our approach generates biologically relevant synthetic samples while maintaining the complex relationships inherent in gene expression data.
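For contrast, the sketch below illustrates the kind of traditional oversampling that the paragraph above refers to: random oversampling, which duplicates minority-class samples until the classes are balanced. It is not the GAN-based method developed in this study. The function name, label strings, and toy data are hypothetical; only the 144 and 4 sample counts come from the text.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows (sampling with replacement) until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Original rows belonging to this class.
        pool = [r for r, lab in zip(rows, labels) if lab == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy stand-in for an expression matrix: 148 samples x 10 made-up "genes".
# The values are random numbers, not real expression data.
data_rng = random.Random(42)
X = [tuple(data_rng.gauss(0.0, 1.0) for _ in range(10)) for _ in range(148)]
y = ["PDAC"] * 144 + ["control"] * 4

X_bal, y_bal = random_oversample(X, y)

# Count how many distinct control profiles exist after balancing.
n_unique_controls = len({row for row, lab in zip(X_bal, y_bal) if lab == "control"})

print(Counter(y_bal))
print("distinct control profiles:", n_unique_controls)
```

After balancing, the control class has 144 rows but still only 4 distinct profiles, so no new biological variation has been introduced. This limitation is what motivates a generative approach that produces new samples instead of copies.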

Acknowledgments

We gratefully acknowledge Dr. Adil Alsiyabi and Andrea Goertzen for their invaluable guidance and support. We also thank the High-Performance Computing Center (HCC) at the University of Nebraska-Lincoln for providing essential computational resources for this work.

Citation: Razmpour T, Tabibian M, Roohi A, Saha R (2026) GAN-enhanced machine learning and metabolic modeling identify reprogramming in pancreatic cancer. PLoS Comput Biol 22(1): e1013862. https://doi.org/10.1371/journal.pcbi.1013862

Editor: Sunil Laxman, Institute for Stem Cell Science and Regenerative Medicine, INDIA

Received: July 30, 2025; Accepted: December 21, 2025; Published: January 2, 2026.

Copyright: © 2026 Razmpour et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All code and materials used for this study are available at https://github.com/ssbio/PDAC.

Funding: This study was supported by an NIGMS MIRA Award 5R35GM143009 to RS. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.