AI-first Structural Identification of Pathogenic Protein Target Interfaces

Mihkel Saluri, Michael Landreh, Patrick Bryant

Abstract

The risk of pandemics is increasing as global population growth and interconnectedness accelerate. Understanding the structural basis of protein-protein interactions between pathogens and hosts is critical for elucidating pathogenic mechanisms and guiding treatment or vaccine development. Despite 21,064 experimentally supported human-pathogen interactions in the HPIDB, only 52 have resolved structures in the PDB, representing just 0.2%. Advances in protein complex structure prediction, such as AlphaFold, now enable highly accurate modelling of heterodimeric complexes, though their application to host-pathogen interactions, which have distinct evolutionary dynamics, remains underexplored.

Introduction

During the recent SARS-CoV-2 pandemic, the importance of rapidly gaining insight into an emerging pathogen and its relationship with the host became clear. Knowledge of the interaction between the Spike protein and the human ACE2 receptor provided essential structural information for vaccine design and development. Had this information been available earlier, vaccines and treatments might have been developed and deployed faster, reducing the impact of the pandemic on society.

Methods

All heteromeric protein structures with a resolution below 5 Å, determined by X-ray crystallography or electron microscopy, were downloaded from the PDB on 2021-12-20. For these structures, Pfam domains and species were mapped via UniProtKB, keeping only structures with UniProtKB annotations. From this set, all structures that contain interacting sequences from at least two different superkingdoms and have different Pfam domains were selected.
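The selection criteria above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Chain` and `Structure` record types, the method labels, the example PDB identifiers and the function names are all hypothetical. The sketch also assumes that the chains listed for a structure have already been identified as interacting, and it reads "different Pfam domains" as two chains whose Pfam domain sets are not identical, which is only one possible interpretation.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import FrozenSet, List, Optional, Tuple

# Assumed labels for the two accepted experimental techniques.
ALLOWED_METHODS = {"X-RAY DIFFRACTION", "ELECTRON MICROSCOPY"}
MAX_RESOLUTION = 5.0  # in Å; structures must be strictly below this value


@dataclass(frozen=True)
class Chain:
    """Hypothetical per-chain annotation record."""
    uniprot_id: Optional[str]      # None when no UniProtKB annotation exists
    superkingdom: str
    pfam_domains: FrozenSet[str]


@dataclass(frozen=True)
class Structure:
    """Hypothetical per-structure record; chains are assumed to interact."""
    pdb_id: str
    resolution: float
    method: str
    chains: Tuple[Chain, ...]


def is_cross_superkingdom_pair(a: Chain, b: Chain) -> bool:
    """True if two chains differ in both superkingdom and Pfam domain set."""
    return a.superkingdom != b.superkingdom and a.pfam_domains != b.pfam_domains


def select_cross_superkingdom(structures: List[Structure]) -> List[Structure]:
    """Apply the resolution, method, annotation and superkingdom filters."""
    selected = []
    for s in structures:
        if s.method not in ALLOWED_METHODS or s.resolution >= MAX_RESOLUTION:
            continue
        # Only chains with a UniProtKB annotation are considered.
        annotated = [c for c in s.chains if c.uniprot_id is not None]
        if any(is_cross_superkingdom_pair(a, b)
               for a, b in combinations(annotated, 2)):
            selected.append(s)
    return selected


# Made-up example records, for illustration only.
human = Chain("P00001", "Eukaryota", frozenset({"PF00001"}))
human2 = Chain("P00002", "Eukaryota", frozenset({"PF00003"}))
virus = Chain("P00003", "Viruses", frozenset({"PF00002"}))
unannotated = Chain(None, "Bacteria", frozenset({"PF00004"}))

examples = [
    Structure("1AAA", 2.5, "X-RAY DIFFRACTION", (human, virus)),          # kept
    Structure("2BBB", 6.0, "X-RAY DIFFRACTION", (human, virus)),          # resolution too low
    Structure("3CCC", 3.0, "SOLUTION NMR", (human, virus)),               # method not accepted
    Structure("4DDD", 2.0, "ELECTRON MICROSCOPY", (human, human2)),       # same superkingdom
    Structure("5EEE", 2.0, "ELECTRON MICROSCOPY", (human, unannotated)),  # partner lacks annotation
]

selected_ids = [s.pdb_id for s in select_cross_superkingdom(examples)]
print(selected_ids)
```

In this sketch a structure is kept as soon as one annotated chain pair satisfies both conditions. A stricter reading that requires every chain pair to differ would replace `any` with `all`.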

Discussion

With the advent of highly accurate structure prediction, exemplified by AlphaFold2, it has become possible to systematically expand structural knowledge across a wide range of organisms. This technological leap opens entirely new prospects for rational vaccine and drug development by enabling rapid identification of potential therapeutic targets. In this study, we present an AI-guided framework for host-pathogen structure prediction, aimed at uncovering novel interactions of functional and clinical relevance.

Citation: Saluri M, Landreh M, Bryant P (2025) AI-first structural identification of pathogenic protein target interfaces. PLoS Comput Biol 21(6): e1013168. https://doi.org/10.1371/journal.pcbi.1013168

Editor: Jeffrey Skolnick, Georgia Institute of Technology, UNITED STATES OF AMERICA

Received: March 11, 2025; Accepted: May 26, 2025; Published: June 26, 2025.

Copyright: © 2025 Saluri et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All data and code used to produce the results here are freely available in this GitLab repository: https://gitlab.com/patrickbryant1/hpopt.

Funding: This study was supported by the SciLifeLab & Wallenberg Data Driven Life Science Program (grant: KAW 2020.0239, P.B.). Computational resources were enabled by the supercomputing resource Berzelius provided by the National Supercomputer Centre at Linköping University and the Knut and Alice Wallenberg Foundation with project ids berzelius-2021-29, Berzelius-2023-267, Berzelius-2024-78 and Berzelius-2024-292 (P.B.). M.L. is supported by a Karolinska Institutet faculty-funded Career Position, a Cancerfonden Project grant, the Swedish Research Council (VR) Research Environment Grant, a Consolidator Grant from the Swedish Society for Medical Research (SSMF), and the Knut and Alice Wallenberg Foundation (2022.0032). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.