Neil S. Zheng, Cosby A. Stone, Lan Jiang, Christian M. Shaffer, V. Eric Kerchberger, Cecilia P. Chung, QiPing Feng, Nancy J. Cox, C. Michael Stein, Dan M. Roden, Joshua C. Denny, Elizabeth J. Phillips, Wei-Qi Wei
Understanding the contribution of genetic variation to drug response can improve the delivery of precision medicine. However, genome-wide association studies (GWAS) for drug response are uncommon and are often hindered by small sample sizes.
Genome-wide association studies (GWAS) have contributed substantially to precision medicine, providing critical insights into the physiological and pathophysiological mechanisms of human complex traits and diseases.
Materials and methods
Identifying adverse drug reactions in EHRs
For a given patient, allergy sections across all their clinical notes were extracted as free text. The data in an allergy section is often semi-structured (e.g., pcn [rash] and sulfa [itching]), but formatting can vary depending on the healthcare provider who entered the data.
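As an illustration of the semi-structured format described above, the following minimal sketch pulls drug and reaction pairs out of a free-text allergy string. It is not the authors' extraction pipeline: the `DRUG_ALIASES` map, the `ENTRY_PATTERN` regular expression, and the `parse_allergy_section` function are hypothetical names introduced here, and the sketch assumes each reaction appears in square brackets directly after the drug mention.

```python
import re

# Hypothetical abbreviation map for illustration only; the lexicon used
# in the study is not described in this section.
DRUG_ALIASES = {"pcn": "penicillin", "sulfa": "sulfonamide"}

# Assumed entry shape: a drug mention followed by a bracketed reaction,
# e.g. "pcn [rash]". Real allergy sections vary more than this.
ENTRY_PATTERN = re.compile(r"([A-Za-z][A-Za-z \-]*?)\s*\[([^\]]*)\]")


def parse_allergy_section(text):
    """Return a list of (drug, reaction) pairs found in an allergy string."""
    pairs = []
    for match in ENTRY_PATTERN.finditer(text):
        drug = match.group(1).strip().lower()
        # Drop a leading conjunction left over from "x [..] and y [..]".
        drug = re.sub(r"^(and|or)\s+", "", drug)
        reaction = match.group(2).strip().lower()
        # Expand known abbreviations; unknown mentions pass through as-is.
        pairs.append((DRUG_ALIASES.get(drug, drug), reaction))
    return pairs


if __name__ == "__main__":
    print(parse_allergy_section("pcn [rash] and sulfa [itching]"))
```

Entries that do not follow the bracketed pattern are silently skipped in this sketch, so a production approach would need additional rules or a more flexible model to handle the formatting variation between providers.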
Genotyping and SNP imputation
Genotyping was performed on the Infinium Multi-Ethnic Genotyping Array (MEGAchip). We excluded DNA samples with: (1) a per-individual call rate < 95%; (2) wrongly assigned sex; (3) a cryptic relationship closer than third-degree (proportion identity by descent ≥ 0.25); or (4) unexpected duplication.
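The four sample exclusion criteria can be expressed as a simple filter over per-sample quality metrics. The sketch below is illustrative rather than the authors' code: the `SampleQC` record, its field names, and the `exclusion_reasons` function are assumptions, and the metrics themselves (call rate, genetically inferred sex, maximum pairwise proportion identity by descent, duplicate flag) are assumed to have been computed beforehand by a genotype QC tool. Only the two thresholds come from the text.

```python
from dataclasses import dataclass

MIN_CALL_RATE = 0.95  # from the text: exclude call rate < 95%
MAX_PI_HAT = 0.25     # from the text: exclude proportion IBD >= 0.25


@dataclass
class SampleQC:
    """Hypothetical per-sample QC record; field names are assumptions."""
    sample_id: str
    call_rate: float    # fraction of genotypes called for this sample
    reported_sex: str   # sex recorded in the clinical record
    genetic_sex: str    # sex inferred from genotype data
    max_pi_hat: float   # highest pairwise proportion IBD with any other sample
    is_duplicate: bool  # flagged as an unexpected duplicate


def exclusion_reasons(sample):
    """Return the list of criteria under which a sample would be excluded."""
    reasons = []
    if sample.call_rate < MIN_CALL_RATE:
        reasons.append("low_call_rate")
    if sample.reported_sex != sample.genetic_sex:
        reasons.append("sex_mismatch")
    if sample.max_pi_hat >= MAX_PI_HAT:
        reasons.append("cryptic_relatedness")
    if sample.is_duplicate:
        reasons.append("duplicate")
    return reasons


def passing_samples(samples):
    """Keep only samples that trigger none of the exclusion criteria."""
    return [s for s in samples if not exclusion_reasons(s)]
```

This sketch flags every sample whose maximum pairwise value meets the relatedness threshold. How one member of a related pair is chosen for retention is not stated in this section, so that decision is left out.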
All statistical analyses were performed with PLINK 2.0. This study included 81,739 individuals from the Vanderbilt University Medical Center’s BioVU DNA Biobank: GWAS data from 67,323 individuals with self-reported European ancestry, and trans-ethnic validation data from 14,416 individuals with self-reported African ancestry.
In this study, we present a high-throughput and scalable approach to conduct large-scale, genome-wide analyses for adverse drug reactions. Our framework can be adapted or shared between institutions, helping facilitate collaboration between sites. Utilizing EHRs allowed us to study ADRs in individuals with diverse clinical and ethnic backgrounds under the conditions of routine clinical care.
Citation: Zheng NS, Stone CA, Jiang L, Shaffer CM, Kerchberger VE, Chung CP, et al. (2021) High-throughput framework for genetic analyses of adverse drug reactions using electronic health records. PLoS Genet 17(6): e1009593. https://doi.org/10.1371/journal.pgen.1009593
Editor: Gregory M. Cooper, HudsonAlpha Institute for Biotechnology, UNITED STATES
Received: November 3, 2020; Accepted: May 10, 2021; Published: June 1, 2021.
Copyright: This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Data Availability: Data cannot be shared publicly because it includes confidential genetic and electronic health record data.
Funding: The study was supported by National Institutes of Health, under grant numbers R01 HL133786 (W-QW), R35 GM131770 (CMS), P50 GM115305 (JCD, EJP, DMR), and R01 HG010863 (EJP).
Competing interests: The authors have declared that no competing interests exist.