Mapping Genes for Human Face Shape: Exploration of Univariate Phenotyping Strategies

Meng Yuan, Seppe Goovaerts, Michiel Vanneste, Harold Matthews, Hanne Hoskens, Stephen Richmond, Ophir D. Klein, Richard A. Spritz, Benedikt Hallgrimsson, Susan Walsh, Mark D. Shriver, John R. Shaffer, Seth M. Weinberg, Hilde Peeters, Peter Claes.

Abstract

Human facial shape, while strongly heritable, involves both genetic and structural complexity, necessitating precise phenotyping for accurate assessment. Common phenotyping strategies include simplifying 3D facial features into univariate traits such as anthropometric measurements (e.g., inter-landmark distances), unsupervised dimensionality reductions (e.g., principal component analysis (PCA) and auto-encoder (AE) approaches), and assessing resemblance to particular facial gestalts (e.g., syndromic facial archetypes).

Introduction

Human facial development is highly complex, resulting in a rich diversity of facial appearances both within and among populations. Furthermore, facial features have a strong genetic basis, readily apparent within families. The genome-wide association scan (GWAS) is an agnostic approach designed to investigate the statistical relationship between phenotypic traits and genetic variants. A typical GWAS involves individually testing millions of single nucleotide polymorphisms (SNPs) or other common variants dispersed across the genome. Because the precise location of SNPs and genes is known, GWAS signals showing strong evidence of association can point to genes of interest.
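The per-variant testing described above can be illustrated with a minimal sketch. It assumes a simulated cohort, an additive genotype coding (0, 1, or 2 copies of an allele), a simple linear regression without covariates, and a normal approximation for the p-value; the function name `snp_association` and all data are hypothetical and do not reproduce the analysis pipeline of this study.

```python
import math
import random

def snp_association(genotypes, phenotype):
    """Regress a univariate phenotype on allele dosage (0/1/2).

    Returns (slope, z statistic, two-sided p-value). The p-value uses a
    normal approximation, which is an assumption suited to large samples.
    """
    n = len(phenotype)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotype) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        return 0.0, 0.0, 1.0  # monomorphic SNP carries no information
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotype))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_g
    rss = sum((y - intercept - beta * g) ** 2 for g, y in zip(genotypes, phenotype))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:
        return beta, math.inf, 0.0  # perfect fit, only possible in toy data
    z = beta / se
    return beta, z, math.erfc(abs(z) / math.sqrt(2))

# Simulated toy data: 20 SNPs, 500 individuals, one SNP with a true effect.
random.seed(0)
n_samples, n_snps, causal = 500, 20, 3
genos = [
    [sum(random.random() < 0.3 for _ in range(2)) for _ in range(n_samples)]
    for _ in range(n_snps)
]
pheno = [0.8 * genos[causal][i] + random.gauss(0, 1) for i in range(n_samples)]

# Test every SNP individually, then correct for the number of tests.
results = [snp_association(g, pheno) for g in genos]
threshold = 0.05 / n_snps  # Bonferroni for this toy example only
hits = [j for j, (_, _, p) in enumerate(results) if p < threshold]
print("SNPs passing the threshold:", hits)
```

With millions of variants the same correction logic leads to much stricter thresholds; 5e-8 is the conventional genome-wide significance level, and real analyses typically also adjust for covariates such as ancestry.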

Methods

We have complied with all relevant ethical regulations for work with human participants. Institutional review board (IRB) approval was obtained at each recruitment site, and all participants gave written informed consent prior to participation; for children, written consent was obtained from a parent or legal guardian.

We used a subset from the syndromic face dataset in our previous work [45], where it was originally applied for a syndrome classification task.

These control images were used to determine whether the average syndromic images were significantly different from those of the healthy controls for each syndrome group.
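One common way to ask whether a group average differs significantly from that of controls is a permutation test on the distance between group means. The sketch below is an assumption for illustration only: the text does not specify which test was used, the shape vectors are simulated, and the function names are hypothetical.

```python
import math
import random

def mean_vector(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def mean_distance(group_a, group_b):
    """Euclidean distance between the two group mean vectors."""
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ma, mb)))

def permutation_p_value(group_a, group_b, n_perm=1000, seed=0):
    """Estimate how often a random relabelling of individuals gives a
    mean difference at least as large as the observed one."""
    rng = random.Random(seed)
    observed = mean_distance(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mean_distance(pooled[:n_a], pooled[n_a:]) >= observed:
            count += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (count + 1) / (n_perm + 1)

# Simulated 6-dimensional shape descriptors (hypothetical data).
rng = random.Random(1)
controls = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(40)]
syndromic = [[rng.gauss(1.0, 1.0) for _ in range(6)] for _ in range(15)]

p_shifted = permutation_p_value(syndromic, controls)
print(f"permutation p-value: {p_shifted:.4f}")
```

Because the test only relabels individuals, it makes no distributional assumption about the shape descriptors, which is why it is a frequent choice for high-dimensional shape data.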

Discussion

In this study, we evaluated and compared different techniques for extracting univariate facial phenotypes in humans, quantified from 3D facial images. Traditional anthropometric traits, such as inter-landmark distances, demonstrated the highest mean heritability, suggesting that they capture genetically determined aspects of shape variation well. While the set of inter-landmark distances yielded a relatively high number of GWAS loci compared to a similarly sized set of traits from a different phenotyping category, the total number of loci identified was ultimately limited by the number of available landmarks.
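The dependence on the number of landmarks follows from simple counting: n landmarks give at most n(n-1)/2 distinct pairwise distances. A minimal sketch, assuming hypothetical landmark names and coordinates (in millimetres) rather than the landmark set used in this study:

```python
import itertools
import math

def euclidean(p, q):
    """Straight-line distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def inter_landmark_distances(landmarks):
    """Return {(name_a, name_b): distance} for every unordered landmark pair."""
    return {
        (a, b): euclidean(landmarks[a], landmarks[b])
        for a, b in itertools.combinations(sorted(landmarks), 2)
    }

# Hypothetical nasal landmark coordinates (x, y, z) for one face.
face = {
    "alare_left": (-17.0, 0.0, 0.0),
    "alare_right": (17.0, 0.0, 0.0),
    "pronasale": (0.0, 5.0, 20.0),
    "subnasale": (0.0, -8.0, 8.0),
}

traits = inter_landmark_distances(face)
nose_width = traits[("alare_left", "alare_right")]
n = len(face)
print(f"{len(traits)} traits from {n} landmarks; nose width = {nose_width:.1f} mm")
```

Each distance is one univariate trait per individual, so the size of the trait set, and with it the room for discovering further loci, is capped by how many landmarks have been placed.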

Acknowledgments

We are extremely grateful to all the individuals and families who took part in this study, the midwives for their help in recruiting them, and the entire teams at ALSPAC, KU Leuven, and the universities of Pittsburgh, IUI, and Penn State, which include interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists, and nurses.

We acknowledge the use of ChatGPT v3.5 (https://chat.openai.com/) for English language editing. More specifically, ChatGPT v3.5 was used to check English spelling and grammar, without changing meaning or adding content.

Citation: Yuan M, Goovaerts S, Vanneste M, Matthews H, Hoskens H, Richmond S, et al. (2024) Mapping genes for human face shape: Exploration of univariate phenotyping strategies. PLoS Comput Biol 20(12): e1012617. https://doi.org/10.1371/journal.pcbi.1012617

Editor: Xin He, University of Chicago Pritzker School of Medicine, UNITED STATES OF AMERICA

Received: April 7, 2024; Accepted: November 5, 2024; Published: December 2, 2024.

Copyright: © 2024 Yuan et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The genotype data of the 3DFN dataset are accessible via the dbGaP controlled access repository (http://www.ncbi.nlm.nih.gov/gap) at accession number phs000949.v1.p1. The phenotype data, represented as 3D facial surfaces in .obj format, are available through the FaceBase Consortium (https://www.facebase.org) at accession number FB00000491.01. Access to these 3D facial surface models requires proper institutional ethics approval and approval from the FaceBase data access committee. The FaceBase portion of the syndromic face database, “Developing 3D Craniofacial Morphometry Data and Tools to Transform Dysmorphology”, was collected at patient support groups in the USA, Canada, and the UK; its facial images are available through FaceBase (https://www.facebase.org/chaise/record/#1/isa:dataset/accession=FB00000861). The Peter Hammond’s legacy 3D dysmorphology dataset and the Penn State University (PSU) and Indiana University Indianapolis (IUI) datasets were not collected with broad data-sharing consent. Given the highly identifiable nature of both facial and genomic information and unresolved issues regarding risks to participants of reidentification, participants were not consented for inclusion in public repositories or the posting of individual data. This restriction is not because of any personal or commercial interests. Further information about access to the raw 3D facial images and/or genomic data can be obtained from the respective ethics committees: the Ethics Committee Research UZ/KU Leuven ([email protected]), the PSU IRB ([email protected]), and the IUI IRB ([email protected]) for the Peter Hammond’s legacy, PSU, and IUI datasets, respectively. For the ALSPAC (UK) data, please note that the study website contains details of all the data that is available through a fully searchable data dictionary and variable search tool (http://www.bristol.ac.uk/alspac/researchers/our-data/). Genome-wide genotyping data was generated by Sample Logistics and Genotyping Facilities at Wellcome Sanger Institute and LabCorp (Laboratory Corporation of America) using support from 23andMe. All relevant source data for future replications are provided online (https://doi.org/10.6084/m9.figshare.24867063). This includes: the facial template, nasal landmark labels, the mesh simplification scheme used in AE models, the list of genetic loci associated with nose and face shape, the GO biological processes based on the union set of lead SNPs from all groups of phenotypes, and the LocusZoom plots for each significant SNP based on different phenotyping methods. An example of a LocusZoom plot can be found in Fig G in S1 File.

Code availability: KU Leuven provides the MeshMonk v0.0.6 spatially dense facial-mapping software, free to use for academic purposes (https://github.com/TheWebMonks/meshmonk). MATLAB R2017b implementations of the hierarchical spectral clustering to obtain nasal segmentation are available from a previous publication (https://doi.org/10.6084/m9.figshare.7649024). Code for training AE models is available at https://github.com/mm-yuan/autoencoder_3dface. The analyses in this work were based on functions in MATLAB R2022b, Python v3.7.8, MeshMonk v0.0.6, MeshLab v2020.03, LDSC v1.0.1, and GREAT v4.0.4.

Funding: The KU Leuven research team (P.C., M.Y., S.G.) and analyses were supported by the Research Fund KU Leuven (BOF-C1, C14/20/081), and the Research Foundation-Flanders (FWO, G0D1923N). This work was funded in part by grants from the National Institute of Dental and Craniofacial Research: R01-DE027023 (S.M.W., J.R.S., P.C.) and U01DE024440 (R.A.S., O.D.K., B.H.). The UK Medical Research Council and Wellcome (Grant ref: 217065/Z/19/Z) and the University of Bristol provide core support for ALSPAC. A comprehensive list of grant funding is available on the ALSPAC website (http://www.bristol.ac.uk/alspac/external/documents/grant-acknowledgements.pdf). Funding for the collection of 3D face shape scans was specifically provided by the MRC and Wellcome Trust (092731) and the University of Cardiff. This publication is the work of the authors, and they will serve as guarantors for the contents of this paper. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.