Pharma Focus Asia

Partitioning stable and unstable expression level variation in cell populations: A theoretical framework and its application to the T cell receptor

Thiago S. Guzella, Vasco M. Barreto, Jorge Carneiro

Abstract

Phenotypic variation in the copy number of gene products expressed by cells or tissues has been the focus of intense investigation. To what extent the observed differences in cellular expression levels are persistent or transient is an intriguing question. Here, we develop a quantitative framework that resolves the expression variation into stable and unstable components. The difference between the expression means in two cohorts isolated from any cell population is shown to converge to an asymptotic value, with a characteristic time, τT, that measures the timescale of the unstable dynamics. The asymptotic difference in the means, relative to the initial value, measures the stable proportion of the original population variance. Empowered by this insight, we analysed the T-cell receptor (TCR) expression variation in CD4 T cells. About 70% of TCR expression variance is stable in a diverse polyclonal population, while over 80% of the variance in an isogenic TCR transgenic population is volatile. In both populations the TCR levels fluctuate with a characteristic time of 32 hours. This systematic characterisation of the expression variation dynamics, relying on time series of cohorts’ means, can be combined with technologies that measure gene or protein expression in single cells or in bulk.

Introduction

The phenotypic variation among organisms or cells is a theme of growing importance in biology. Macroscopic phenotypes, such as body structures or physiologic responses, have long been studied, but one phenotype particularly suitable for quantification that has received attention in recent decades is the amount of specific mRNAs and proteins expressed by single cells. Advances in genomics have allowed the analysis of genetic contributions to variation in gene expression, in terms of so-called expression quantitative trait loci (eQTL) [1, 2]. In this case, expression levels, typically assessed via mRNA levels, are treated as quantitative traits, and one is interested in the specific loci underlying variation in expression levels among different individuals. The increasing availability of single-cell resolution genomics, proteomics and metabolomics technologies has enabled molecular biologists to analyse cell lineages and tissues, showing that what were previously perceived as homogeneous cell populations are in fact complex mixtures of often transient and interchangeable cellular types and cellular states (see discussion in [3]). In parallel to these studies linking phenotypes to genotype, the literature on stochastic gene expression [4–8], reviewed in [9], has brought to light the variation in expression levels in isogenic cells, even when these are in the same cellular state and in the same environment. The variation is typically attributed to the “noise” resulting from the small copy number of molecules involved in the process.

Materials and methods

Ethics statement

This research project was ethically reviewed and approved by the Ethics Committee of the Instituto Gulbenkian de Ciência and by the Portuguese National Entity that regulates the use of laboratory animals (DGAV, Direção Geral de Alimentação e Veterinária; license reference: 0421/000/000/2013). All experiments conducted on animals followed the Portuguese (Decreto-Lei number 113/2013) and European (Directive 2010/63/EU) legislation concerning housing, husbandry and animal welfare.

Discussion

In this article, we introduce a new approach to analyse the variation in protein expression levels in a cell population, which enables measuring the characteristic dynamics of the fluctuations in cellular expression and estimating the magnitude of stable and unstable contributions to the variation across cells. The analysis is based on the realisation that the difference between the means of log-transformed expression levels in two selected cohorts isolated from a population of interest converges with approximately exponential dynamics to an asymptotic value. By normalising this asymptotic value by the difference in the cohorts’ means immediately after their isolation, one obtains an unbiased estimate of the proportion of population variance that is explained by the stable component, while the mean convergence time τT measures the timescale of the unstable component dynamics. This key insight stems from perceiving any cell population as a mixture of many independent subpopulations, each with a characteristic mean expression level that is fixed yet distributed among the subpopulations. Under these assumptions, the population variance equals the sum of the variance of the subpopulation means, which embodies the stable component of variation, and the variance of the expression level within the subpopulations, which represents the unstable component.
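
The cohort-convergence argument can be illustrated with a small simulation. The sketch below is not the authors' code (which is available from the repository listed under Data Availability); it is a toy model built on assumptions of ours: log-expression is the sum of a fixed, normally distributed subpopulation mean and a stationary Ornstein-Uhlenbeck fluctuation, total variance is normalised to 1, and the cohorts are the bottom and top 10% of cells at time zero. The function names, the cohort fraction and the parameter values (a stable fraction of 0.7 and a characteristic time of 32 hours, chosen only to echo the magnitudes quoted in the text) are all hypothetical.

```python
import numpy as np

def simulate_cohort_difference(stable_frac=0.7, tau=32.0, n_cells=200_000,
                               times=(0.0, 8.0, 16.0, 32.0, 64.0, 128.0, 512.0),
                               seed=0):
    """Difference between the means of a high and a low cohort over time.

    Toy model (assumption): x(t) = m + u(t), with m fixed per cell and
    u(t) a stationary Ornstein-Uhlenbeck process with relaxation time tau.
    """
    rng = np.random.default_rng(seed)
    unstable_var = 1.0 - stable_frac
    m = rng.normal(0.0, np.sqrt(stable_frac), n_cells)    # stable component
    u0 = rng.normal(0.0, np.sqrt(unstable_var), n_cells)  # unstable component at t = 0
    x0 = m + u0

    # Isolate the two cohorts from the expression levels at t = 0.
    lo_cut, hi_cut = np.quantile(x0, [0.1, 0.9])
    low, high = x0 <= lo_cut, x0 >= hi_cut

    diffs = []
    for t in times:
        rho = np.exp(-t / tau)
        # Exact OU transition from u0 to u(t). Each time point is drawn
        # independently given u0, which is sufficient for cohort means.
        noise = rng.normal(size=n_cells)
        ut = rho * u0 + np.sqrt(unstable_var * (1.0 - rho**2)) * noise
        xt = m + ut
        diffs.append(xt[high].mean() - xt[low].mean())
    return np.asarray(times, dtype=float), np.asarray(diffs)

def estimate_stable_fraction_and_tau(times, diffs, min_resid=0.1):
    """Estimate the stable variance fraction and the characteristic time.

    The last time point is taken as the asymptote (assumes t >> tau there).
    """
    times = np.asarray(times, dtype=float)
    diffs = np.asarray(diffs, dtype=float)
    stable_frac = diffs[-1] / diffs[0]
    # Normalised residual, expected to decay as exp(-t / tau) in this model.
    resid = (diffs - diffs[-1]) / (diffs[0] - diffs[-1])
    keep = resid > min_resid  # drop points dominated by sampling noise
    slope = np.polyfit(times[keep], np.log(resid[keep]), 1)[0]
    return stable_frac, -1.0 / slope

if __name__ == "__main__":
    t, d = simulate_cohort_difference()
    r, tau_hat = estimate_stable_fraction_and_tau(t, d)
    print(f"estimated stable fraction: {r:.3f}, estimated tau: {tau_hat:.1f} h")
```

In this Gaussian setting the expected difference in cohort means is the initial difference multiplied by R + (1 − R)·exp(−t/τ), where R is the stable fraction, so the ratio of the final to the initial difference recovers R up to sampling noise. Real expression data need not satisfy these distributional assumptions.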

Acknowledgments

We are grateful to Jocelyne Demengeot and Henrique Teotónio for their support during the development of this work and to Alberto Darszon and Vera Martins for reading an earlier version of this manuscript. We thank Rui Gardner, Telma Lopes and Cláudia Bispo for assistance with flow cytometry analysis and cell sorting.

Citation: Guzella TS, Barreto VM, Carneiro J (2020) Partitioning stable and unstable expression level variation in cell populations: A theoretical framework and its application to the T cell receptor. PLoS Comput Biol 16(8): e1007910. https://doi.org/10.1371/journal.pcbi.1007910

Editor: Martin Meier-Schellersheim, National Institutes of Health, UNITED STATES

Received: July 18, 2019; Accepted: April 24, 2020; Published: August 25, 2020

Copyright: © 2020 Guzella et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The code and data sets are freely available at the following URLs in the data repository of the Instituto Gulbenkian de Ciência: http://downloads.igc.gulbenkian.pt/jcarneir/GuzellaetalPLoSComputBiol_code.zip and http://downloads.igc.gulbenkian.pt/jcarneir/GuzellaetalPLoSComputBiol_data.zip

Funding: This work was supported by a grant from the Fundação para a Ciência e Tecnologia (FCT) (PTDC/BIA-BCM/108020/2008). TSG was supported by a fellowship from FCT (SFRH/BD/33572/2008). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.
