Stochastic ordering of complexoform protein assembly by genetic circuits

Mikkel Herholdt Jensen, Eliza J. Morris, Hai Tran, Michael A. Nash, Cheemeng Tan

Abstract

Top-down proteomics has enabled the elucidation of heterogeneous protein complexes with different cofactors, post-translational modifications, and protein membership. This heterogeneity is believed to play a previously unknown role in cellular processes. The different molecular forms of a protein complex have come to be called “complex isoforms” or “complexoforms”. Despite the elucidation of complexoforms, it remains unclear whether and how cellular circuits control the distribution of a complexoform. To help address this issue, we first simulate a generic three-protein complexoform to reveal how its distribution is controlled by the timing of gene transcription, mRNA translation, and protein transport. Overall, we ran 265 computational experiments, each averaged over 1,000 stochastic simulations. Based on these experiments, we show that genes arranged in a single operon, in a cascade, or as two operons give rise to different protein compositions of the complexoform because of timing differences in the order of protein synthesis. We also show that changes in the kinetics of expression, protein transport, or protein binding dramatically alter the distribution of the complexoform. Furthermore, both stochastic and transient kinetics control the assembly of the complexoform when expression and assembly occur concurrently. We test our model against the biological cellulosome system. With biologically relevant rates, we find that the genetic circuitry controls both the average final complexoform assembly and the variation in the assembly structure. Our results highlight the importance of both genetic circuit architecture and kinetics in determining the distribution of a complexoform. Our work has a broad impact on our understanding of non-equilibrium processes in both living and synthetic biological systems.

Introduction

Proteins are synthesized in specific orders to assemble large protein complexes, such as microtubules, proteasomes, ribosomes, and cellulosomes. These protein complexes are assembled both inside and outside cells through the coordination of gene expression, protein transport, and binding processes. Prior work has assumed that protein complexes have a homogeneous composition of protein members. Yet recent top-down proteomics shows that protein complexes can differ in cofactors, post-translational modifications, and protein membership [1–3]. The different molecular forms of a protein complex have come to be called complex isoforms or complexoforms [1]. For example, recent work shows that the yeast homotetrameric FBP1 complex can exist with 0 to 4 phosphorylated amino acids [4]. Bacterial cellulosomes are also found to exist in heterogeneous compositions [5–11]. Furthermore, a recent computational study [12] has investigated the formation of protein complexes using existing data on protein-protein interaction networks. That study shows that the composition of a protein complex can drift over time even when the simulation starts from the same initial condition, which suggests that other cellular mechanisms must exist to prevent the compositional drift of some protein complexes.

Methods

Modeling protein expression, export, and binding

Our computational model consists of a set of coupled biochemical and physical processes, starting with transcription and ending with the binding of protein products to 10 scaffold proteins, each of which has two docking sites, to form the final complexoform. In the first process, an RNA polymerase binds to a promoter to synthesize mRNA. This transcription step is followed by the translation of two different protein products, denoted X and Y. The proteins are then transported and bind to a scaffold protein to form a three-protein complexoform. The model, which is summarized schematically in Fig 1, also incorporates the degradation of mRNA and the diffusive loss of proteins after transport.
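To make the sequence of processes concrete, the following is a minimal sketch of how such a model could be simulated with a Gillespie-style stochastic algorithm. It is an illustration under stated assumptions, not the implementation used in this work: all rate constants and names (RATES, simulate, mean_distribution, delay_y) are hypothetical, binding is treated as irreversible for brevity, and a simple transcription delay for gene Y stands in for a circuit architecture in which Y is expressed after X.

```python
import random

# Illustrative rate constants in arbitrary time units.
# These are assumptions for the sketch, not values from the paper.
RATES = {
    "k_tx": 0.5,     # transcription per active gene
    "k_deg": 0.2,    # mRNA degradation per mRNA
    "k_tl": 1.0,     # translation per mRNA
    "k_exp": 0.5,    # transport per intracellular protein
    "k_loss": 0.1,   # diffusive loss per transported protein
    "k_bind": 0.05,  # binding per (transported protein, free docking site) pair
}
# Scaffold occupancy states (number of bound X, number of bound Y).
STATES = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]


def simulate(delay_y=0.0, t_end=100.0, n_scaffolds=10, rates=RATES, seed=None):
    """One stochastic run; returns {(nX, nY): number of scaffolds in that state}."""
    rng = random.Random(seed)
    mrna = {"X": 0, "Y": 0}
    inside = {"X": 0, "Y": 0}    # translated, not yet transported
    outside = {"X": 0, "Y": 0}   # transported, free to bind or be lost
    occ = {s: 0 for s in STATES}
    occ[(0, 0)] = n_scaffolds
    t = 0.0
    while True:
        events = []
        for g in ("X", "Y"):
            if g == "X" or t >= delay_y:  # gene Y switches on at delay_y (assumption)
                events.append((rates["k_tx"], "tx", g, None))
            events.append((rates["k_deg"] * mrna[g], "deg", g, None))
            events.append((rates["k_tl"] * mrna[g], "tl", g, None))
            events.append((rates["k_exp"] * inside[g], "exp", g, None))
            events.append((rates["k_loss"] * outside[g], "loss", g, None))
            for state, n in occ.items():
                free = 2 - sum(state)  # free docking sites on such a scaffold
                events.append((rates["k_bind"] * outside[g] * free * n, "bind", g, state))
        events = [e for e in events if e[0] > 0]
        total = sum(e[0] for e in events)
        dt = rng.expovariate(total)
        if t < delay_y < t + dt:
            # Propensities change at delay_y; restart the clock there
            # (valid because exponential waiting times are memoryless).
            t = delay_y
            continue
        t += dt
        if t >= t_end:
            return occ
        # Pick one event with probability proportional to its propensity.
        r = rng.random() * total
        for rate, kind, g, state in events:
            r -= rate
            if r < 0:
                break
        if kind == "tx":
            mrna[g] += 1
        elif kind == "deg":
            mrna[g] -= 1
        elif kind == "tl":
            inside[g] += 1
        elif kind == "exp":
            inside[g] -= 1
            outside[g] += 1
        elif kind == "loss":
            outside[g] -= 1
        else:  # irreversible binding to a free docking site (simplification)
            outside[g] -= 1
            nx, ny = state
            new_state = (nx + 1, ny) if g == "X" else (nx, ny + 1)
            occ[state] -= 1
            occ[new_state] += 1


def mean_distribution(n_runs=100, **kwargs):
    """Average the scaffold occupancy counts over independent seeded runs."""
    totals = {s: 0 for s in STATES}
    for i in range(n_runs):
        for s, n in simulate(seed=i, **kwargs).items():
            totals[s] += n
    return {s: v / n_runs for s, v in totals.items()}
```

Calling mean_distribution(delay_y=0.0) and mean_distribution(delay_y=30.0) and comparing the mean counts of the (2, 0), (1, 1), and (0, 2) states gives a rough picture of how synthesis order can shift the complexoform distribution in this toy setting. Adding an unbinding reaction would be needed to explore equilibrium-like behavior.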

Discussion

In summary, our work shows that the underlying genetic circuit architecture does modulate protein assembly. However, it is the interplay between the circuit architecture and the genetic and physical rate kinetics that together determines the protein assembly structure. We demonstrate two distinct behaviors of kinetic assembly: a slow equilibrium regime, in which the average assembly is well described by equilibrium statistical mechanics, and a fast non-equilibrium regime, in which the average assembly arrests before the system reaches equilibrium. In both regimes, the cumulative protein concentrations (i.e., the total amount of protein available to bind over time) determine the eventual complexoform distribution (S5 Appendix). Furthermore, we demonstrate that the two regimes can be regulated by tuning any of the kinetic rates involved in the protein expression and assembly process, whether biochemical or physical. The arrest of assembly into a non-equilibrium structure has previously been observed on much larger length scales, for example in macroscopic protein assemblies such as biopolymer networks, in which the kinetics of assembly can strongly affect the non-equilibrium assembly structure [31–34]. Our work shows that similar dynamic arrest can occur on a much smaller scale as well, in the assembly of protein complexes involving just a handful of individual proteins. The results highlight new mechanisms, in addition to restrictive or preferential binding, through which systems can control stochastic processes such as protein assembly. The results also underscore the importance of both kinetics and stochastic non-equilibrium behavior, in addition to the genetic circuit architecture, as modulators of protein assembly processes.

Citation: Jensen MH, Morris EJ, Tran H, Nash MA, Tan C (2020) Stochastic ordering of complexoform protein assembly by genetic circuits. PLoS Comput Biol 16(6): e1007997. https://doi.org/10.1371/journal.pcbi.1007997

Editor: James R. Faeder, University of Pittsburgh, UNITED STATES

Received: January 21, 2020; Accepted: May 28, 2020; Published: June 29, 2020

Copyright: © 2020 Jensen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All data are included in S1 Data.

Funding: The work is supported by Human Frontier Science Program (FR) (RGY0080/2015), California State University, Sacramento, Department of Physics and Astronomy Chien Hu Research Award Program, the Edwin L. Iloff Endowment, and NSF (1808237). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.