Arumay Pal, Yaakov Levy
DNA sequences are often recognized by multi-domain proteins that may have higher affinity and specificity than single-domain proteins. However, the higher affinity to DNA might be coupled with slower recognition kinetics. In this study, we address this balance between stability and kinetics for multi-domain Cys2His2- (C2H2-) type zinc-finger (ZF) proteins. C2H2 ZFs are the most prevalent DNA-binding domain in eukaryotes, and C2H2-type zinc-finger proteins (C2H2-ZFPs) constitute nearly one-half of all known and predicted transcription factors in humans. Extensive contact with DNA via tandem ZF domains confers high stability on the sequence-specific complexes. However, this can limit target search efficiency, especially for low-abundance ZFPs. Earlier, we found that an asymmetrical distribution of electrostatic charge among the three ZF domains of the low-abundance transcription factor Egr-1 facilitates its DNA search process. Here, on a diverse set of 273 human C2H2-ZFPs comprising 3–15 tandem ZF domains, we find that, in many cases, electrostatic charge and binding specificity are asymmetrically distributed among the ZF domains so that neighbouring domains have different DNA-binding properties. For proteins containing 3–6 ZF domains, we show that low-abundance proteins possess a higher degree of non-specific asymmetry, whereas high-abundance proteins possess a lower degree. Our findings suggest that where the electrostatics of tandem ZF domains are similar (i.e., symmetrical), the ZFPs are more abundant to optimize their DNA search efficiency. This study reveals new insights into the fundamental determinants of recognition by C2H2-ZFPs of their DNA binding sites in the cellular landscape. The importance of electrostatic asymmetry for binding site recognition by C2H2-ZFPs suggests that it may also be important in other ZFP systems and reveals a new design feature for zinc-finger engineering.
Multi-domain proteins are prevalent in eukaryotic systems and are involved in a variety of cellular functions[1,2]. The structural complexity of such proteins can assist in regulating binding via a network of protein–protein interactions. Multi-domain proteins that interact with DNA can be biologically useful to achieve higher specificity or tighter binding. In many cases, cooperation between the tethered domains of multi-domain transcription factors was reported to be crucial for efficient binding to the DNA promoter[4,5]. The cooperation between the tethered domains can also support a facilitated-dissociation mechanism from DNA[6,7].
Materials and methods
Dataset of zinc-finger proteins
The set of human C2H2-type ZF protein sequences used in this study was built by first searching the UniProt database with the following queries: “annotation type–Zinc-finger”, “organism–human”, “existence–evidence at protein level,” and “reviewed–yes”. These queries yielded 477 C2H2-type ZFPs, in which the number of ZF domains per protein varied widely, from 1 (e.g., human protein arginine N-methyltransferase 3, UniProt id: O60678) to 31 (e.g., zinc finger protein 142, UniProt id: P52746). Since a cluster of consecutive C2H2 ZF domains is thought to ‘canonically’ bind the major groove of DNA only when connected by short linkers[17,27], we further filtered to include only proteins whose ZF domains are connected by a linker shorter than nine residues. This filter produced a dataset of 273 unique ZFPs containing 3–15 C2H2-type ZF domains (S1 Table) and including 1911 ZF domains in total.
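The linker-length filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used in the study: the domain coordinates are taken to be 1-based inclusive (start, end) residue positions, and the function names, the choice to keep the longest tandem array, and the minimum of three domains per array are hypothetical choices made for the example.

```python
from typing import List, Tuple

# (start, end) residue positions of one ZF domain, 1-based and inclusive (assumed convention).
Domain = Tuple[int, int]

MAX_LINKER = 9  # from the text: linkers must be shorter than nine residues


def tandem_clusters(domains: List[Domain], max_linker: int = MAX_LINKER) -> List[List[Domain]]:
    """Split the ZF domains of one protein into runs joined by short linkers."""
    ordered = sorted(domains)
    if not ordered:
        return []
    clusters = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        linker = cur[0] - prev[1] - 1  # residues strictly between the two domains
        if linker < max_linker:
            clusters[-1].append(cur)   # short linker: same tandem array
        else:
            clusters.append([cur])     # long linker: start a new array
    return clusters


def longest_tandem_array(domains: List[Domain], min_domains: int = 3) -> List[Domain]:
    """Return the longest tandem array, or [] if it has fewer than min_domains domains.

    Keeping only the longest array and requiring at least three domains are
    assumptions made for this sketch.
    """
    best = max(tandem_clusters(domains), key=len, default=[])
    return best if len(best) >= min_domains else []


# Hypothetical coordinates: three domains joined by 5-residue linkers, plus a distant fourth.
example = [(10, 32), (38, 60), (66, 88), (120, 142)]
print(longest_tandem_array(example))
```

In this hypothetical example the first three domains form one tandem array, and the fourth is excluded because it is separated from the third by a long linker.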
In this study, we investigated the linkage between the degree of asymmetry in the tethered domains of multi-domain DNA-binding proteins and their cellular abundance. Asymmetry can be quantified by comparing various biophysical properties of each of the constituent domains. The properties of interest in this study were electrostatic charge and binding specificity, with the degree of asymmetry in their distributions across ZF domains expressed as their non-specific and specific binding asymmetries, respectively. We applied our study to a dataset of 273 human C2H2-ZFPs comprising 3–15 tandem ZF domains. Focusing on ZF domains is advantageous because they are a common motif, which permits statistical analysis, and because they have a relatively simple interface with DNA.
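To make the idea of comparing a property across constituent domains concrete, the following Python sketch computes a simple net charge for each ZF domain sequence and summarizes how unevenly that charge is spread. It is a stand-in for illustration only: the charge model (Lys and Arg counted as +1, Asp and Glu as −1, His treated as neutral) and the use of the population standard deviation as an asymmetry score are assumptions of this example, not the definition of non-specific asymmetry used in the study, and the sequences are made up.

```python
from statistics import pstdev
from typing import List

POSITIVE = set("KR")  # assumed +1 each
NEGATIVE = set("DE")  # assumed -1 each; His is treated as neutral in this sketch


def net_charge(seq: str) -> int:
    """Net charge of one domain sequence under the simple model above."""
    seq = seq.upper()
    return sum(aa in POSITIVE for aa in seq) - sum(aa in NEGATIVE for aa in seq)


def charge_asymmetry(domain_seqs: List[str]) -> float:
    """Spread of per-domain net charge; 0.0 means all domains carry the same charge.

    The standard deviation is a hypothetical summary chosen for illustration.
    """
    charges = [net_charge(s) for s in domain_seqs]
    return pstdev(charges) if len(charges) > 1 else 0.0


# Made-up three-domain examples: identical charges versus uneven charges.
symmetric = ["KKAD", "RKAE", "KRGD"]
asymmetric = ["KKKR", "KAAD", "DEAA"]
print(charge_asymmetry(symmetric), charge_asymmetry(asymmetric))
```

Under this toy score, a protein whose domains all carry the same net charge scores zero, and the score grows as neighbouring domains diverge in charge.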
YL is The Morton and Gladys Pickman professional chair in Structural Biology.
Citation: Pal A, Levy Y (2020) Balance between asymmetry and abundance in multi-domain DNA-binding proteins may regulate the kinetics of their binding to DNA. PLoS Comput Biol 16(5): e1007867. https://doi.org/10.1371/journal.pcbi.1007867
Editor: Shi-Jie Chen, University of Missouri, UNITED STATES
Received: November 25, 2019; Accepted: April 11, 2020; Published: May 26, 2020
Copyright: © 2020 Pal, Levy. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting Information files.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.