Alison Ziesel, Hosna Jabbari.
Abstract
SARS-CoV-2, the causative agent of COVID-19, is known to exhibit secondary structures in its 5’ and 3’ untranslated regions, along with the frameshifting stimulatory element situated between ORF1a and 1b. To identify additional regions containing conserved structures, we utilized a multiple sequence alignment with related coronaviruses as a starting point. We applied a computational pipeline developed for identifying non-coding RNA elements.
Introduction
SARS-CoV-2, the virus responsible for COVID-19, is a member of the genus Betacoronavirus and is a positive-sense, single-stranded RNA virus with a genome of 29,903 nucleotides. It has a possible zoonotic origin, and its most recent non-human host may have been a bat species. SARS-CoV-2 is capable of forming RNA secondary structure, in which an RNA molecule base-pairs with itself to form a non-linear structure.
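To make the idea of self-base-pairing concrete, the following minimal sketch reads a structure written in dot-bracket notation (a common text encoding in which matching parentheses mark paired positions and dots mark unpaired ones) and checks that every pair is a canonical or G-U wobble pair. The function names and the toy hairpin are illustrative assumptions and are not taken from the study.

```python
# Canonical Watson-Crick pairs plus the G-U wobble pair.
CANONICAL_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def parse_dot_bracket(structure):
    """Return sorted 0-based (i, j) index pairs encoded by a dot-bracket string."""
    open_positions = []
    pairs = []
    for index, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(index)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {index}")
            pairs.append((open_positions.pop(), index))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {index}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


def all_pairs_canonical(sequence, structure):
    """True if every paired position joins a canonical or G-U wobble pair."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    return all(
        (sequence[i], sequence[j]) in CANONICAL_PAIRS
        for i, j in parse_dot_bracket(structure)
    )


# Hypothetical toy hairpin (not a SARS-CoV-2 sequence): a 3 bp stem, 4 nt loop.
toy_sequence = "GGGAAAUCCC"
toy_structure = "(((....)))"
print(parse_dot_bracket(toy_structure))
print(all_pairs_canonical(toy_sequence, toy_structure))
```

This simple parser handles only nested structures; pseudoknots need additional bracket types and are outside the scope of the sketch.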
Materials and Methods:
The genomes of thirteen viruses, including the reference genome for SARS-CoV-2, were obtained from NCBI’s Nucleotide database.
Our analyses included viruses belonging to the genus Betacoronavirus, coronaviruses known to infect humans, and closely related coronaviruses with non-human hosts.
Guided by the previously constructed phylogenetic tree, MULTIZ-TBA produced aligned blocksets of sequence projected against a chosen reference genome, in this case SARS-CoV-2.
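As an illustration of what reference-projected alignment blocks look like in practice, the sketch below reads blocks in Multiple Alignment Format (MAF) and keeps only those that contain a chosen reference sequence. It assumes the blocksets are stored as MAF, the format usually associated with the MULTIZ/TBA tools; the sequence names, coordinates, and helper functions are hypothetical and are not taken from the study's scripts.

```python
from dataclasses import dataclass


@dataclass
class MafRow:
    source: str       # sequence name
    start: int        # 0-based start in the source sequence
    size: int         # number of non-gap bases in this row
    strand: str       # "+" or "-"
    source_size: int  # full length of the source sequence
    text: str         # aligned text, with "-" for gaps


def parse_maf(lines):
    """Group the 's' rows of a MAF file into blocks (one list of rows per block)."""
    blocks = []
    current = None
    for raw_line in lines:
        line = raw_line.strip()
        if line == "a" or line.startswith("a "):
            current = []          # an 'a' line opens a new alignment block
            blocks.append(current)
        elif line.startswith("s ") and current is not None:
            _, source, start, size, strand, source_size, text = line.split()
            current.append(
                MafRow(source, int(start), int(size), strand, int(source_size), text)
            )
        elif not line:
            current = None        # a blank line closes the block
    return blocks


def blocks_with_reference(blocks, reference):
    """Keep only the blocks in which the reference sequence is present."""
    return [block for block in blocks if any(row.source == reference for row in block)]


# Hypothetical toy input; names and coordinates are placeholders.
TOY_MAF = """##maf version=1
a score=10.0
s sars_cov_2 0 8 + 20 ATGGC-GTA
s bat_cov    3 9 + 25 ATGGCAGTA

a score=5.0
s bat_cov    12 4 + 25 GGCA
"""

blocks = parse_maf(TOY_MAF.splitlines())
for block in blocks_with_reference(blocks, "sars_cov_2"):
    for row in block:
        print(row.source, row.start, row.size, row.text)
```

Filtering on the reference name mirrors the idea of projection only loosely: a real projection step also orders blocks by reference coordinate and orients them to the reference strand.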
Discussion:
Forty subregions of the SARS-CoV-2 genome were predicted by our pipeline as very likely to contain RNA secondary structure. These predicted structures are concentrated toward the 5’ end and the 3’ third of the genome and overlap structures already known to exist in the SARS-CoV-2 genome, including portions of the 5’ UTR and the FSE, although they do not perfectly recapitulate those putative structures.
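One simple way to summarize where predicted subregions fall along the genome is to assign each to a third of the 29,903-nucleotide sequence by its midpoint and count the results. The sketch below does this; the interval coordinates are placeholders chosen for illustration and are not the forty subregions reported here, and the function name is likewise an assumption.

```python
from collections import Counter

GENOME_LENGTH = 29_903  # SARS-CoV-2 reference genome size given in the text


def genome_third(start, end, genome_length=GENOME_LENGTH):
    """Classify a 1-based inclusive interval by the third in which its midpoint lies."""
    if not 1 <= start <= end <= genome_length:
        raise ValueError("interval must lie within the genome")
    fraction = ((start + end) / 2) / genome_length
    if fraction <= 1 / 3:
        return "5' third"
    if fraction <= 2 / 3:
        return "middle third"
    return "3' third"


# Placeholder intervals for illustration only; not results from the study.
placeholder_regions = [(100, 250), (13400, 13550), (26000, 26200), (29500, 29800)]

counts = Counter(genome_third(start, end) for start, end in placeholder_regions)
for label in ("5' third", "middle third", "3' third"):
    print(label, counts[label])
```

Classifying by midpoint is a deliberate simplification; an interval that straddles a boundary could instead be split or counted in both thirds.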
Acknowledgments:
AZ would like to thank Morgan Cunningham for his advice regarding appropriate statistical analyses.
Citation: Ziesel A, Jabbari H (2024) Unveiling hidden structural patterns in the SARS-CoV-2 genome: Computational insights and comparative analysis. PLoS ONE 19(4): e0298164. https://doi.org/10.1371/journal.pone.0298164
Editor: Salman Sadullah Usmani, Albert Einstein College of Medicine, UNITED STATES
Received: October 6, 2023; Accepted: January 19, 2024; Published: April 4, 2024.
Copyright: © 2024 Ziesel, Jabbari. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Data and scripts employed in this study may be found at https://doi.org/10.5281/zenodo.8298680.
Funding: This study was financially supported by Microsoft Azure AI for Health (https://www.microsoft.com/en-us/research/project/ai-for-health) in the form of an award received by HJ. This study was also financially supported by Natural Sciences and Engineering Research Council of Canada (https://www.nserc-crsng.gc.ca) in the form of a NSERC Discovery grant (RGPIN-2020-04243) received by HJ. This study was also financially supported by National Research Council of Canada (https://nrc.canada.ca) in the form of a DHGA grant (DHGA-110-1) received by HJ. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.