Betacoronavirus Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)
Program, Pipeline Name or Method Name:
Infernal, R-scape, cm-builder
Program, Pipeline or Method version:
Infernal 1.1.2, R-scape 1.5.16, cm-builder (https://github.com/dincarnato/labtools)
Materials & Methods (Description and/or Program Settings):
This download contains a zip file with the full results of an Infernal/R-scape/CaCoFold analysis of ScanFold-predicted structural motifs. The ScanFold approach defines specific motifs that are likely to be functional. All predicted motifs containing at least one base pair with an average z-score (Zavg) < -1 were analyzed. Two sets of motifs are included: the first is from a purely in silico ScanFold analysis (unconstrained), and the second is from an experimentally informed ScanFold analysis (AllTop10). The AllTop10 results contain motifs predicted with three RNA structure probing datasets considered during folding: the top 10% most reactive nucleotides were set to be unpaired during the ScanFold-Scan process, which informs the final model building by preventing highly reactive nucleotides from being paired. Each predicted motif was queried (using the Incarnato lab's cm-builder script) against ~25K coronavirus genomes using the Eddy lab's Infernal program in order to build a covariance model of the motif and iteratively search the coronavirus genomes for homologs. If a covariance model with homologs was successfully built, the resulting Stockholm alignment was tested with the R-scape program for evidence of statistically significant covariation. Base pairs with significant covariation (GTp test; E < 0.05) are highlighted in green. Additionally, all Stockholm alignments were used to build models with the CaCoFold algorithm, which uses positive and negative covariation signals to identify base pairs.
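The two selection rules described above (marking the top 10% most reactive nucleotides as unpaired, and keeping motifs with at least one base pair with Zavg < -1) can be illustrated with a minimal Python sketch. This is not the ScanFold or cm-builder code. The function names, the `pair_zavg` field, the rounding-down of the 10% cutoff, and the use of `x` for "forced unpaired" (borrowed from ViennaRNA-style constraint strings) are all assumptions made for illustration.

```python
import math


def top_reactive_positions(reactivities, fraction=0.10):
    """Return sorted 0-based indices of the most reactive nucleotides.

    Positions without data (None or NaN) are ignored. The number of
    selected positions is rounded down (an assumption of this sketch).
    """
    scored = [(r, i) for i, r in enumerate(reactivities)
              if r is not None and not math.isnan(r)]
    n_top = int(len(scored) * fraction)
    # Highest reactivity first; ties are broken by sequence position.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return sorted(i for _, i in scored[:n_top])


def constraint_string(length, unpaired):
    """Build a constraint line: 'x' = must stay unpaired, '.' = free."""
    unpaired = set(unpaired)
    return "".join("x" if i in unpaired else "." for i in range(length))


def motifs_passing_zavg(motifs, threshold=-1.0):
    """Keep motifs with at least one base pair whose Zavg < threshold.

    Each motif is a dict with a hypothetical 'pair_zavg' list holding
    one Zavg value per base pair.
    """
    return [m for m in motifs if any(z < threshold for z in m["pair_zavg"])]
```

For example, `constraint_string(len(seq), top_reactive_positions(reactivities))` would give a per-nucleotide constraint line for a sequence `seq`, and `motifs_passing_zavg(motifs)` would give the subset of motifs to carry forward to the covariance model search.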