Theoretical Department, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia.
Department of Chemistry and Biochemistry, University of Minnesota, Duluth, USA.
Curr Comput Aided Drug Des. 2021;17(7):936-945. doi: 10.2174/1573409917666210202092646.
Coronaviruses comprise a group of enveloped, positive-sense single-stranded RNA viruses that infect humans as well as a wide range of animals. The study was performed on a set of 573 sequences belonging to SARS, MERS and SARS-CoV-2 (CoVID-19) viruses. The sequences were represented with alignment-free sequence descriptors and analyzed with different chemometric methods: Euclidean/Mahalanobis distances, principal component analysis and self-organizing maps (Kohonen networks). We report the cluster structures of the data. The sequences cluster well by virus type; however, some of them show a tendency to belong to more than one virus type.
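To make the idea of an alignment-free representation concrete, the sketch below maps each sequence to a fixed-length vector and compares two sequences by Euclidean distance, with no alignment step. The abstract does not say which descriptors the study used, so the normalized k-mer frequencies here are an assumed stand-in, and the function names, the choice k = 3, and the toy sequences are hypothetical.

```python
from itertools import product
from math import dist


def kmer_frequencies(seq, k=3, alphabet="ACGT"):
    """Return normalized k-mer frequencies in fixed lexicographic order.

    Assumption: k-mer frequencies stand in for the study's alignment-free
    descriptors, which the abstract does not specify.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing ambiguous bases such as N
            counts[word] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[w] / total for w in kmers]


def euclidean_distance(seq_a, seq_b, k=3):
    """Euclidean distance between the descriptor vectors of two sequences."""
    return dist(kmer_frequencies(seq_a, k), kmer_frequencies(seq_b, k))


# Toy sequences, invented for illustration (not taken from the 573-sequence set).
toy = {"a": "ACGTACGTACGT", "b": "ACGTACGTACGA", "c": "GGGGCCCCGGGG"}
d_ab = euclidean_distance(toy["a"], toy["b"])
d_ac = euclidean_distance(toy["a"], toy["c"])
print(f"d(a, b) = {d_ab:.3f}, d(a, c) = {d_ac:.3f}")
```

Because every sequence maps to a vector of the same length regardless of genome length, the resulting descriptor matrix can be passed directly to distance-based clustering, principal component analysis or a self-organizing map.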
This is a study of 573 genome sequences belonging to SARS, MERS and SARS-CoV-2 (CoVID-19) coronaviruses.
The aim was to compare the virus sequences, which originate from different places around the world.
The study used alignment-free sequence descriptors to represent the sequences and chemometric methods to analyze clusters.
The majority of the genome sequences cluster according to virus type, but some of them are outliers.
We identify 71 sequences that tend to belong to more than one cluster.