PREHOST: Host prediction of coronaviridae family using machine learning.

Authors

Chaturvedi Anusha, Borkar Kushal, Priyakumar U Deva, Vinod P K

Affiliation

International Institute of Information Technology, Hyderabad, Telangana, 500032, India.

Publication

Heliyon. 2023 Feb;9(2):e13646. doi: 10.1016/j.heliyon.2023.e13646. Epub 2023 Feb 11.

DOI:10.1016/j.heliyon.2023.e13646
PMID:36816252
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9922161/
Abstract

Coronaviruses are zoonotic viruses capable of transmitting infection from animals to humans, and one recently caused a pandemic. In such circumstances, it is essential to understand the virus's origin. In this study, we present a novel machine-learning pipeline for host prediction in the family Coronaviridae. We leverage the complete viral genome and protein-level sequences (spike, membrane, and nucleocapsid proteins). Compared with current state-of-the-art approaches, the random forest model attained a high accuracy of 99.91% and a recall of 0.98 on genome sequences. Our study shows that, in addition to spike protein sequences, membrane and nucleocapsid protein sequences can be used to predict the host of a virus. We also identified important sites in the viral sequences that help distinguish between host classes. The host prediction pipeline can serve as a valuable tool for taking effective measures to control the transmission of future viruses.
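
The abstract reports accuracy and recall for a classifier trained on sequence data but does not say how sequences are turned into features or how recall is averaged across host classes. The sketch below is one possible reading, not the authors' pipeline: it assumes normalized k-mer frequencies as features and macro-averaged recall as the metric, and the function names and the host labels `human` and `bat` are hypothetical. It also shows why accuracy can sit well above recall when host classes are imbalanced.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=3):
    """Normalized k-mer frequencies over the alphabet ACGT.

    Assumption: k-mer features are a common choice for sequence
    classifiers; the abstract does not state the paper's featurization.
    """
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[kmer] for kmer in vocabulary)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[kmer] / total for kmer in vocabulary]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted host matches the true host."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed averaging scheme)."""
    recalls = []
    for label in sorted(set(y_true)):
        indices = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(y_pred[i] == label for i in indices)
        recalls.append(hits / len(indices))
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced labels: 8 human-host samples, 2 bat-host samples.
y_true = ["human"] * 8 + ["bat"] * 2
y_pred = ["human"] * 8 + ["bat", "human"]

features = kmer_frequencies("ACGTACGT", k=2)
print(len(features))                 # 16 possible 2-mers
print(accuracy(y_true, y_pred))      # 0.9
print(macro_recall(y_true, y_pred))  # 0.75
```

In this toy case a single misclassified bat sample lowers accuracy only to 0.9 but pulls macro recall down to 0.75, because the minority class counts as much as the majority class in the average.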

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6fd1/9957704/a5d6b14d3157/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6fd1/9957704/445c23fb8d14/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6fd1/9957704/4bb9750b5279/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6fd1/9957704/ba290343df9e/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6fd1/9957704/9f3dd5b782f1/gr5.jpg

Similar articles

1
PREHOST: Host prediction of coronaviridae family using machine learning.
Heliyon. 2023 Feb;9(2):e13646. doi: 10.1016/j.heliyon.2023.e13646. Epub 2023 Feb 11.
2
Predicting the animal hosts of coronaviruses from compositional biases of spike protein and whole genome sequences through machine learning.
PLoS Pathog. 2021 Apr 20;17(4):e1009149. doi: 10.1371/journal.ppat.1009149. eCollection 2021 Apr.
3
Machine learning methods accurately predict host specificity of coronaviruses based on spike sequences alone.
Biochem Biophys Res Commun. 2020 Dec 10;533(3):553-558. doi: 10.1016/j.bbrc.2020.09.010. Epub 2020 Sep 18.
4
Comparative analysis of nucleocapsid and surface glycoprotein sequences.
Front Biosci (Landmark Ed). 2020 Jun 1;25(10):1894-1900. doi: 10.2741/4883.
5
Classification of SARS-CoV-2 viral genome sequences using Neurochaos Learning.
Med Biol Eng Comput. 2022 Aug;60(8):2245-2255. doi: 10.1007/s11517-022-02591-3. Epub 2022 Jun 7.
6
Prediction of virus-host infectious association by supervised learning methods.
BMC Bioinformatics. 2017 Mar 14;18(Suppl 3):60. doi: 10.1186/s12859-017-1473-7.
7
Machine Learning Approach Effectively Predicts Binding Between SARS-CoV-2 Spike and ACE2 Across Mammalian Species - Worldwide, 2021.
China CDC Wkly. 2021 Nov 12;3(46):967-972. doi: 10.46234/ccdcw2021.235.
8
Predicting host tropism of influenza A virus proteins using random forest.
BMC Med Genomics. 2014;7 Suppl 3(Suppl 3):S1. doi: 10.1186/1755-8794-7-S3-S1. Epub 2014 Dec 8.
9
Properties of Coronavirus and SARS-CoV-2.
Malays J Pathol. 2020 Apr;42(1):3-11.
10
SARS-CoV-2 host prediction based on virus-host genetic features.
Sci Rep. 2022 Mar 17;12(1):4576. doi: 10.1038/s41598-022-08350-6.

Cited by

1
The effect of taxonomic, host-dependent features and sample bias on virus host prediction using machine learning and short sequence k-mers.
Sci Rep. 2025 Aug 27;15(1):31592. doi: 10.1038/s41598-025-17123-w.
