Classifying COVID-19 based on amino acids encoding with machine learning algorithms.

Authors

Alkady Walaa, ElBahnasy Khaled, Leiva Víctor, Gad Walaa

Affiliations

Department of Bioinformatics, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt.

Department of Information Systems, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt.

Publication information

Chemometr Intell Lab Syst. 2022 May 15;224:104535. doi: 10.1016/j.chemolab.2022.104535. Epub 2022 Mar 15.

DOI:10.1016/j.chemolab.2022.104535
PMID:35308181
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8923015/
Abstract

COVID-19 causes serious respiratory illness. Accurate identification of the viral infection cycle therefore plays a key role in designing appropriate vaccines. The risk of this disease depends on proteins that interact with human receptors. In this paper, we formulate a novel model for COVID-19 named "amino acid encoding based prediction" (AAPred). The model is accurate, classifies the various coronavirus types, and distinguishes SARS-CoV-2 from other coronaviruses. With AAPred, we reduce the number of features to enhance performance, selecting the most important ones by statistical criteria. The protein sequence of SARS-CoV-2 is analyzed to understand the viral infection cycle. Six machine learning classifiers (decision trees, k-nearest neighbors, random forest, support vector machine, bagging ensemble, and gradient boosting) are used to evaluate the model in terms of accuracy, precision, sensitivity, and specificity. We implement the obtained results computationally and apply them to real data from the National Genomics Data Center. The experimental results show that AAPred reduces the feature set to seven features. The average accuracy over 10-fold cross-validation is 98.69%, precision is 98.72%, sensitivity is 96.81%, and specificity is 97.72%. The features are selected using information gain and classified with random forest. The proposed model predicts the type of coronavirus and reduces the number of extracted features. We find that SARS-CoV-2 shares physicochemical characteristics with SARS-CoV in some regions. We also report that SARS-CoV-2 has infection cycles and sequences similar to those of SARS-CoV in some regions, indicating that vaccines may have an effect on SARS-CoV-2. A comparison with deep learning shows results similar to those of our method.
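The feature-selection and evaluation steps named in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. The abstract does not say which amino acid encoding AAPred uses, so the residue-frequency encoding in `composition` is an assumption, and the median split inside `information_gain` is a simplification chosen for brevity. All function names are hypothetical. The random forest classifier and the 10-fold cross-validation loop are omitted.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(seq):
    """Encode a protein sequence as 20 residue frequencies (assumed encoding)."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1  # avoid dividing by zero
    return [counts[a] / total for a in AMINO_ACIDS]

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Information gain of one numeric feature, binarized at its median."""
    ordered = sorted(values)
    threshold = ordered[len(ordered) // 2]
    low = [y for v, y in zip(values, labels) if v < threshold]
    high = [y for v, y in zip(values, labels) if v >= threshold]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (low, high) if part)
    return entropy(labels) - remainder

def top_k_features(X, y, k):
    """Indices of the k columns of X with the highest information gain."""
    gains = [information_gain([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(gains)), key=lambda j: gains[j], reverse=True)[:k]

def binary_metrics(y_true, y_pred, positive):
    """Accuracy, precision, sensitivity, and specificity for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(a, b):
        return a / b if b else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
    }
```

Under these assumptions, calling `top_k_features(X, y, 7)` on a matrix of `composition` vectors would keep seven columns, matching the number of features the abstract reports. The selected columns could then be passed to any classifier, and `binary_metrics` applied to each cross-validation fold.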

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0ed/8923015/4c658da0f8ca/gr1_lrg.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0ed/8923015/c3024834675f/gr2_lrg.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0ed/8923015/5fcf61314c3a/gr3_lrg.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0ed/8923015/e8e763d2dda0/gr4_lrg.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0ed/8923015/0446ec7a119f/gr5_lrg.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0ed/8923015/5583ac0fe18a/gr6_lrg.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0ed/8923015/e3dfef4d441b/gr7_lrg.jpg
