Learning vector quantization as an interpretable classifier for the detection of SARS-CoV-2 types based on their RNA sequences.

Authors

Kaden Marika, Bohnsack Katrin Sophie, Weber Mirko, Kudła Mateusz, Gutowska Kaja, Blazewicz Jacek, Villmann Thomas

Affiliations

University of Applied Sciences Mittweida, Technikumplatz 17, 09648 Mittweida, Germany.

Saxon Institute for Computational Intelligence and Machine Learning, Technikumplatz 17, 09648 Mittweida, Germany.

Publication information

Neural Comput Appl. 2022;34(1):67-78. doi: 10.1007/s00521-021-06018-2. Epub 2021 Apr 27.

DOI:10.1007/s00521-021-06018-2
PMID:33935376
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8076884/
Abstract

We present an approach to discriminate SARS-CoV-2 virus types based on their RNA sequence descriptions without requiring a sequence alignment. For that purpose, sequences are preprocessed by feature extraction, and the resulting feature vectors are analyzed by prototype-based classification so that the model remains interpretable. In particular, we propose to use variants of learning vector quantization (LVQ) based on dissimilarity measures for RNA sequence data. The respective matrix LVQ provides additional knowledge about the classification decisions, such as discriminant feature correlations, and can also be equipped with easy-to-implement reject options for uncertain data. Those options provide self-controlled evidence, i.e., the model refuses to make a classification decision if the model evidence for the presented data is not sufficient. The model is first trained on a GISAID dataset whose virus types were assigned by phylogenetic tree clustering according to the molecular differences in coronavirus populations. In a second step, we apply the trained model to another, unlabeled SARS-CoV-2 virus dataset. For these data, we can either assign a virus type to the sequences or reject atypical samples. The rejected sequences allow us to speculate about new virus types with respect to nucleotide base mutations in the viral sequences. Moreover, this rejection analysis improves model robustness. Finally, the presented approach has lower computational complexity than methods based on (multiple) sequence alignment.
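
The pipeline described in the abstract (alignment-free features, prototype-based classification, reject option) can be illustrated with a minimal Python sketch. The abstract does not specify the features, the dissimilarity, or the reject rule, so everything concrete here is an assumption for illustration: 3-mer relative frequencies as features, a squared Euclidean distance with an optional linear map `omega` (the usual matrix-LVQ form), a basic LVQ1 update, and a relative-distance certainty with a hypothetical `threshold`. It is not the authors' implementation.

```python
from itertools import product

import numpy as np

K = 3  # assumed k-mer length; the abstract does not state the features used
KMERS = ["".join(p) for p in product("ACGU", repeat=K)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def kmer_frequencies(seq):
    """Alignment-free feature vector: relative frequency of each 3-mer."""
    seq = seq.upper().replace("T", "U")
    counts = np.zeros(len(KMERS))
    for i in range(len(seq) - K + 1):
        j = INDEX.get(seq[i:i + K])  # k-mers with ambiguous bases are skipped
        if j is not None:
            counts[j] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def sq_distances(x, prototypes, omega=None):
    """Squared distances ||omega (x - w)||^2 to all prototypes w.

    With omega=None this is the plain squared Euclidean distance.
    """
    diff = prototypes - x
    if omega is not None:
        diff = diff @ omega.T
    return np.sum(diff ** 2, axis=1)


def lvq1_step(x, y, prototypes, labels, lr=0.05):
    """One LVQ1 update: attract the winner if its label matches, else repel."""
    winner = int(np.argmin(sq_distances(x, prototypes)))
    sign = 1.0 if labels[winner] == y else -1.0
    prototypes[winner] += sign * lr * (x - prototypes[winner])


def classify_with_reject(x, prototypes, labels, threshold=0.1, omega=None):
    """Return (label or None, certainty); None means the sample is rejected.

    Certainty is (d2 - d1) / (d2 + d1), where d1 is the distance to the
    closest prototype and d2 the distance to the closest prototype of any
    other class. Requires prototypes from at least two classes.
    """
    d = sq_distances(x, prototypes, omega)
    best = int(np.argmin(d))
    d1 = d[best]
    d2 = d[labels != labels[best]].min()
    certainty = (d2 - d1) / (d2 + d1) if (d2 + d1) > 0 else 0.0
    label = labels[best] if certainty >= threshold else None
    return label, certainty


# Toy usage with two hypothetical classes and one prototype each.
labels = np.array(["A", "B"])
prototypes = np.stack([kmer_frequencies("AAAAAAAA"),
                       kmer_frequencies("CCCCCCCC")])
print(classify_with_reject(kmer_frequencies("AAAAAAAAAA"), prototypes, labels))
print(classify_with_reject(kmer_frequencies("AAAACCCC"), prototypes, labels))
```

In this toy setup the second sequence lies equally far from both prototypes, so its certainty falls below the threshold and it is rejected rather than assigned a type. That is the behavior the abstract describes for atypical samples.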

SUPPLEMENTARY INFORMATION

The online version contains supplementary material available at 10.1007/s00521-021-06018-2.
