Suppr超能文献

使用基因组序列处理模型进行疾病识别的研究与分析:实证综述

A Study and Analysis of Disease Identification using Genomic Sequence Processing Models: An Empirical Review.

作者信息

Ahuja Sony K, Shrimankar Deepti D, Durge Aditi R

机构信息

Visvesvaraya National Institute of Technology, Computer Science and Engineering, India.

出版信息

Curr Genomics. 2023 Dec 12;24(4):207-235. doi: 10.2174/0113892029269523231101051455.

Abstract

Human gene sequences are considered a primary source of comprehensive information about different body conditions. A wide variety of diseases including cancer, heart issues, brain issues, genetic issues, . can be pre-empted efficient analysis of genomic sequences. Researchers have proposed different configurations of machine learning models for processing genomic sequences, and each of these models varies in terms of their performance & applicability characteristics. Models that use bioinspired optimizations are generally slower, but have superior incremental-performance, while models that use one-shot learning achieve higher instantaneous accuracy but cannot be scaled for larger disease-sets. Due to such variations, it is difficult for genomic system designers to identify optimum models for their application-specific & performance-specific use cases. To overcome this issue, a detailed survey of different genomic processing models in terms of their functional nuances, application-specific advantages, deployment-specific limitations, and contextual future scopes is discussed in this text. Based on this discussion, researchers will be able to identify optimal models for their functional use cases. This text also compares the reviewed models in terms of their quantitative parameter sets, which include, the accuracy of classification, delay needed to classify large-length sequences, precision levels, scalability levels, and deployment cost, which will assist readers in selecting deployment-specific models for their contextual clinical scenarios. This text also evaluates a novel Genome Processing Efficiency Rank (GPER) for each of these models, which will allow readers to identify models with higher performance and low overheads under real-time scenarios.

摘要

人类基因序列被视为有关不同身体状况的全面信息的主要来源。包括癌症、心脏问题、脑部问题、遗传问题等各种各样的疾病,都可以通过对基因组序列的有效分析来预先防范。研究人员已经提出了不同配置的机器学习模型来处理基因组序列,并且这些模型中的每一个在其性能和适用性特征方面都有所不同。使用生物启发式优化的模型通常速度较慢,但具有卓越的增量性能,而使用一次性学习的模型能实现更高的瞬时准确率,但无法扩展到更大的疾病集。由于存在这些差异,基因组系统设计人员很难为其特定应用和性能特定的用例确定最佳模型。为了克服这个问题,本文讨论了对不同基因组处理模型在其功能细微差别、特定应用优势、特定部署限制以及上下文未来范围方面的详细调查。基于此讨论,研究人员将能够为其功能用例确定最佳模型。本文还根据所审查模型的定量参数集对它们进行了比较,这些参数集包括分类准确率、对长序列进行分类所需的延迟、精确水平、可扩展性水平以及部署成本,这将帮助读者为其上下文临床场景选择特定部署的模型。本文还为这些模型中的每一个评估了一种新颖的基因组处理效率排名(GPER),这将使读者能够识别在实时场景下具有更高性能和低开销的模型。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82ad/10758128/d0bbccb745c5/CG-24-207_F1.jpg

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验