Evaluating machine learning methodologies for identification of cancer driver genes.

Author affiliations

Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, P.O. Box 344, Rabigh, 21911, Saudi Arabia.

Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Pakistan.

Publication information

Sci Rep. 2021 Jun 10;11(1):12281. doi: 10.1038/s41598-021-91656-8.

DOI:10.1038/s41598-021-91656-8

PMID:34112883

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8192921/

Abstract

Cancer is driven by distinctive sorts of changes and basic variations in genes. Recognizing cancer driver genes is basic for accurate oncological analysis. Numerous methodologies to distinguish and identify drivers presently exist, but efficient tools to combine and optimize them on huge datasets are few. Most strategies for prioritizing transformations depend basically on frequency-based criteria. Strategies are required to dependably prioritize organically dynamic driver changes over inert passengers in high-throughput sequencing cancer information sets. This study proposes a model, namely PCDG-Pred, which works as a utility capable of distinguishing cancer driver and passenger attributes of genes based on sequencing data. Keeping in view the significance of the cancer driver genes, an efficient method is proposed to identify them. Further, various validation techniques are applied at different levels to establish the effectiveness of the model and to obtain metrics like accuracy, Matthews correlation coefficient, sensitivity, and specificity. The results of the study strongly indicate that the proposed strategy provides a fundamental functional advantage over other existing strategies for cancer driver gene identification. Subsequently, careful experiments exhibit that the accuracy metrics obtained for self-consistency, independent set, and cross-validation tests are 91.08%, 87.26%, and 92.48%, respectively.
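
The four metrics named in the abstract can all be derived from the confusion matrix of a binary driver/passenger classifier. The following is a minimal sketch of the standard definitions, not the PCDG-Pred implementation; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts.

    tp/fn: driver genes predicted as driver/passenger (hypothetical labels).
    tn/fp: passenger genes predicted as passenger/driver.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Sensitivity: fraction of true drivers that are recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true passengers that are recovered.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Illustrative counts only (not results from the study).
example = binary_metrics(tp=40, fp=10, tn=40, fn=10)
print(example)
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the driver and passenger classes are imbalanced.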

