A Novel Computational Approach for Biomarker Detection for Gene Expression-Based Computer-Aided Diagnostic Systems for Breast Cancer.

Affiliations

Department of Computer Science, Faculty of Computer and Information Technology, Jerash University, Jerash, Jordan.

Complex Systems, Big Data and Informatics Initiative (CSBII), Lincoln University, Christchurch, New Zealand.

Publication information

Methods Mol Biol. 2021;2190:195-208. doi: 10.1007/978-1-0716-0826-5_9.

Abstract

Cancer produces complex cellular changes. Microarrays have become crucial for identifying the genes involved in these changes; however, microarray data analysis is challenged by the high dimensionality of the data relative to the number of samples. This has contributed to inconsistent cancer biomarkers across gene expression studies. In addition, identification of crucial genes in cancer can be expedited through expression profiling of peripheral blood cells. We introduce a novel feature selection method for microarrays that uses a two-step filtering process to select a minimal set of genes with greater consistency and relevance, and we demonstrate that the selected gene set considerably enhances diagnostic accuracy for cancer. The preliminary filter (Bi-biological filter) builds gene coexpression networks for the cancer and healthy conditions using a topological overlap matrix (TOM) and finds cancer-specific gene clusters using spectral clustering (SC). A second filtering step then extracts a much smaller set of crucial genes using best-first search with a support vector machine (BFS-SVM). Finally, artificial neural network (ANN), SVM, and K-nearest neighbor classifiers are used to assess the predictive power of the selected genes and to select the most effective diagnostic system. The approach was applied to peripheral blood profiling for breast cancer, where the Bi-biological filter selected 415 biologically consistent genes, from which BFS-SVM extracted 13 highly cancer-specific genes for breast cancer identification. ANN was the best classifier, with 93.2% classification accuracy, a 14% improvement over the study from which the data were obtained (Aaroe et al., Breast Cancer Res 12:R7, 2010).
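The first filtering step can be illustrated with a minimal Python sketch. It is not the authors' implementation: it builds a TOM for a single condition from synthetic data and clusters genes with scikit-learn's SpectralClustering, whereas the chapter compares cancer and healthy networks. The soft-threshold power `beta`, the function names `topological_overlap` and `cluster_genes`, and the toy two-module data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import SpectralClustering


def topological_overlap(expr, beta=6):
    """Unsigned TOM for a samples-by-genes matrix (beta is an assumed power)."""
    corr = np.corrcoef(expr, rowvar=False)       # gene-gene Pearson correlation
    adj = np.abs(corr) ** beta                   # soft-thresholded adjacency
    np.fill_diagonal(adj, 0.0)
    shared = adj @ adj                           # l_ij = sum_u a_iu * a_uj
    degree = adj.sum(axis=1)                     # k_i = sum_u a_iu
    min_degree = np.minimum.outer(degree, degree)
    tom = (shared + adj) / (min_degree + 1.0 - adj)
    np.fill_diagonal(tom, 1.0)
    return tom


def cluster_genes(tom, n_clusters, seed=0):
    """Spectral clustering that uses the TOM directly as the affinity matrix."""
    model = SpectralClustering(
        n_clusters=n_clusters, affinity="precomputed", random_state=seed
    )
    return model.fit_predict(tom)


# Toy data (hypothetical): 40 samples, two modules of 10 coexpressed genes each.
rng = np.random.default_rng(0)
n_samples = 40
factors = rng.normal(size=(n_samples, 2))
expr = np.hstack([
    factors[:, [0]] + 0.3 * rng.normal(size=(n_samples, 10)),
    factors[:, [1]] + 0.3 * rng.normal(size=(n_samples, 10)),
])

tom = topological_overlap(expr)
labels = cluster_genes(tom, n_clusters=2)
print(labels)
```

In a two-condition setting, one plausible extension is to compute a TOM per condition and look for clusters present in one network but not the other; how the chapter defines cancer-specific clusters is not stated in the abstract.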

