Wx: a neural network-based feature selection algorithm for transcriptomic data.

Affiliations

Deargen Inc., Daejeon, Republic of Korea.

Department of Computer Science, Emory University, Atlanta, GA, 30322, USA.

Publication information

Sci Rep. 2019 Jul 19;9(1):10500. doi: 10.1038/s41598-019-47016-8.

Abstract

Next-generation sequencing (NGS), which allows the simultaneous sequencing of billions of DNA fragments, has revolutionized how we study genomics and molecular biology by generating genome-wide maps of molecules of interest. However, the amount of information produced by NGS has made it difficult for researchers to choose the optimal set of genes. We have sought to resolve this issue by developing a neural network-based feature (gene) selection algorithm called Wx. The Wx algorithm ranks genes by a discriminative index (DI) score that represents their classification power for distinguishing given groups. With a gene list ranked by DI score, researchers can intuitively select the optimal set of genes from the highest-ranking ones. We applied the Wx algorithm to a TCGA pan-cancer gene-expression cohort to identify an optimal set of gene-expression biomarker candidates that can distinguish cancer samples from normal samples for 12 different types of cancer. The 14 gene-expression biomarker candidates identified by Wx were comparable to or outperformed previously reported universal gene-expression biomarkers, highlighting the usefulness of the Wx algorithm for next-generation sequencing data. Thus, we anticipate that the Wx algorithm can complement current state-of-the-art analytical applications for the identification of biomarker candidates as an alternative method. The stand-alone and web versions of the Wx algorithm are available at https://github.com/deargen/DearWXpub and https://wx.deargendev.me/, respectively.
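
The abstract does not spell out how the DI score is computed, so the following is only a minimal sketch of the general idea of ranking genes with a trained neural network, not the authors' implementation. It assumes a single-layer softmax classifier and a hypothetical DI defined as the absolute gap between the class-wise products of learned weight and mean expression. The function names (`train_softmax`, `discriminative_index`, `rank_genes`), the hyperparameters, and the synthetic non-negative expression matrix are all illustrative assumptions.

```python
import numpy as np

def train_softmax(X, y, n_classes=2, lr=0.1, epochs=300):
    """Fit a single-layer softmax classifier by batch gradient descent."""
    n_samples, n_features = X.shape
    W = np.zeros((n_features, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot encoded labels
    for _ in range(epochs):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        grad = (P - Y) / n_samples  # gradient of mean cross-entropy w.r.t. logits
        W -= lr * (X.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b

def discriminative_index(X, y, W):
    """Hypothetical DI (an assumption, not taken from the abstract):
    |w_class0 * mean_class0 - w_class1 * mean_class1| for each gene."""
    contrib = [W[:, k] * X[y == k].mean(axis=0) for k in range(2)]
    return np.abs(contrib[0] - contrib[1])

def rank_genes(X, y, top_k=5):
    """Return the indices of the top_k genes by DI, plus all DI scores."""
    W, _ = train_softmax(X, y)
    di = discriminative_index(X, y, W)
    return np.argsort(di)[::-1][:top_k], di

# Synthetic, non-negative "expression" data: 200 samples, 50 genes.
rng = np.random.default_rng(0)
n_samples, n_genes = 200, 50
y = np.repeat([0, 1], n_samples // 2)   # 0 = normal, 1 = cancer (balanced)
X = rng.uniform(0.0, 1.0, size=(n_samples, n_genes))
X[:, 3] += 1.0 * y                      # gene 3 is higher in class 1
X[:, 7] += 1.0 * (1 - y)                # gene 7 is higher in class 0

top, di = rank_genes(X, y, top_k=5)
print("Top-ranked gene indices:", top)
```

In this toy setting the two planted genes should dominate the ranking, which mirrors the workflow described above: score every gene once, sort, and pick candidates from the top of the list. Note that this particular DI definition presumes non-negative expression values; on mean-centered data the two class contributions can cancel.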
