Comprehensive Analysis of MILE Gene Expression Data Set Advances Discovery of Leukaemia Type and Subtype Biomarkers.

Author information

Labaj Wojciech, Papiez Anna, Polanski Andrzej, Polanska Joanna

Affiliations

Silesian University of Technology, Institute of Informatics, Akademicka 16, 44-100, Gliwice, Poland.

Silesian University of Technology, Institute of Automatic Control, Akademicka 16, 44-100, Gliwice, Poland.

Publication information

Interdiscip Sci. 2017 Mar;9(1):24-35. doi: 10.1007/s12539-017-0216-9. Epub 2017 Mar 16.

Abstract

Large data collections in cancer studies, such as those on leukaemia, call for tailored analysis algorithms to extract as much information as possible. This work demonstrates a custom pipeline for a thorough investigation of the large MILE gene expression data set. Three analyses are carried out, each aimed at a deeper understanding of the processes underlying leukaemia types and subtypes. First, the main disease groups are tested for differential expression against healthy controls, as in a standard case-control study. Here, basic knowledge of the molecular mechanisms is confirmed quantitatively and through literature references. Second, pairwise comparison testing is performed to contrast the main leukaemia types with one another. In this case, the Dice similarity coefficient is used to point out the general relations between them, and lists of candidate biomarkers for the main leukaemia groups are proposed. Finally, since this approach proved successful, the third analysis provides insight into all of the studied subtypes and yields four leukaemia subtype biomarkers. In addition, the class-enhanced differentially expressed gene (DEG) signature obtained with the novel pipeline leads to significantly better classification power of multi-class data classifiers. The developed methodology, consisting of batch effect adjustment, adaptive noise and feature filtration, adequate statistical testing, and biomarker definition, proves to be an effective approach to knowledge discovery in high-throughput molecular biology experiments.
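
The Dice coefficient mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the measure is applied to sets of differentially expressed genes, one set per leukaemia type; the abstract does not state exactly which lists are compared. The group labels (`type_A`, `type_B`, `type_C`), the gene identifiers, and the function names are hypothetical placeholders, not data or code from the study.

```python
from itertools import combinations


def dice_coefficient(a, b):
    """Dice similarity of two collections: 2*|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        # The ratio is undefined for two empty sets; 0.0 is returned
        # here purely as a convention for this sketch.
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def pairwise_dice(gene_lists):
    """Dice coefficient for every unordered pair of groups."""
    return {
        (x, y): dice_coefficient(gene_lists[x], gene_lists[y])
        for x, y in combinations(sorted(gene_lists), 2)
    }


# Hypothetical DEG lists (placeholder identifiers, not real results).
deg_lists = {
    "type_A": {"g1", "g2", "g3", "g4"},
    "type_B": {"g2", "g3", "g5"},
    "type_C": {"g3", "g6"},
}

similarities = pairwise_dice(deg_lists)
for (x, y), value in similarities.items():
    print(f"{x} vs {y}: {value:.3f}")
```

A value near 1 indicates that two groups share most of their differentially expressed genes, while a value near 0 indicates largely distinct lists.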

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1095/5366179/c7832176c436/12539_2017_216_Fig1_HTML.jpg
