Selecting Reliable mRNA Expression Measurements Across Platforms Improves Downstream Analysis.

Author information

Tong Pan, Diao Lixia, Shen Li, Li Lerong, Heymach John Victor, Girard Luc, Minna John D, Coombes Kevin R, Byers Lauren Averett, Wang Jing

Affiliations

Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.

Department of Thoracic and Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.

Publication information

Cancer Inform. 2016 May 10;15:81-9. doi: 10.4137/CIN.S38590. eCollection 2016.

Abstract

With increasing use of publicly available gene expression data sets, the quality of the expression data is a critical issue for downstream analysis, gene signature development, and cross-validation of data sets. Thus, identifying reliable expression measurements by leveraging multiple mRNA expression platforms is an important analytical task. In this study, we propose a statistical framework for selecting reliable measurements between platforms by modeling the correlations of mRNA expression levels using a beta-mixture model. The model-based selection provides an effective and objective way to separate good probes from probes with low quality, thereby improving the efficiency and accuracy of the analysis. The proposed method can be used to compare two microarray technologies or microarray and RNA sequencing measurements. We tested the approach in two matched profiling data sets, using microarray gene expression measurements from the same samples profiled on both Affymetrix and Illumina platforms. We also applied the algorithm to mRNA expression data to compare Affymetrix microarray data with RNA sequencing measurements. The algorithm successfully identified probes/genes with reliable measurements. Removing the unreliable measurements resulted in significant improvements for gene signature development and functional annotations.
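
The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the fitting details, so everything below is an assumption made for illustration. The sketch assumes one cross-platform correlation per probe/gene, rescales each correlation from [-1, 1] to (0, 1), fits a two-component beta mixture by EM with a weighted method-of-moments M-step (an approximation, not exact maximum likelihood), and keeps probes whose posterior probability of belonging to the high-correlation component reaches a hypothetical cutoff of 0.5. The function names `fit_beta_mixture` and `select_reliable` are invented for this example.

```python
import math


def beta_logpdf(x, a, b):
    """Log density of Beta(a, b) at x, for 0 < x < 1."""
    return (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
            + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def moments_to_beta(mean, var):
    """Method-of-moments Beta parameters for a given mean and variance."""
    # Keep the variance strictly inside the range a Beta distribution allows.
    var = min(var, 0.999 * mean * (1 - mean))
    var = max(var, 1e-8)
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common


def fit_beta_mixture(corrs, n_iter=100, eps=1e-6):
    """Fit a two-component beta mixture to per-probe correlations.

    Returns (resp, params, weights), where resp[i] is the posterior
    probability that probe i belongs to the high-correlation component
    (component 1). Assumption: correlations are mapped to (0, 1) first.
    """
    x = [min(max((r + 1) / 2, eps), 1 - eps) for r in corrs]
    n = len(x)
    # Initialise by splitting at the median: upper half -> component 1.
    median = sorted(x)[n // 2]
    resp = [1.0 if xi > median else 0.0 for xi in x]
    params, weights = [], []
    for _ in range(n_iter):
        # M-step: weighted moments per component, converted to (a, b).
        params, weights = [], []
        for k in (0, 1):
            w = [r if k == 1 else 1 - r for r in resp]
            sw = max(sum(w), 1e-12)
            mean = sum(wi * xi for wi, xi in zip(w, x)) / sw
            var = sum(wi * (xi - mean) ** 2 for wi, xi in zip(w, x)) / sw
            params.append(moments_to_beta(mean, var))
            weights.append(sw / n)
        # E-step: posterior probability of component 1 for each probe.
        new_resp = []
        for xi in x:
            l0 = math.log(weights[0]) + beta_logpdf(xi, *params[0])
            l1 = math.log(weights[1]) + beta_logpdf(xi, *params[1])
            m = max(l0, l1)
            p0, p1 = math.exp(l0 - m), math.exp(l1 - m)
            new_resp.append(p1 / (p0 + p1))
        resp = new_resp
    return resp, params, weights


def select_reliable(corrs, threshold=0.5):
    """Indices of probes assigned to the high-correlation component.

    The 0.5 cutoff is a hypothetical choice, not taken from the paper.
    """
    resp, _, _ = fit_beta_mixture(corrs)
    return [i for i, r in enumerate(resp) if r >= threshold]
```

Given a list `corrs` of per-probe correlations between, say, Affymetrix and Illumina measurements on the same samples, `select_reliable(corrs)` would return the positions of the probes to keep. A stricter cutoff trades fewer retained probes for higher confidence in each one.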

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c2ba/4863871/7be9bd1b3281/cin-15-2016-081f1.jpg