Accounting for dependence induced by weighted KNN imputation in paired samples, motivated by a colorectal cancer study.

Author information

Suyundikov Anvar, Stevens John R, Corcoran Christopher, Herrick Jennifer, Wolff Roger K, Slattery Martha L

Affiliations

Department of Mathematics and Statistics, Utah State University, 3900 Old Main Hill, Logan, UT 84322-3900, U.S.A.

Division of Epidemiology, Department of Internal Medicine, University of Utah School of Medicine, 383 Colorow Road, Salt Lake City, UT 84108, U.S.A.

Publication information

PLoS One. 2015 Apr 7;10(4):e0119876. doi: 10.1371/journal.pone.0119876. eCollection 2015.

Abstract

Missing data can arise in bioinformatics applications for a variety of reasons, and imputation methods are frequently applied to such data. We are motivated by a colorectal cancer study where miRNA expression was measured in paired tumor-normal samples of hundreds of patients, but data for many normal samples were missing due to lack of tissue availability. We compare the precision and power performance of several imputation methods, and draw attention to the statistical dependence induced by K-Nearest Neighbors (KNN) imputation. This imputation-induced dependence has not previously been addressed in the literature. We demonstrate how to account for this dependence, and show through simulation how the choice to ignore or account for this dependence affects both power and type I error rate control.
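To make the imputation-induced dependence concrete, the following is a minimal sketch of a generic inverse-distance-weighted KNN imputation for paired data. It is not necessarily the procedure used in the paper: the function name `weighted_knn_impute`, the choice of Euclidean distance on the tumor profiles, and the inverse-distance weights are all assumptions made for illustration. The returned weight matrix shows that each imputed normal sample is a linear combination of other subjects' observed normal samples, so the imputed values are not independent of the observed ones.

```python
import numpy as np


def weighted_knn_impute(tumor, normal, k=3, eps=1e-12):
    """Impute missing normal samples from the k nearest donors (illustrative sketch).

    tumor  : (n_subjects, n_features) array, assumed fully observed.
    normal : (n_subjects, n_features) array; a subject whose normal sample is
             missing has NaN in its row.
    Returns the imputed normal matrix and an (n_subjects, n_subjects) weight
    matrix whose row i lists the donor weights used for subject i.
    """
    tumor = np.asarray(tumor, dtype=float)
    normal = np.asarray(normal, dtype=float)

    missing = np.isnan(normal).any(axis=1)
    donors = np.flatnonzero(~missing)  # subjects with an observed normal sample
    if donors.size < k:
        raise ValueError("fewer complete subjects than k")

    imputed = normal.copy()
    weights = np.zeros((normal.shape[0], normal.shape[0]))
    weights[donors, donors] = 1.0  # observed rows depend only on themselves

    for i in np.flatnonzero(missing):
        # Assumption: neighbors are chosen by Euclidean distance between tumor profiles.
        dist = np.linalg.norm(tumor[donors] - tumor[i], axis=1)
        nearest = np.argsort(dist)[:k]

        # Assumption: inverse-distance weights, normalized to sum to one.
        w = 1.0 / (dist[nearest] + eps)
        w /= w.sum()

        imputed[i] = w @ normal[donors[nearest]]
        weights[i, donors[nearest]] = w

    return imputed, weights
```

Because any row of `weights` belonging to an imputed subject has nonzero entries in other subjects' columns, the paired tumor-normal differences for imputed subjects share information with those of their donors. A test that treats all subjects as independent ignores this shared information, which is the kind of dependence the abstract refers to.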

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/79d1/4388652/d5a1be7a4901/pone.0119876.g001.jpg
