Data integration with high dimensionality.

Author information

Gao Xin, Carroll Raymond J

Affiliations

Department of Mathematics and Statistics, York University, 4700 Keele Street, Toronto, Ontario M3J 1P3, Canada.

Department of Statistics, 447 Blocker Building, Texas A&M University, College Station, Texas 77843, U.S.A.

Publication information

Biometrika. 2017 Jun;104(2):251-272. doi: 10.1093/biomet/asx023. Epub 2017 May 9.

Abstract

We consider situations where the data consist of a number of responses for each individual, which may include a mix of discrete and continuous variables. The data also include a class of predictors, where the same predictor may have different physical measurements across different experiments depending on how the predictor is measured. The goal is to select which predictors affect any of the responses, where the number of such informative predictors tends to infinity as the sample size increases. There are marginal likelihoods for each experiment; we specify a pseudolikelihood combining the marginal likelihoods, and propose a pseudolikelihood information criterion. Under regularity conditions, we establish selection consistency for this criterion with unbounded true model size. The proposed method includes a Bayesian information criterion with appropriate penalty term as a special case. Simulations indicate that data integration can dramatically improve upon using only one data source.
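As a rough illustration of the idea in the abstract, the sketch below sums the maximised marginal log-likelihoods of two experiments into a pseudolikelihood and selects the predictor subset that minimises a penalised criterion. It is not the authors' procedure. The function names, the two independent simulated experiments, the exhaustive subset search, and the BIC-style penalty of log(n) per predictor are all assumptions made for the example. The abstract does not state the exact penalty or any weighting of the marginal likelihoods.

```python
import itertools
import numpy as np


def gaussian_loglik(design, y):
    """Maximised Gaussian log-likelihood of a linear model (variance profiled out)."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / len(y)
    return -0.5 * len(y) * (np.log(2 * np.pi * sigma2) + 1.0)


def logistic_loglik(design, y, iters=25):
    """Maximised Bernoulli log-likelihood of a logistic model, fitted by Newton steps."""
    beta = np.zeros(design.shape[1])
    for _ in range(iters):
        eta = np.clip(design @ beta, -30.0, 30.0)
        mu = 1.0 / (1.0 + np.exp(-eta))
        hess = design.T @ (design * (mu * (1.0 - mu))[:, None])
        hess += 1e-8 * np.eye(design.shape[1])  # tiny ridge for numerical stability
        beta = beta + np.linalg.solve(hess, design.T @ (y - mu))
    eta = design @ beta
    return float(np.sum(y * eta - np.logaddexp(0.0, eta)))


def pseudo_ic(subset, experiments, penalty):
    """-2 * (sum of marginal log-likelihoods) + penalty * (number of predictors).

    Hypothetical criterion: an unweighted sum of marginals with a simple count penalty.
    """
    cols = np.asarray(subset, dtype=int)
    total = 0.0
    for X, y, loglik in experiments:
        # Every candidate model includes an intercept plus the chosen columns.
        design = np.column_stack([np.ones(len(y)), X[:, cols]])
        total += loglik(design, y)
    return -2.0 * total + penalty * len(cols)


def select(experiments, p, penalty):
    """Exhaustive search over all subsets of p predictors (feasible only for small p)."""
    subsets = (s for k in range(p + 1) for s in itertools.combinations(range(p), k))
    return min(subsets, key=lambda s: pseudo_ic(s, experiments, penalty))


def simulate(seed=0, n=400, p=4):
    """Two toy experiments: predictor 0 drives a continuous response,
    predictor 1 drives a binary response, and predictors 2 and 3 are noise."""
    rng = np.random.default_rng(seed)
    X1 = rng.normal(size=(n, p))
    X2 = rng.normal(size=(n, p))  # same predictors, measured separately
    y1 = 1.0 + 2.0 * X1[:, 0] + rng.normal(size=n)
    prob = 1.0 / (1.0 + np.exp(-2.0 * X2[:, 1]))
    y2 = (rng.uniform(size=n) < prob).astype(float)
    return [(X1, y1, gaussian_loglik), (X2, y2, logistic_loglik)]


if __name__ == "__main__":
    experiments = simulate()
    n_total = sum(len(y) for _, y, _ in experiments)
    print(select(experiments, p=4, penalty=np.log(n_total)))
```

In this toy setting neither experiment alone can identify both informative predictors, since each response depends on only one of them. The combined criterion can recover the union, which is the sense in which a predictor is selected if it affects any response.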

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf3e/5793676/b98df9678f43/asx023f1.jpg