Weighted mining of massive collections of p-values by convex optimization.

Author

Dobriban Edgar

Affiliation

Department of Statistics, The Wharton School, University of Pennsylvania, USA.

Publication information

Inf Inference. 2018 Jun;7(2):251-275. doi: 10.1093/imaiai/iax013. Epub 2017 Dec 8.

DOI: 10.1093/imaiai/iax013
PMID: 29930799
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5998655/

Abstract

Researchers in data-rich disciplines (think of computational genomics and observational cosmology) often wish to mine large bodies of p-values looking for significant effects, while controlling the false discovery rate or family-wise error rate. Increasingly, researchers also wish to prioritize certain hypotheses, for example, those thought to have larger effect sizes, by upweighting, and to impose constraints on the underlying mining, such as monotonicity along a certain sequence. We introduce Princessp, a principled method for performing weighted multiple testing by constrained convex optimization. Our method allows one to prioritize certain hypotheses through upweighting and to discount others through downweighting, while constraining the underlying weights involved in the mining process. When the p-values derive from monotone likelihood ratio families such as the Gaussian means model, the new method allows exact solution of an important optimal weighting problem previously thought to be non-convex and computationally infeasible. Our method scales to massive data set sizes. We illustrate the applications of Princessp on a series of standard genomics data sets and offer comparisons with several previous 'standard' methods. Princessp offers both ease of operation and the ability to scale to extremely large problem sizes. The method is available as open-source software from github.com/dobriban/pvalue_weighting_matlab (accessed 11 October 2017).
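
To make the idea of upweighting and downweighting concrete, the following is a minimal Python sketch of two textbook weighted procedures: weighted Bonferroni (for the family-wise error rate) and weighted Benjamini-Hochberg (for the false discovery rate). It is not Princessp and does not reproduce the convex optimization described in the abstract; the function names, the example p-values, and the example weights are all hypothetical, and the weights are assumed to be fixed in advance, independently of the p-values being tested.

```python
import numpy as np


def normalize_weights(raw_weights):
    """Rescale nonnegative weights so that they average to 1."""
    w = np.asarray(raw_weights, dtype=float)
    if np.any(w < 0) or w.sum() == 0:
        raise ValueError("weights must be nonnegative and not all zero")
    return w * len(w) / w.sum()


def weighted_bonferroni(pvalues, weights, alpha=0.05):
    """Reject hypothesis i when p_i <= alpha * w_i / m."""
    p = np.asarray(pvalues, dtype=float)
    w = normalize_weights(weights)
    return p <= alpha * w / len(p)


def weighted_bh(pvalues, weights, alpha=0.05):
    """Benjamini-Hochberg step-up applied to reweighted p-values p_i / w_i."""
    p = np.asarray(pvalues, dtype=float)
    w = normalize_weights(weights)
    m = len(p)
    # A zero weight means the hypothesis can never be rejected.
    q = np.full(m, np.inf)
    positive = w > 0
    q[positive] = p[positive] / w[positive]
    order = np.argsort(q)
    passed = q[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])  # largest rank meeting the BH bound
        reject[order[: k + 1]] = True
    return reject


# Hypothetical example: the second hypothesis is believed a priori
# to have a larger effect, so it receives three times the raw weight.
pvals = [0.001, 0.02, 0.03, 0.5]
raw_w = [1, 3, 1, 1]

print(weighted_bonferroni(pvals, [1, 1, 1, 1]))  # unweighted baseline
print(weighted_bonferroni(pvals, raw_w))
print(weighted_bh(pvals, raw_w))
```

In this toy example the unweighted Bonferroni rule rejects only the first hypothesis, while the weighted rule also rejects the upweighted second one. The price is that the remaining hypotheses face stricter thresholds, which is why the choice of weights matters and why the abstract frames it as an optimization problem.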


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc02/5998655/f71bc6f0bc05/iax013f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc02/5998655/36652eddc39c/iax013f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc02/5998655/9dd5f73bbf4b/iax013f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc02/5998655/837f854f40ca/iax013f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc02/5998655/b852f27f616e/iax013a1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc02/5998655/d6ed28fea237/iax013f4.jpg
