Debiased machine learning for ultra-high dimensional mediation analysis.

Authors

Wei Kecheng, Liu Yahang, Huang Chen, Lin Ruilang, Yu Yongfu, Qin Guoyou

Affiliation

Department of Biostatistics, School of Public Health, Fudan University, Shanghai 200032, China.

Publication

Bioinformatics. 2025 Jun 2;41(6). doi: 10.1093/bioinformatics/btaf282.

DOI: 10.1093/bioinformatics/btaf282
PMID: 40323319

Abstract

MOTIVATION

In ultra-high dimensional mediation analysis, confounding variables can influence both mediators and outcomes through complex functional forms. While machine learning (ML) approaches are effective at modeling such complex relationships, they can introduce bias when estimating mediation effects. In this article, we propose a debiased ML framework that mitigates this bias, enabling accurate identification of key mediators and precise estimation and inference of their respective contributions.

RESULTS

We construct an orthogonalized score function and use cross-fitting to reduce bias introduced by ML. To tackle ultra-high dimensional potential mediators, we implement screening and regularization techniques for variable selection and effect estimation. For statistical inference of the mediators' contributions, we use an adjusted Sobel-type test. Simulation results demonstrate the superior performance of the proposed method in handling complex confounding. Applying this method to Alzheimer's Disease Neuroimaging Initiative data, we identify several cytosine-phosphate-guanine sites where DNA methylation mediates the effect of body mass index on Alzheimer's Disease.
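
ILLUSTRATIVE SKETCH

The orthogonalization and cross-fitting steps described above can be illustrated for a single mediator. The sketch below is not the authors' DML_HDMA function and omits the screening and regularization steps entirely. It assumes a partially linear model with one scalar confounder, uses a k-nearest-neighbour average as a stand-in for the ML learner, and applies the classic Sobel statistic rather than the adjusted Sobel-type test from the paper. All function names, the simulated data, and the tuning values (5 folds, k = 15) are hypothetical choices made for illustration.

```python
import math
import random


def knn_predict(train_c, train_t, query, k):
    """Average the target over the k training points whose confounder is nearest to query."""
    order = sorted(range(len(train_c)), key=lambda i: abs(train_c[i] - query))
    nearest = order[:k]
    return sum(train_t[i] for i in nearest) / len(nearest)


def cross_fit_residuals(c, target, n_folds=5, k=15):
    """Out-of-fold residuals target - E[target | c].

    Each observation is predicted by a learner fit on the other folds only,
    which is the cross-fitting idea. Folds are assigned by index, which is
    acceptable here because the simulated rows are i.i.d.
    """
    n = len(c)
    resid = [0.0] * n
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        train_c = [c[i] for i in train]
        train_t = [target[i] for i in train]
        for i in range(fold, n, n_folds):
            resid[i] = target[i] - knn_predict(train_c, train_t, c[i], k)
    return resid


def fit_paths(x_res, m_res, y_res):
    """No-intercept OLS on residuals: alpha from M ~ X, beta from Y ~ X + M."""
    n = len(x_res)
    sxx = sum(a * a for a in x_res)
    smm = sum(b * b for b in m_res)
    sxm = sum(a * b for a, b in zip(x_res, m_res))
    sxy = sum(a * y for a, y in zip(x_res, y_res))
    smy = sum(b * y for b, y in zip(m_res, y_res))

    alpha = sxm / sxx
    rss_m = sum((b - alpha * a) ** 2 for a, b in zip(x_res, m_res))
    se_alpha = math.sqrt(rss_m / (n - 1) / sxx)

    det = sxx * smm - sxm ** 2
    beta = (sxx * smy - sxm * sxy) / det
    gamma = (smm * sxy - sxm * smy) / det  # direct effect of X on Y
    rss_y = sum((y - gamma * a - beta * b) ** 2
                for a, b, y in zip(x_res, m_res, y_res))
    se_beta = math.sqrt(rss_y / (n - 2) * sxx / det)
    return alpha, se_alpha, beta, se_beta


def sobel_z(alpha, se_alpha, beta, se_beta):
    """Classic Sobel z statistic for the indirect effect alpha * beta."""
    return alpha * beta / math.sqrt(alpha ** 2 * se_beta ** 2 + beta ** 2 * se_alpha ** 2)


def run_demo(n=400, seed=0):
    """Simulate nonlinear confounding (true alpha = 0.5, beta = 0.8) and estimate both paths."""
    rng = random.Random(seed)
    c = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    x = [math.sin(2 * ci) + rng.gauss(0, 1) for ci in c]
    m = [0.5 * xi + ci ** 2 + rng.gauss(0, 0.5) for xi, ci in zip(x, c)]
    y = [0.3 * xi + 0.8 * mi + math.cos(ci) + rng.gauss(0, 0.5)
         for xi, mi, ci in zip(x, m, c)]

    # Partial the confounder out of exposure, mediator, and outcome.
    x_res = cross_fit_residuals(c, x)
    m_res = cross_fit_residuals(c, m)
    y_res = cross_fit_residuals(c, y)

    alpha, se_alpha, beta, se_beta = fit_paths(x_res, m_res, y_res)
    return {"alpha": alpha, "beta": beta,
            "indirect": alpha * beta,
            "z": sobel_z(alpha, se_alpha, beta, se_beta)}


if __name__ == "__main__":
    print(run_demo())
```

Because the exposure, mediator, and outcome are all residualized on the confounder before the path coefficients are fit, first-order errors in the nuisance fits enter the estimate only through their products, which is the usual motivation for an orthogonalized score. With many candidate mediators, a step of this kind would follow screening and penalized selection rather than replace them.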

AVAILABILITY AND IMPLEMENTATION

The R function DML_HDMA implementing the proposed methods is available online at https://github.com/Wei-Kecheng/DML_HDMA.
