DNA methylation-based predictors of health: applications and statistical considerations.

Author information

Medical Research Council Integrative Epidemiology Unit at the University of Bristol, University of Bristol, Bristol, UK.

Publication information

Nat Rev Genet. 2022 Jun;23(6):369-383. doi: 10.1038/s41576-022-00465-w. Epub 2022 Mar 18.

Abstract

DNA methylation data have become a valuable source of information for biomarker development, because, unlike static genetic risk estimates, DNA methylation varies dynamically in relation to diverse exogenous and endogenous factors, including environmental risk factors and complex disease pathology. Reliable methods for genome-wide measurement at scale have led to the proliferation of epigenome-wide association studies and subsequently to the development of DNA methylation-based predictors across a wide range of health-related applications, from the identification of risk factors or exposures, such as age and smoking, to early detection of disease or progression in cancer, cardiovascular and neurological disease. This Review evaluates the progress of existing DNA methylation-based predictors, including the contribution of machine learning techniques, and assesses the uptake of key statistical best practices needed to ensure their reliable performance, such as data-driven feature selection, elimination of data leakage in performance estimates and use of generalizable, adequately powered training samples.
