
Network-based regularization for matched case-control analysis of high-dimensional DNA methylation data.

Affiliation

Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10032, USA.

Publication information

Stat Med. 2013 May 30;32(12):2127-39. doi: 10.1002/sim.5694. Epub 2012 Dec 5.

DOI:10.1002/sim.5694
PMID:23212810
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4038397/
Abstract

The matched case-control designs are commonly used to control for potential confounding factors in genetic epidemiology studies especially epigenetic studies with DNA methylation. Compared with unmatched case-control studies with high-dimensional genomic or epigenetic data, there have been few variable selection methods for matched sets. In an earlier paper, we proposed the penalized logistic regression model for the analysis of unmatched DNA methylation data using a network-based penalty. However, for popularly applied matched designs in epigenetic studies that compare DNA methylation between tumor and adjacent non-tumor tissues or between pre-treatment and post-treatment conditions, applying ordinary logistic regression ignoring matching is known to bring serious bias in estimation. In this paper, we developed a penalized conditional logistic model using the network-based penalty that encourages a grouping effect of (1) linked Cytosine-phosphate-Guanine (CpG) sites within a gene or (2) linked genes within a genetic pathway for analysis of matched DNA methylation data. In our simulation studies, we demonstrated the superiority of using conditional logistic model over unconditional logistic model in high-dimensional variable selection problems for matched case-control data. We further investigated the benefits of utilizing biological group or graph information for matched case-control data. We applied the proposed method to a genome-wide DNA methylation study on hepatocellular carcinoma (HCC) where we investigated the DNA methylation levels of tumor and adjacent non-tumor tissues from HCC patients by using the Illumina Infinium HumanMethylation27 Beadchip. Several new CpG sites and genes known to be related to HCC were identified but were missed by the standard method in the original paper.
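
To make the idea in the abstract concrete, the following is a minimal sketch of a network-penalized conditional logistic objective for 1:1 matched pairs (for example, tumor versus adjacent non-tumor tissue from the same patient). The abstract does not give the exact penalty, so the form used here, an L1 term plus a graph-Laplacian quadratic term, is an assumption based on common practice for graph-structured covariates. The function names, `lam1`, and `lam2` are hypothetical, and no optimizer is included.

```python
import numpy as np


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.

    `adj` is a symmetric 0/1 adjacency matrix with a zero diagonal that links
    CpG sites (or genes). Isolated nodes get an all-zero row and column.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    connected = deg > 0
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[connected] = 1.0 / np.sqrt(deg[connected])
    lap = -adj * inv_sqrt[:, None] * inv_sqrt[None, :]
    lap[np.diag_indices_from(lap)] = connected.astype(float)
    return lap


def paired_conditional_nll(beta, x_case, x_control):
    """Negative conditional log-likelihood for 1:1 matched pairs.

    Within each pair the stratum-specific intercept cancels, so each pair
    contributes -log sigmoid((x_case - x_control) @ beta).
    """
    eta = (x_case - x_control) @ beta
    return np.sum(np.logaddexp(0.0, -eta))


def network_penalized_objective(beta, x_case, x_control, lap, lam1, lam2):
    """Assumed objective: mean conditional loss + L1 + Laplacian penalty."""
    n_pairs = x_case.shape[0]
    loss = paired_conditional_nll(beta, x_case, x_control) / n_pairs
    sparsity = lam1 * np.sum(np.abs(beta))   # encourages variable selection
    smoothness = lam2 * (beta @ lap @ beta)  # encourages grouping of linked sites
    return loss + sparsity + smoothness


# Toy usage: three CpG sites linked in a chain, five matched pairs.
rng = np.random.default_rng(0)
adjacency = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
laplacian = normalized_laplacian(adjacency)
cases = rng.normal(size=(5, 3))
controls = rng.normal(size=(5, 3))
value = network_penalized_objective(
    np.zeros(3), cases, controls, laplacian, lam1=0.1, lam2=0.1
)
```

Because only the within-pair differences enter the likelihood, anything shared by both members of a pair drops out, which is why ignoring the matching and fitting an ordinary logistic model can bias the estimates. The Laplacian term vanishes when linked coefficients agree after scaling by the square root of node degree, which is one way a grouping effect can arise.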


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/06eb/4038397/bfac51e8f033/nihms482510f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/06eb/4038397/700608134e3a/nihms482510f1.jpg

Similar articles

1
Network-based regularization for matched case-control analysis of high-dimensional DNA methylation data.
Stat Med. 2013 May 30;32(12):2127-39. doi: 10.1002/sim.5694. Epub 2012 Dec 5.
2
Incorporating genetic networks into case-control association studies with high-dimensional DNA methylation data.
BMC Bioinformatics. 2019 Oct 22;20(1):510. doi: 10.1186/s12859-019-3040-x.
3
Penalized logistic regression for high-dimensional DNA methylation data with case-control studies.
Bioinformatics. 2012 May 15;28(10):1368-75. doi: 10.1093/bioinformatics/bts145. Epub 2012 Mar 30.
4
Exploring genome-wide DNA methylation profiles altered in hepatocellular carcinoma using Infinium HumanMethylation 450 BeadChips.
Epigenetics. 2013 Jan;8(1):34-43. doi: 10.4161/epi.23062. Epub 2012 Dec 3.
5
Penalized logistic regression based on L1/2 penalty for high-dimensional DNA methylation data.
Technol Health Care. 2020;28(S1):161-171. doi: 10.3233/THC-209016.
6
Genome-wide methylation analysis and epigenetic unmasking identify tumor suppressor genes in hepatocellular carcinoma.
Gastroenterology. 2013 Dec;145(6):1424-35.e1-25. doi: 10.1053/j.gastro.2013.08.055. Epub 2013 Sep 5.
7
Using Illumina Infinium HumanMethylation 450K BeadChip to explore genome‑wide DNA methylation profiles in a human hepatocellular carcinoma cell line.
Mol Med Rep. 2018 Nov;18(5):4446-4456. doi: 10.3892/mmr.2018.9441. Epub 2018 Sep 3.
8
Correlation of Infinium HumanMethylation450K and MethylationEPIC BeadChip arrays in cartilage.
Epigenetics. 2020 Jun-Jul;15(6-7):594-603. doi: 10.1080/15592294.2019.1700003. Epub 2019 Dec 13.
9
Systematic evaluation of DNA methylation age estimation with common preprocessing methods and the Infinium MethylationEPIC BeadChip array.
Clin Epigenetics. 2018 Oct 16;10(1):123. doi: 10.1186/s13148-018-0556-2.
10
pETM: a penalized Exponential Tilt Model for analysis of correlated high-dimensional DNA methylation data.
Bioinformatics. 2017 Jun 15;33(12):1765-1772. doi: 10.1093/bioinformatics/btx064.

Cited by

1
Weighted overlapping group lasso for integrating prior network knowledge into gene set analysis.
BMC Bioinformatics. 2025 Sep 1;26(1):226. doi: 10.1186/s12859-025-06170-9.
2
Causality-driven candidate identification for reliable DNA methylation biomarker discovery.
Nat Commun. 2025 Jan 15;16(1):680. doi: 10.1038/s41467-025-56054-y.
3
New statistical selection method for pleiotropic variants associated with both quantitative and qualitative traits.
BMC Bioinformatics. 2023 Oct 10;24(1):381. doi: 10.1186/s12859-023-05505-8.
4
Gene selection by incorporating genetic networks into case-control association studies.
Eur J Hum Genet. 2024 Mar;32(3):270-277. doi: 10.1038/s41431-022-01264-x. Epub 2022 Dec 19.
5
Molecular markers of risk of subsequent invasive breast cancer in women with ductal carcinoma in situ: protocol for a population-based cohort study.
BMJ Open. 2021 Oct 26;11(10):e053397. doi: 10.1136/bmjopen-2021-053397.
6
Gene-Environment Interaction: A Variable Selection Perspective.
Methods Mol Biol. 2021;2212:191-223. doi: 10.1007/978-1-0716-0947-7_13.
7
Genetic Diversity and Genome-Wide Association Study of Seed Aspect Ratio Using a High-Density SNP Array in Peanut ( L.).
Genes (Basel). 2020 Dec 22;12(1):2. doi: 10.3390/genes12010002.
8
Incorporating genetic networks into case-control association studies with high-dimensional DNA methylation data.
BMC Bioinformatics. 2019 Oct 22;20(1):510. doi: 10.1186/s12859-019-3040-x.
9
A Review of Matched-pairs Feature Selection Methods for Gene Expression Data Analysis.
Comput Struct Biotechnol J. 2018 Feb 25;16:88-97. doi: 10.1016/j.csbj.2018.02.005. eCollection 2018.
10
pETM: a penalized Exponential Tilt Model for analysis of correlated high-dimensional DNA methylation data.
Bioinformatics. 2017 Jun 15;33(12):1765-1772. doi: 10.1093/bioinformatics/btx064.

References

1
Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent.
J Stat Softw. 2011 Mar;39(5):1-13. doi: 10.18637/jss.v039.i05.
2
Variable selection and regression analysis for graph-structured covariates with an application to genomics.
Ann Appl Stat. 2010 Sep 1;4(3):1498-1516. doi: 10.1214/10-AOAS332.
3
Penalized logistic regression for high-dimensional DNA methylation data with case-control studies.
Bioinformatics. 2012 May 15;28(10):1368-75. doi: 10.1093/bioinformatics/bts145. Epub 2012 Mar 30.
4
Genome-wide DNA methylation profiles in hepatocellular carcinoma.
Hepatology. 2012 Jun;55(6):1799-808. doi: 10.1002/hep.25569. Epub 2012 Apr 24.
5
Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection.
Ann Appl Stat. 2011 Jan 1;5(1):232-253. doi: 10.1214/10-AOAS388.
6
Pseudosibship methods in the case-parents design.
Stat Med. 2011 Nov 30;30(27):3236-51. doi: 10.1002/sim.4397. Epub 2011 Sep 23.
7
Significance analysis and statistical dissection of variably methylated regions.
Biostatistics. 2012 Jan;13(1):166-78. doi: 10.1093/biostatistics/kxr013. Epub 2011 Jun 17.
8
Association of HLA-G 3' UTR 14-bp insertion/deletion polymorphism with hepatocellular carcinoma susceptibility in a Chinese population.
DNA Cell Biol. 2011 Dec;30(12):1027-32. doi: 10.1089/dna.2011.1238. Epub 2011 May 25.
9
Distinct DNA methylation patterns in cirrhotic liver and hepatocellular carcinoma.
Int J Cancer. 2012 Mar 15;130(6):1319-28. doi: 10.1002/ijc.26136. Epub 2011 Jul 21.
10
Incorporating biological pathways via a Markov random field model in genome-wide association studies.
PLoS Genet. 2011 Apr;7(4):e1001353. doi: 10.1371/journal.pgen.1001353. Epub 2011 Apr 7.