BioMethyl: an R package for biological interpretation of DNA methylation data.

Affiliations

Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA.

Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Lebanon, NH, USA.

Publication information

Bioinformatics. 2019 Oct 1;35(19):3635-3641. doi: 10.1093/bioinformatics/btz137.

DOI:10.1093/bioinformatics/btz137

PMID:30799505

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6761945/

Abstract

MOTIVATION

The accumulation of publicly available DNA methylation datasets has resulted in the need for tools to interpret the specific cellular phenotypes in bulk tissue data. Current approaches use either single differentially methylated CpG sites or differentially methylated regions that map to genes. However, these approaches may introduce biases in downstream analyses of biological interpretation, because of the variability in gene length. There is a lack of approaches to interpret DNA methylation effectively. Therefore, we have developed computational models to provide biological interpretation of relevant gene sets using DNA methylation data in the context of The Cancer Genome Atlas.

RESULTS

We illustrate that Biological interpretation of DNA Methylation (BioMethyl) utilizes the complete DNA methylation data for a given cancer type to reflect corresponding gene expression profiles and performs pathway enrichment analyses, providing unique biological insight. Using breast cancer as an example, BioMethyl shows high consistency in the identification of enriched biological pathways from DNA methylation data compared to the results calculated from RNA sequencing data. We find that 12 out of 14 pathways identified by BioMethyl are shared with those identified using RNA-seq data, with a Jaccard score of 0.8 for estrogen receptor (ER) positive samples. For ER negative samples, three pathways are shared between the two enrichments, with a slightly lower similarity (Jaccard score = 0.6). Using BioMethyl, we can successfully identify hidden biological pathways in DNA methylation data when gene expression profiles are lacking.
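
The Jaccard scores reported here measure the overlap between two sets of enriched pathways: the size of their intersection divided by the size of their union. The following is a minimal Python sketch of that calculation, not code from the BioMethyl package. The pathway identifiers are hypothetical placeholders, and the set sizes (14 and 13 pathways with 12 shared) are only one configuration consistent with a score of 0.8, since the abstract does not state how many pathways the RNA-seq analysis identified.

```python
def jaccard(a, b):
    """Jaccard index of two collections: |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


# Hypothetical pathway IDs (placeholders, not the pathways from the paper).
methyl_pathways = {f"pathway_{i}" for i in range(1, 15)}  # 14 pathways: 1..14
rnaseq_pathways = {f"pathway_{i}" for i in range(3, 16)}  # 13 pathways: 3..15

shared = methyl_pathways & rnaseq_pathways  # pathways 3..14
score = jaccard(methyl_pathways, rnaseq_pathways)

print(len(shared), round(score, 2))
```

Because the union appears in the denominator, pathways found by only one of the two analyses lower the score even when most pathways are shared, which is why 12 shared pathways out of 14 can yield 0.8 rather than 12/14.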

AVAILABILITY AND IMPLEMENTATION

The BioMethyl R package is freely available in the GitHub repository (https://github.com/yuewangpanda/BioMethyl).

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
