BioMethyl: an R package for biological interpretation of DNA methylation data.

Affiliations

Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA.

Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Lebanon, NH, USA.

Publication information

Bioinformatics. 2019 Oct 1;35(19):3635-3641. doi: 10.1093/bioinformatics/btz137.

Abstract

MOTIVATION

The accumulation of publicly available DNA methylation datasets has resulted in the need for tools to interpret the specific cellular phenotypes in bulk tissue data. Current approaches use either single differentially methylated CpG sites or differentially methylated regions that map to genes. However, these approaches may introduce biases in downstream analyses of biological interpretation, because of the variability in gene length. There is a lack of approaches to interpret DNA methylation effectively. Therefore, we have developed computational models to provide biological interpretation of relevant gene sets using DNA methylation data in the context of The Cancer Genome Atlas.

RESULTS

We show that Biological interpretation of DNA Methylation (BioMethyl) uses the complete DNA methylation data for a given cancer type to reflect the corresponding gene expression profiles and performs pathway enrichment analyses, providing unique biological insight. Using breast cancer as an example, the enriched biological pathways that BioMethyl identifies from DNA methylation data are highly consistent with those calculated from RNA sequencing (RNA-seq) data. For estrogen receptor (ER) positive samples, 12 of the 14 pathways identified by BioMethyl are shared with those identified using RNA-seq data, with a Jaccard score of 0.8. For ER negative samples, three pathways are shared between the two enrichments, with a slightly lower similarity (Jaccard score = 0.6). Using BioMethyl, we can successfully identify hidden biological pathways in DNA methylation data when gene expression profiles are lacking.
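The Jaccard score quoted above is the number of shared pathways divided by the number of distinct pathways across both enrichments. The following is a minimal Python sketch of that calculation, not code from the BioMethyl package. The pathway names are placeholders, and the size of the RNA-seq set (13) is an assumption, chosen because 12 shared pathways out of 14 + 13 - 12 = 15 distinct pathways gives the reported score of 0.8.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index |A & B| / |A | B|; defined here as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Placeholder pathway identifiers (hypothetical, not the pathways from the paper).
shared = {f"pathway_{i}" for i in range(12)}   # 12 pathways found by both methods
biomethyl_only = {"pathway_m1", "pathway_m2"}  # 2 found only from methylation data
rnaseq_only = {"pathway_r1"}                   # assumed: 1 found only from RNA-seq

biomethyl_pathways = shared | biomethyl_only   # 14 pathways in total
rnaseq_pathways = shared | rnaseq_only         # 13 pathways in total (assumption)

score = jaccard(biomethyl_pathways, rnaseq_pathways)
print(round(score, 2))  # 12 / 15
```

The same function applies to the ER negative comparison: three shared pathways with a score of 0.6 would correspond to five distinct pathways across the two enrichments, assuming the reported score is exact rather than rounded.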

AVAILABILITY AND IMPLEMENTATION

The BioMethyl R package is freely available from its GitHub repository (https://github.com/yuewangpanda/BioMethyl).

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
