Association testing of bisulfite-sequencing methylation data via a Laplace approximation.

Affiliations

Statistics Department, Tel Aviv University, Tel Aviv, Israel.

Computer Science Department, Technion - Israel Institute of Technology, Haifa, Israel.

Publication information

Bioinformatics. 2017 Jul 15;33(14):i325-i332. doi: 10.1093/bioinformatics/btx248.

DOI:10.1093/bioinformatics/btx248

PMID:28881982

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5870555/

Abstract

MOTIVATION

Epigenome-wide association studies can provide novel insights into the regulation of genes involved in traits and diseases. The rapid emergence of bisulfite-sequencing technologies enables performing such genome-wide studies at the resolution of single nucleotides. However, analysis of data produced by bisulfite-sequencing poses statistical challenges owing to low and uneven sequencing depth, as well as the presence of confounding factors. The recently introduced Mixed model Association for Count data via data AUgmentation (MACAU) can address these challenges via a generalized linear mixed model when confounding can be encoded via a single variance component. However, MACAU cannot be used in the presence of multiple variance components. Additionally, MACAU uses a computationally expensive Markov Chain Monte Carlo (MCMC) procedure, which cannot directly approximate the model likelihood.

RESULTS

We present a new method, Mixed model Association via a Laplace ApproXimation (MALAX), that is more computationally efficient than MACAU and can model multiple variance components. MALAX uses a Laplace approximation rather than MCMC-based approximations, which makes it possible to approximate the model likelihood directly. Through an extensive analysis of simulated and real data, we demonstrate that MALAX successfully addresses the statistical challenges introduced by bisulfite-sequencing while controlling for complex sources of confounding, and can be over 50% faster than the state of the art.
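
To make the idea of a Laplace approximation concrete, the following is a minimal sketch for a generic binomial mixed model; it is not the MALAX implementation, and every name in it is hypothetical. It assumes methylated read counts y_i out of r_i total reads with a logit link, a known fixed-effect contribution `offset`, and a random effect g ~ N(0, cov), where `cov` is a single positive-definite covariance matrix (with several variance components it would be their weighted sum). The integral over g is approximated by a Gaussian centred at the posterior mode, which Newton's method locates.

```python
import math
import numpy as np


def binomial_loglik(y, r, eta):
    """Binomial log-likelihood of y methylated reads out of r, with logit-scale predictor eta."""
    log_choose = np.array([
        math.lgamma(ri + 1) - math.lgamma(yi + 1) - math.lgamma(ri - yi + 1)
        for yi, ri in zip(y, r)
    ])
    # log(1 + exp(eta)) is computed stably with logaddexp.
    return float(np.sum(log_choose + y * eta - r * np.logaddexp(0.0, eta)))


def laplace_log_marginal(y, r, offset, cov, n_iter=100, tol=1e-10):
    """Laplace approximation of log p(y) = log of the integral of p(y | g) N(g; 0, cov) over g.

    Hypothetical helper for illustration only (not part of MALAX).
    """
    y = np.asarray(y, dtype=float)
    r = np.asarray(r, dtype=float)
    offset = np.asarray(offset, dtype=float)
    cov = np.asarray(cov, dtype=float)
    eye = np.eye(len(y))

    # Newton iterations for the mode of log p(y | g) - 0.5 * g' cov^{-1} g.
    g = np.zeros(len(y))
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(offset + g)))
        grad = y - r * p            # gradient of the binomial log-likelihood
        w = r * p * (1.0 - p)       # diagonal of the negative Hessian, W
        # Newton step: g_new = (I + cov W)^{-1} cov (W g + grad)
        g_new = np.linalg.solve(eye + cov * w, cov @ (w * g + grad))
        converged = np.max(np.abs(g_new - g)) < tol
        g = g_new
        if converged:
            break

    # Evaluate the approximation at the mode:
    # log p(y | g) - 0.5 * g' cov^{-1} g - 0.5 * log det(I + cov W)
    p = 1.0 / (1.0 + np.exp(-(offset + g)))
    w = r * p * (1.0 - p)
    quad = g @ np.linalg.solve(cov, g)
    _, logdet = np.linalg.slogdet(eye + cov * w)
    return binomial_loglik(y, r, offset + g) - 0.5 * quad - 0.5 * logdet


if __name__ == "__main__":
    # Toy data: three samples at one site, with an assumed covariance matrix.
    cov = np.array([[1.0, 0.3, 0.0],
                    [0.3, 1.0, 0.2],
                    [0.0, 0.2, 1.0]])
    print(laplace_log_marginal([3, 7, 1], [10, 12, 4], [0.0, 0.0, 0.0], cov))
```

Because the result is a closed-form function of the covariance matrix, it can in principle be handed to a numerical optimizer to fit variance components, which a sampling-based procedure does not offer as directly. The sketch uses dense linear algebra throughout and makes no attempt at the efficiency a real implementation would need.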

AVAILABILITY AND IMPLEMENTATION

The full source code of MALAX is available at https://github.com/omerwe/MALAX .

CONTACT

omerw@cs.technion.ac.il or ehalperin@cs.ucla.edu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
