AgIn: measuring the landscape of CpG methylation of individual repetitive elements.

Author information

Suzuki Yuta, Korlach Jonas, Turner Stephen W, Tsukahara Tatsuya, Taniguchi Junko, Qu Wei, Ichikawa Kazuki, Yoshimura Jun, Yurino Hideaki, Takahashi Yuji, Mitsui Jun, Ishiura Hiroyuki, Tsuji Shoji, Takeda Hiroyuki, Morishita Shinichi

Affiliations

Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8583, Japan.

Pacific Biosciences, Menlo Park, CA 94025, USA.

Publication information

Bioinformatics. 2016 Oct 1;32(19):2911-9. doi: 10.1093/bioinformatics/btw360. Epub 2016 Jun 17.

DOI:10.1093/bioinformatics/btw360

PMID:27318202

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5039925/

Abstract

MOTIVATION

Determining the methylation state of regions with high copy numbers is challenging for second-generation sequencing, because the read length is insufficient to map reads uniquely, especially when repetitive regions are long and nearly identical to each other. Single-molecule real-time (SMRT) sequencing is a promising method for observing such regions, because it is not vulnerable to GC bias, it produces long read lengths, and its kinetic information is sensitive to DNA modifications.

RESULTS

We propose a novel linear-time algorithm that combines the kinetic information for neighboring CpG sites and increases the confidence in identifying the methylation states of those sites. Using a practical read coverage of ∼30-fold from an inbred strain of medaka (Oryzias latipes), we observed that both the sensitivity and precision of our method on individual CpG sites were ∼93.7%. We also observed a high correlation coefficient (R = 0.884) between our method and bisulfite sequencing, and for 92.0% of CpG sites, the methylation levels, which range over [0,1], agreed to within an acceptable difference of 0.25. Using this method, we characterized the landscape of the methylation status of repetitive elements, such as LINEs, in the human genome, thereby revealing the strong correlation between CpG density and hypomethylation and detecting hypomethylation hot spots of LTRs and LINEs. We uncovered the methylation states for nearly identical active transposons: two novel LINE insertions of identity ∼99% and length 6050 base pairs (bp) in the human genome, and 16 Tol2 elements of identity >99.8% and length 4682 bp in the medaka genome.
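The evaluation quantities named in the results (sensitivity, precision, a correlation coefficient against bisulfite sequencing, and the fraction of CpG sites whose methylation levels agree within 0.25) can be illustrated with a minimal Python sketch. This is not code from AgIn: the function names, the input layout (parallel lists with one entry per CpG site), and the choice of which methylation state counts as the positive class are assumptions made only for illustration, and Pearson's coefficient is assumed for R.

```python
from math import sqrt

def sensitivity_precision(truth, called):
    """Sensitivity and precision of per-site calls.

    truth, called: parallel lists of booleans, one per CpG site.
    True marks the positive class (which state is "positive" is an
    assumption of this sketch, not something the abstract specifies).
    """
    pairs = list(zip(truth, called))
    tp = sum(1 for t, c in pairs if t and c)        # true positives
    fn = sum(1 for t, c in pairs if t and not c)    # missed positives
    fp = sum(1 for t, c in pairs if c and not t)    # false calls
    sens = tp / (tp + fn) if tp + fn else float("nan")
    prec = tp / (tp + fp) if tp + fp else float("nan")
    return sens, prec

def pearson_r(x, y):
    """Pearson correlation between two lists of methylation levels."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def concordance(x, y, tol=0.25):
    """Fraction of sites whose two methylation levels differ by at most tol."""
    return sum(1 for a, b in zip(x, y) if abs(a - b) <= tol) / len(x)
```

Given two lists of per-site methylation levels in [0,1], one from each assay, `pearson_r` and `concordance(..., tol=0.25)` give the kind of agreement summary reported above, while `sensitivity_precision` applies once the levels have been thresholded into binary calls.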

AVAILABILITY AND IMPLEMENTATION

AgIn (Aggregate on Intervals) is available at: https://github.com/hacone/AgIn

CONTACT

ysuzuki@cb.k.u-tokyo.ac.jp or moris@cb.k.u-tokyo.ac.jp

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
