iCpG-Pos: an accurate computational approach for identification of CpG sites using positional features on single-cell whole genome sequence data.

Affiliations

Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea.

College of Information Technology in the United Arab Emirates University (UAEU), Abu Dhabi 15551, UAE.

Publication information

Bioinformatics. 2023 Aug 1;39(8). doi: 10.1093/bioinformatics/btad474.

DOI:10.1093/bioinformatics/btad474
PMID:37555812
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10444964/
Abstract

MOTIVATION

The investigation of DNA methylation can shed light on the processes underlying human well-being and help determine overall human health. However, insufficient coverage makes it challenging to implement single-stranded DNA methylation sequencing technologies, highlighting the need for an efficient prediction model. Models are required to create an understanding of the underlying biological systems and to project single-cell (methylated) data accurately.

RESULTS

In this study, we developed positional features for predicting CpG sites. Positional characteristics of the sequence are derived using data from CpG regions and the separation between nearby CpG sites. Multiple optimized classifiers and different ensemble learning approaches are evaluated. The OPTUNA framework is used to optimize the algorithms. The CatBoost algorithm followed by the stacking algorithm outperformed existing DNA methylation identifiers.
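
The abstract does not spell out how the positional features are computed, so the following is only an illustrative sketch of one plausible reading of "the separation between nearby CpG sites": for a target site, the distances to its k nearest CpG neighbours on each side. The function name `neighbor_distance_features`, the parameter `k`, and the use of `None` for missing neighbours are assumptions made for this example, not the authors' implementation.

```python
from bisect import bisect_left


def neighbor_distance_features(cpg_positions, target, k=2):
    """Distances from `target` to its k nearest CpG sites on each side.

    Hypothetical helper for illustration only. Returns a list of length
    2*k: upstream distances (nearest first), then downstream distances
    (nearest first). Missing neighbours are reported as None.
    """
    positions = sorted(set(cpg_positions))
    i = bisect_left(positions, target)

    # Sites strictly upstream of the target, nearest first.
    upstream = positions[:i][::-1][:k]

    # Sites strictly downstream; skip the target itself if it is listed.
    j = i + 1 if i < len(positions) and positions[i] == target else i
    downstream = positions[j:j + k]

    up = [target - p for p in upstream] + [None] * (k - len(upstream))
    down = [p - target for p in downstream] + [None] * (k - len(downstream))
    return up + down


# Made-up coordinates, not data from the study.
sites = [100, 130, 180, 260, 300]
print(neighbor_distance_features(sites, 180))  # [50, 80, 80, 120]
```

A vector like this could then be passed to any tabular classifier. The abstract reports that CatBoost followed by a stacking ensemble performed best, with hyperparameters tuned using OPTUNA.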

AVAILABILITY AND IMPLEMENTATION

The data and methodologies used in this study are openly accessible to the research community. Researchers can access the positional features and algorithms used for predicting CpG site methylation patterns. To achieve superior performance, we employed the CatBoost algorithm followed by the stacking algorithm, which outperformed existing DNA methylation identifiers. The proposed iCpG-Pos approach utilizes only positional features, resulting in a substantial reduction in computational complexity compared to other known approaches for detecting CpG site methylation patterns. In conclusion, our study introduces a novel approach, iCpG-Pos, for predicting CpG site methylation patterns. By focusing on positional features, our model offers both accuracy and efficiency, making it a promising tool for advancing DNA methylation research and its applications in human health and well-being.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c106/10444964/ae42c47186d4/btad474f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c106/10444964/cd925bafd3e1/btad474f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c106/10444964/0161bb4a625c/btad474f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c106/10444964/8f794071ee48/btad474f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c106/10444964/92149e408f01/btad474f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c106/10444964/9596c7d3dcd6/btad474f6.jpg
