pDHS-ELM: computational predictor for plant DNase I hypersensitive sites based on extreme learning machines.

Affiliations

Engineering Research Center of Internet of Things Technology Applications (Ministry of Education), School of Internet of Things Engineering, Jiangnan University, Wuxi, 214122, Jiangsu, China.

School of Medicine and Pharmaceuticals, Jiangnan University, Wuxi, 214122, Jiangsu, China.

Publication information

Mol Genet Genomics. 2018 Aug;293(4):1035-1049. doi: 10.1007/s00438-018-1436-3. Epub 2018 Mar 29.

DOI: 10.1007/s00438-018-1436-3
PMID: 29594496

Abstract

DNase I hypersensitive sites (DHSs) are hallmarks of chromatin regions that contain transcriptional regulatory elements, which makes them critical for understanding the regulatory mechanisms of gene expression. Although high-throughput techniques have identified large numbers of DHSs in plant genomes, the experimentally derived DHSs cover only a fraction of plant species and cellular processes. These experimental methods are also time-consuming and expensive, so automated computational methods that predict plant DHSs efficiently and accurately are needed. Several such methods have been proposed recently, but all of them take a long time to build the model, which makes them unsuitable for very large data sets. In the present work, a new ensemble model based on extreme learning machines (ELMs), called pDHS-ELM, is proposed to predict DHSs in plant genomes by fusing two different modes of pseudo-nucleotide composition. Two kinds of features, reverse-complement k-mer and pseudo-nucleotide composition, are used to represent the DHSs. ELMs serve as the base classifiers, and an ensemble framework combines their outputs. When applied to DHSs in the Arabidopsis thaliana and rice (Oryza sativa) genomes, the proposed method reached accuracies of up to 88.48% and 87.58%, respectively. Compared with state-of-the-art techniques, pDHS-ELM achieved higher sensitivity, specificity, and Matthews correlation coefficient with much less training and test time. Using pDHS-ELM, we identified 42,370 and 103,979 DHSs in the A. thaliana and rice genomes, respectively. The predicted DHSs were depleted of bulk nucleosomes and were tightly associated with transcription factors: approximately 90% of the predicted DHSs overlapped with transcription factors. We also demonstrated that the predicted DHSs were associated with DNA methylation, nucleosome positioning/occupancy, and histone modification. These results suggest that pDHS-ELM is a promising and powerful tool for analyzing transcriptional regulatory elements. The pDHS-ELM tool is available on GitHub: https://github.com/shanxinzhang/pDHS-ELM/
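
The pipeline described in the abstract (sequence features, ELM base classifiers, an ensemble over their outputs) can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several assumptions that the abstract does not confirm: only reverse-complement k-mer features are used (pseudo-nucleotide composition is omitted because it needs physicochemical property tables not given here), one base classifier is trained per k, the hidden layer is a sigmoid with random Gaussian weights, the output weights come from a ridge-regularised least-squares solve, and the ensemble simply averages decision scores. All function and class names (`revc_kmer_features`, `ELM`, `train_ensemble`, `predict_ensemble`) are hypothetical.

```python
from itertools import product

import numpy as np

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def revc_kmer_vocab(k):
    """Sorted canonical k-mers: a k-mer and its reverse complement share one slot."""
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return sorted({min(km, reverse_complement(km)) for km in kmers})


def revc_kmer_features(seq, k):
    """Normalised reverse-complement k-mer frequencies for one sequence."""
    seq = seq.upper()
    index = {km: i for i, km in enumerate(revc_kmer_vocab(k))}
    counts = np.zeros(len(index))
    for i in range(len(seq) - k + 1):
        km = seq[i:i + k]
        slot = index.get(min(km, reverse_complement(km)))
        if slot is not None:  # windows containing N or other symbols are skipped
            counts[slot] += 1
    total = counts.sum()
    return counts / total if total else counts


class ELM:
    """Single-hidden-layer ELM: random fixed hidden weights, solved output weights."""

    def __init__(self, n_hidden=50, ridge=1e-3, seed=0):
        self.n_hidden, self.ridge, self.seed = n_hidden, ridge, seed

    def _hidden(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        self.W = rng.normal(size=(X.shape[1], self.n_hidden))  # never trained
        self.b = rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Only the output weights are fitted, in closed form (assumed ridge solve).
        A = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ y)
        return self

    def decision(self, X):
        return self._hidden(X) @ self.beta


def _featurise(seqs, k):
    return np.array([revc_kmer_features(s, k) for s in seqs])


def train_ensemble(seqs, labels, ks=(2, 3), n_hidden=50):
    """Train one ELM per k; labels are +1 (DHS) or -1 (non-DHS)."""
    y = np.asarray(labels, dtype=float)
    return [(k, ELM(n_hidden, seed=seed).fit(_featurise(seqs, k), y))
            for seed, k in enumerate(ks)]


def predict_ensemble(models, seqs):
    """Average the base classifiers' scores and threshold at zero (assumed rule)."""
    scores = [model.decision(_featurise(seqs, k)) for k, model in models]
    return np.where(np.mean(scores, axis=0) >= 0, 1, -1)
```

A call such as `models = train_ensemble(train_seqs, train_labels)` followed by `predict_ensemble(models, new_seqs)` returns an array of +1/-1 labels. Because the hidden weights stay fixed and training reduces to one linear solve, fitting is fast, which is consistent with the short training time the abstract reports for ELM-based models.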

