iSS-PC: Identifying Splicing Sites via Physical-Chemical Properties Using Deep Sparse Auto-Encoder.

Author information

Xu Zhao-Chun, Wang Peng, Qiu Wang-Ren, Xiao Xuan

Affiliations

Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen, 333403, China.

Department of Computer Science and Bond Life Science Center, University of Missouri, Columbia, MO, USA.

Publication information

Sci Rep. 2017 Aug 15;7(1):8222. doi: 10.1038/s41598-017-08523-8.

Abstract

Gene splicing is one of the most significant biological processes in eukaryotic gene expression. In RNA splicing, for example, a single pre-mRNA can yield one or more mature messenger RNAs whose coded information supports multiple biological functions. Identifying splicing sites in DNA/RNA sequences is therefore important for both biomedical research and the discovery of new drugs. Relying on experimental techniques alone is expensive and time-consuming, so new computational methods are needed. To identify splice donor sites and splice acceptor sites accurately and quickly, we constructed iSS-PC, a deep sparse auto-encoder model with two hidden layers, based on the minimum error law. Sequence samples were formulated by incorporating twelve physical-chemical properties of the dinucleotides within DNA into PseDNC (pseudo dinucleotide composition) through a series of cross-covariance and auto-covariance transformations. Five-fold cross-validation tests on the same benchmark datasets indicated that the new predictor remarkably outperformed the existing prediction methods in this field. The approach is also expected to be applicable to many other related problems. An easy-to-use web server for identifying splicing sites is freely accessible at: http://www.jci-bioinfo.cn/iSS-PC.
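The auto-covariance and cross-covariance transformations mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the commonly used lag-based definitions over the dinucleotide property tracks of a sequence, which may differ in detail from the formulation used in the paper. The table `TOY_PROPERTIES`, the function names, and the lag value are hypothetical, and the numeric values are placeholders rather than the twelve real physical-chemical properties.

```python
from itertools import product

# All 16 DNA dinucleotides: AA, AC, ..., TT.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]

# Placeholder property table (assumption): these are NOT real
# physical-chemical values, only deterministic toy numbers.
TOY_PROPERTIES = {
    "prop_a": {d: float(i) for i, d in enumerate(DINUCLEOTIDES)},
    "prop_b": {d: float((i * 7) % 16) for i, d in enumerate(DINUCLEOTIDES)},
}


def property_track(seq, table):
    """Map each overlapping dinucleotide of seq to its property value."""
    return [table[seq[i:i + 2]] for i in range(len(seq) - 1)]


def covariance(track_u, track_v, lag):
    """Lagged covariance between two property tracks.

    With track_u is track_v this is the auto-covariance (AC);
    with two different properties it is the cross-covariance (CC).
    """
    n = len(track_u) - lag
    if n <= 0:
        raise ValueError("lag must be smaller than the number of dinucleotides")
    mean_u = sum(track_u) / len(track_u)
    mean_v = sum(track_v) / len(track_v)
    total = sum(
        (track_u[i] - mean_u) * (track_v[i + lag] - mean_v) for i in range(n)
    )
    return total / n


def ac_cc_features(seq, properties, max_lag):
    """Concatenate AC and CC terms for lags 1..max_lag into one vector."""
    seq = seq.upper()
    tracks = {name: property_track(seq, t) for name, t in properties.items()}
    names = sorted(tracks)
    features = []
    for lag in range(1, max_lag + 1):  # auto-covariance block
        for u in names:
            features.append(covariance(tracks[u], tracks[u], lag))
    for lag in range(1, max_lag + 1):  # cross-covariance block
        for u in names:
            for v in names:
                if u != v:
                    features.append(covariance(tracks[u], tracks[v], lag))
    return features


features = ac_cc_features("ACGTACGTAGGTAAGT", TOY_PROPERTIES, max_lag=2)
print(len(features))
```

With N properties and a maximum lag of G, this layout yields N × N × G features: N × G auto-covariance terms plus N × (N − 1) × G cross-covariance terms. Such a fixed-length vector could then serve as input to a classifier; in this sketch, sequences containing characters other than A, C, G, or T raise a `KeyError`.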

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c3c5/5557945/bd1b9245d565/41598_2017_8523_Fig1_HTML.jpg