Incorporating evolutionary information and functional domains for identifying RNA splicing factors in humans.

Affiliation

Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu, Taiwan.

Publication information

PLoS One. 2011;6(11):e27567. doi: 10.1371/journal.pone.0027567. Epub 2011 Nov 16.

Abstract

Regulation of pre-mRNA splicing is achieved through the interaction of RNA sequence elements with a variety of RNA-splicing related proteins (splicing factors). The splicing machinery in humans is not yet fully elucidated, partly because human splicing factors have not been exhaustively identified. Furthermore, experimental methods for splicing factor identification are time-consuming and labor-intensive. Although many computational methods have been proposed for the identification of RNA-binding proteins, none so far focuses specifically on RNA-splicing related proteins. We were therefore motivated to design a method for identifying human splicing factors, built on experimentally verified splicing factors. An investigation of amino acid composition reveals remarkable differences between splicing factors and non-splicing proteins. A support vector machine (SVM) is used to construct a predictive model, and five-fold cross-validation indicates that the SVM model trained with amino acid composition achieves a promising accuracy (80.22%). Another basic feature, amino acid dipeptide composition, yields predictive performance similar to that of amino acid composition. In addition, this work shows that incorporating evolutionary information and domain information can improve the predictive performance. The constructed models classify an independent data set of human splicing factors with 73.65% accuracy. The result of independent testing indicates that in silico identification could be a feasible means of conducting preliminary analyses of splicing factors and significantly reducing the number of potential targets that require further in vivo or in vitro confirmation.
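The two basic features named in the abstract, amino acid composition and dipeptide composition, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the handling of non-standard residues, and the toy sequence are assumptions made for illustration, and the abstract does not state the SVM kernel or parameters, so the classifier itself is left out.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in a fixed order so every feature vector aligns.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return a 20-dimensional vector of residue fractions.

    Assumption: non-standard characters (e.g. 'X') are simply ignored.
    """
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    counts = Counter(residues)
    return [counts[aa] / len(residues) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return a 400-dimensional vector of adjacent-pair fractions.

    Assumption: a pair is counted only if both residues are standard.
    """
    seq = seq.upper()
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    if not pairs:
        return [0.0] * len(DIPEPTIDES)
    counts = Counter(pairs)
    return [counts[dp] / len(pairs) for dp in DIPEPTIDES]


if __name__ == "__main__":
    toy = "RSRSPRRSAAC"  # hypothetical sequence, not taken from the paper
    aac = amino_acid_composition(toy)
    dpc = dipeptide_composition(toy)
    print(len(aac), len(dpc))
```

Vectors like these, one per protein, could then be passed to any SVM implementation and scored with five-fold cross-validation, which is the evaluation scheme the abstract reports.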

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f30d/3217973/f7f467a65cae/pone.0027567.g001.jpg
