Improved recognition of splice sites in [species name missing] by incorporating secondary structure information into sequence-derived features: a computational study.

Authors

Meher Prabina Kumar, Satpathy Subhrajit

Affiliation

ICAR-Indian Agricultural Statistics Research Institute, New Delhi, 110012 India.

Publication information

3 Biotech. 2021 Nov;11(11):484. doi: 10.1007/s13205-021-03036-8. Epub 2021 Oct 31.

DOI: 10.1007/s13205-021-03036-8
PMID: 34790508
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8558126/
Abstract

Identifying splice sites is an important step in predicting gene structure. Most existing splice site prediction studies have successfully combined machine learning algorithms with sequence-derived features. However, splice site identification that incorporates secondary structure information is lacking, particularly in plant species. In this study, we therefore evaluated the effect of structural features on splice site prediction accuracy in [species name missing]. Prediction accuracy was evaluated with the sequence-derived features alone and with the structural features added to them, using a support vector machine (SVM) as the prediction algorithm. Both short (40 base pairs) and long (105 base pairs) sequence datasets were considered. After incorporating the secondary structure features, accuracy improved only for the long sequence dataset, and the improvement was larger for sequence-derived features that account for nucleotide dependencies. For the short sequence dataset, accuracy improved little or not at all. The performance of SVM was further compared with that of the LogitBoost, Random Forest (RF), AdaBoost and XGBoost machine learning methods. The prediction accuracies of SVM, AdaBoost and XGBoost were on par with one another and higher than those of RF and LogitBoost. When prediction used all the sequence-derived features together with the structural features, accuracy improved slightly compared with combining an individual sequence-based feature set with the structural features.
To the best of our knowledge, this is the first attempt at computational prediction of splice sites using machine learning methods that incorporate secondary structure information into sequence-derived features. All source code is available at https://github.com/meher861982/SSFeature.
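
The idea of appending structural features to sequence-derived features can be illustrated with a minimal sketch. It is not the authors' code (see the repository above for that). It assumes dinucleotide frequencies as one example of a sequence-derived feature that captures dependencies between adjacent nucleotides. It also assumes that a dot-bracket secondary structure string of the same length has already been produced by an external folding tool (for example, RNAfold). The function names and the two structural summaries (paired fraction and longest unpaired run) are hypothetical choices made for illustration only.

```python
from itertools import product
from typing import List

NUCLEOTIDES = "ACGT"
# All 16 ordered pairs of adjacent nucleotides: AA, AC, ..., TT
DINUCLEOTIDES = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]


def dinucleotide_frequencies(seq: str) -> List[float]:
    """Relative frequency of each of the 16 adjacent-nucleotide pairs."""
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing N or other ambiguity codes
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DINUCLEOTIDES]


def structure_features(dot_bracket: str) -> List[float]:
    """Hypothetical structural summary of a dot-bracket string:
    [fraction of paired positions, longest unpaired run / length]."""
    n = len(dot_bracket)
    if n == 0:
        return [0.0, 0.0]
    paired = sum(ch in "()" for ch in dot_bracket)
    # Turn brackets into separators so only runs of '.' remain
    unpaired_runs = dot_bracket.replace("(", " ").replace(")", " ").split()
    longest = max((len(run) for run in unpaired_runs), default=0)
    return [paired / n, longest / n]


def combined_features(seq: str, dot_bracket: str) -> List[float]:
    """Concatenate sequence-derived and structural features into one vector."""
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure must have the same length")
    return dinucleotide_frequencies(seq) + structure_features(dot_bracket)


if __name__ == "__main__":
    # Toy 8-nucleotide window; real windows in the study are 40 or 105 bp
    vector = combined_features("ACGTACGT", "((....))")
    print(len(vector), vector[-2:])
```

A vector built this way could be passed to any of the classifiers named in the abstract. Training once on the sequence-only part and once on the combined vector mirrors the with and without comparison the study describes.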

SUPPLEMENTARY INFORMATION

The online version contains supplementary material available at 10.1007/s13205-021-03036-8.
