A tri-gram based feature extraction technique using linear probabilities of position specific scoring matrix for protein fold recognition.

Publication information

IEEE Trans Nanobioscience. 2014 Mar;13(1):44-50. doi: 10.1109/TNB.2013.2296050.

Abstract

In the biological sciences, determining the three-dimensional structure of a protein from its sequence is considered an important and challenging task. Identifying the protein fold from the primary sequence is an intermediate step toward discovering that structure. This can be done by applying a feature extraction technique to capture the relevant information and then employing a suitable classifier to label an unknown protein. Several feature extraction techniques have been developed in the past, but their recognition accuracy has been limited. In this study, we develop a feature extraction technique based on tri-grams computed directly from Position Specific Scoring Matrices (PSSMs). The effectiveness of the technique is demonstrated on two benchmark datasets. The proposed technique improves protein fold recognition accuracy by up to 4.4% compared with state-of-the-art feature extraction techniques.
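
The abstract does not give the formula, so the following is only a minimal sketch of one plausible reading of "tri-grams computed directly from Position Specific Scoring Matrices". It assumes the PSSM has already been converted to linear probabilities (one row per residue position, one column per amino acid) and that each tri-gram feature sums, over all windows of three consecutive positions, the product of the three corresponding probabilities. The function name `trigram_features` and the flattened output layout are hypothetical choices for illustration, not taken from the paper.

```python
def trigram_features(pssm):
    """Return a flat list of K**3 tri-gram features.

    pssm: list of L rows, each a list of K linear probabilities
          (assumed input format; K is 20 for the standard amino acids).
    Feature (m, n, r) is stored at index (m * K + n) * K + r and equals
    the sum over positions i of pssm[i][m] * pssm[i+1][n] * pssm[i+2][r].
    """
    n_rows = len(pssm)
    k = len(pssm[0]) if n_rows else 0
    feats = [0.0] * (k ** 3)
    # Slide a window of three consecutive positions along the sequence.
    for i in range(n_rows - 2):
        first, second, third = pssm[i], pssm[i + 1], pssm[i + 2]
        for m in range(k):
            for n in range(k):
                pair = first[m] * second[n]
                base = (m * k + n) * k
                for r in range(k):
                    feats[base + r] += pair * third[r]
    return feats
```

With 20 amino-acid columns this produces a fixed-length vector of 20 × 20 × 20 = 8000 values regardless of sequence length, which is what lets sequences of different lengths be fed to the same classifier.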
