Efficient computational model for classification of protein localization images using Extended Threshold Adjacency Statistics and Support Vector Machines.

Affiliations

College of Computing and Informatics, Saudi Electronic University, Al-Madinah Branch, Saudi Arabia.

Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences, Islamabad, Pakistan; Department of Computer Science, National University of Computer and Emerging Sciences, Peshawar Campus, Pakistan.

Publication information

Comput Methods Programs Biomed. 2018 Apr;157:205-215. doi: 10.1016/j.cmpb.2018.01.021. Epub 2018 Feb 2.

Abstract

BACKGROUND AND OBJECTIVE

Extracting discriminative, informative features is the core requirement for accurate and efficient classification of protein subcellular localization images, which in turn can make drug development more effective. The objective of this paper is to propose a novel modification of the Threshold Adjacency Statistics (TAS) technique that enhances its discriminative power.

METHODS

In this work, we apply Threshold Adjacency Statistics from a novel perspective to enhance its discriminative power and efficiency. Specifically, we use seven threshold ranges to produce seven distinct feature spaces, each of which is used to train one of seven support vector machines (SVMs). The final prediction is obtained by majority voting. The proposed ETAS-SubLoc system is tested on two benchmark datasets using 5-fold cross-validation.
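The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not list the seven threshold ranges, so the range used here is a made-up placeholder, and the SVM training step is left out, with per-classifier predictions given as stand-in labels. The function names `adjacency_statistics` and `majority_vote` are hypothetical. The sketch assumes the usual reading of adjacency statistics: binarize the image with a threshold range, count the "on" neighbors (0 to 8) of each "on" pixel, and normalize the nine counts.

```python
from collections import Counter


def adjacency_statistics(image, low, high):
    """Return nine normalized neighbor-count statistics for one threshold range.

    image is a 2D list of grayscale intensities. A pixel is "on" when
    low <= value <= high. Entry k of the result is the fraction of "on"
    pixels that have exactly k "on" neighbors among their 8 neighbors.
    """
    rows, cols = len(image), len(image[0])
    on = [[low <= value <= high for value in row] for row in image]
    counts = [0] * 9
    for r in range(rows):
        for c in range(cols):
            if not on[r][c]:
                continue
            neighbors = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and on[rr][cc]:
                        neighbors += 1
            counts[neighbors] += 1
    total = sum(counts)
    return [k / total for k in counts] if total else [0.0] * 9


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


# Toy 4x4 image with a bright 2x2 blob; the threshold range is an assumption.
image = [
    [0, 0, 0, 0],
    [0, 200, 200, 0],
    [0, 200, 200, 0],
    [0, 0, 0, 0],
]
features = adjacency_statistics(image, 100, 255)

# Stand-in predictions from seven classifiers, one per feature space.
votes = ["nucleus", "golgi", "nucleus", "nucleus", "golgi", "er", "nucleus"]
predicted = majority_vote(votes)
```

With seven voters, a class wins outright once it has four votes; with more than two classes a plurality can still tie, which this sketch resolves by first occurrence. How ETAS-SubLoc handles ties is not stated in the abstract.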

RESULTS

We observed that the proposed use of the TAS technique improved the discriminative power of the classifier. The ETAS-SubLoc system achieved 99.2% accuracy, 99.3% sensitivity and 99.1% specificity on the Endogenous dataset, outperforming the classical Threshold Adjacency Statistics technique. Similarly, it achieved 91.8% accuracy, 96.3% sensitivity and 91.6% specificity on the Transfected dataset.
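For reference, the three reported metrics follow the standard definitions from confusion-matrix counts, sketched below. The counts are invented for illustration and are not taken from the paper, and the abstract does not say how the per-class values were aggregated across localization classes, so this shows only the one-class-versus-rest case. The function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


# Illustrative counts only (not from the paper).
accuracy, sensitivity, specificity = binary_metrics(tp=90, fp=20, tn=80, fn=10)
```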

CONCLUSIONS

Simulation results validated the effectiveness of ETAS-SubLoc, which provides superior prediction performance compared to the existing technique. The proposed methodology aims to support the pharmaceutical industry and the research community in drug design and in innovation in bioinformatics and computational biology. The implementation code for replicating the experiments presented in this paper is available at: https://drive.google.com/file/d/0B7IyGPObWbSqRTRMcXI2bG5CZWs/view?usp=sharing.
