An automatic representation of peptides for effective antimicrobial activity classification.

Author information

Beltran Jesus A, Del Rio Gabriel, Brizuela Carlos A

Affiliations

Computer Science Department, Cicese Research Center, Ensenada, Baja California 22860, Mexico.

Department of Biochemistry and Structural Biology, Instituto de Fisiologia Celular, Universidad Nacional Autónoma de México, 04510, Mexico.

Publication information

Comput Struct Biotechnol J. 2020 Feb 26;18:455-463. doi: 10.1016/j.csbj.2020.02.002. eCollection 2020.

Abstract

Antimicrobial peptides (AMPs) are a promising alternative to small-molecule-based antibiotics. These peptides are part of the innate defense system of most living organisms. To computationally identify new AMPs among the peptides these organisms produce, an automatic AMP/non-AMP classifier is required. An efficient classifier, in turn, requires a robust set of features that captures what differentiates an AMP from a non-AMP. However, the number of candidate descriptors is too large (on the order of thousands) to allow an exhaustive search of all possible combinations. Therefore, efficient and effective feature selection techniques are required. In this work, we propose an efficient wrapper technique to solve the feature selection problem for AMP identification. The method is based on a Genetic Algorithm that uses a variable-length chromosome to represent the selected features and an objective function that considers the Matthews Correlation Coefficient (MCC) and the number of selected features. Computational experiments show that the proposed method can produce competitive results in terms of sensitivity, specificity, and MCC. Furthermore, the best classification results are achieved using only 39 out of 272 molecular descriptors.
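The abstract states only that the objective function considers the MCC and the number of selected features, and that chromosomes have variable length. The Python sketch below illustrates one way those ingredients could fit together. It is not the authors' implementation: the linear penalty, the `penalty` weight, and the helper names `fitness`, `random_chromosome`, and `mutate` are assumptions made for illustration. In a real wrapper method the confusion-matrix counts would come from training and validating a classifier on the selected descriptors, a step that is omitted here.

```python
import math
import random


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column sum is zero
    return (tp * tn - fp * fn) / denom


def fitness(mcc_value, n_selected, n_total, penalty=0.1):
    """Hypothetical objective: reward MCC, penalize the fraction of
    descriptors used. The linear form and weight are assumptions."""
    return mcc_value - penalty * (n_selected / n_total)


def random_chromosome(n_total, rng, max_len=None):
    """Variable-length chromosome: a sorted list of unique descriptor indices."""
    max_len = max_len or n_total
    length = rng.randint(1, max_len)
    return sorted(rng.sample(range(n_total), length))


def mutate(chromosome, n_total, rng):
    """Add or drop one descriptor index, so the chromosome length changes."""
    genes = set(chromosome)
    unused = [i for i in range(n_total) if i not in genes]
    can_add = bool(unused)
    can_drop = len(genes) > 1  # never produce an empty feature set
    if can_add and (not can_drop or rng.random() < 0.5):
        genes.add(rng.choice(unused))
    elif can_drop:
        genes.remove(rng.choice(sorted(genes)))
    return sorted(genes)


if __name__ == "__main__":
    rng = random.Random(0)
    parent = random_chromosome(272, rng)
    child = mutate(parent, 272, rng)
    print(len(parent), len(child))
    # Illustrative counts only, not results from the paper.
    print(fitness(mcc(tp=90, tn=85, fp=15, fn=10), len(child), 272))
```

With this form of the objective, two feature subsets that reach the same MCC are ranked by size, so the search is nudged toward compact descriptor sets such as the 39-of-272 subset reported in the abstract.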

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fe32/7063200/ac942a2074f2/ga1.jpg