Department of Systems and Computer Science, Howard University, 2400 Sixth Street, NW, Washington, DC 20059, USA.
Amino Acids. 2010 Aug;39(3):713-26. doi: 10.1007/s00726-010-0506-6. Epub 2010 Feb 18.
Protein domains are the structural and fundamental functional units of proteins. Knowledge of protein domain boundaries helps in understanding the evolution, structures and functions of proteins, and also plays an important role in protein classification. In this paper, we propose a support vector regression-based method for protein domain boundary identification that uses novel input profiles extracted from the AAindex database. Our method achieves an average sensitivity of approximately 36.5% and an average specificity of approximately 81% for multi-domain protein chains, which is overall better than the performance of published approaches for identifying domain boundaries. Because it uses sequence information alone, our method is also simpler and faster.
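As a rough illustration of what a sequence-only input profile might look like, the sketch below turns a protein sequence into one fixed-width feature vector per residue by looking up an amino acid index over a sliding window. It is not the authors' implementation: the abstract does not say which AAindex entries, window size, or padding scheme were used, so the Kyte-Doolittle hydropathy scale (AAindex entry KYTJ820101), the function name `window_profile`, and the parameters `half_window` and `pad` are all assumptions chosen for the example.

```python
# Kyte-Doolittle hydropathy values (AAindex entry KYTJ820101).
# Assumption: used only as an example index; the paper's indices are not given in the abstract.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_profile(sequence, index=KYTE_DOOLITTLE, half_window=2, pad=0.0):
    """Return one feature vector per residue (hypothetical helper).

    Each vector holds the index values of the residues in a window of
    width 2 * half_window + 1 centered on that residue. Positions that
    fall outside the sequence, and residues missing from the index,
    are filled with `pad`.
    """
    values = [index.get(aa, pad) for aa in sequence.upper()]
    padded = [pad] * half_window + values + [pad] * half_window
    width = 2 * half_window + 1
    return [padded[i:i + width] for i in range(len(values))]


# Example: a five-residue sequence with a three-residue window.
profile = window_profile("MKVLA", half_window=1)
for residue, features in zip("MKVLA", profile):
    print(residue, features)
```

In a setup like the one described, vectors of this kind could be passed to a support vector regressor (for example `sklearn.svm.SVR`) trained to output a per-residue boundary score, though the target definition and model settings would have to come from the full paper.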