Use B-factor related features for accurate classification between protein binding interfaces and crystal packing contacts.

Author information

Liu Qian, Li Zhenhua, Li Jinyan

Publication information

BMC Bioinformatics. 2014;15 Suppl 16(Suppl 16):S3. doi: 10.1186/1471-2105-15-S16-S3. Epub 2014 Dec 8.

Abstract

BACKGROUND

Distinguishing true protein interactions from crystal packing contacts is important in structural bioinformatics, where the rapidly growing number of protein structures must be classified accurately. This expanding volume of data contains many unannotated crystal contacts as well as false annotations. Tools have been proposed to address this problem, but challenges remain, such as low performance when the training and test data mix interfaces with diverse contact-area sizes.

METHODS AND RESULTS

The B factor quantifies the vibrational motion of an atom and is a more relevant feature than interface size for characterizing protein binding. We propose three B-factor-related features for classifying biological interfaces versus crystal packing contacts. The first is the sum of the normalized B factors of the interfacial atoms in the contact area; the second is the average interfacial B factor per residue in the chain; the third is the average number of interfacial atoms with a negative normalized B factor per residue in the chain. We investigate the distributions of these basic features and of a compound feature on four datasets of biological binding and crystal packing, and on a binding-only dataset with known binding affinities. We also compare the cross-dataset classification performance of these features with existing methods and with interface area, a widely used feature and the most effective existing one. The results show that our features markedly outperform both the interface area approach and the existing prediction methods in many tests on all of these datasets.
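As a rough illustration of the three features, the following Python sketch computes them for a single chain. The abstract does not give the exact formulas, so several choices here are assumptions rather than the authors' method: B factors are normalized as z-scores within the chain, "per residue in the chain" is read as division by the total number of residues in the chain, and the names `normalize_b_factors`, `b_factor_features`, `is_interface` and `n_residues` are hypothetical.

```python
from statistics import mean, pstdev


def normalize_b_factors(b_factors):
    """Z-score the atomic B factors of one chain.

    ASSUMPTION: the abstract only says "normalized"; a per-chain z-score
    is one common choice and yields the negative values the third
    feature relies on.
    """
    mu = mean(b_factors)
    sigma = pstdev(b_factors)
    if sigma == 0:
        # All atoms have the same B factor; no atom deviates from the mean.
        return [0.0 for _ in b_factors]
    return [(b - mu) / sigma for b in b_factors]


def b_factor_features(b_factors, is_interface, n_residues):
    """Return the three B-factor-related features for one chain.

    b_factors    -- B factor of every atom in the chain
    is_interface -- parallel list of booleans, True for interfacial atoms
    n_residues   -- number of residues in the chain (ASSUMED denominator)
    """
    z = normalize_b_factors(b_factors)
    interfacial = [zi for zi, flag in zip(z, is_interface) if flag]

    # Feature 1: sum of normalized B factors over interfacial atoms.
    sum_norm_b = sum(interfacial)
    # Feature 2: interfacial B factor averaged per residue in the chain.
    avg_b_per_residue = sum_norm_b / n_residues
    # Feature 3: interfacial atoms with negative normalized B factor,
    # averaged per residue in the chain.
    n_negative = sum(1 for zi in interfacial if zi < 0)
    neg_atoms_per_residue = n_negative / n_residues

    return sum_norm_b, avg_b_per_residue, neg_atoms_per_residue


if __name__ == "__main__":
    # Toy chain: four atoms, two residues, the first two atoms interfacial.
    features = b_factor_features(
        b_factors=[10.0, 20.0, 30.0, 40.0],
        is_interface=[True, True, False, False],
        n_residues=2,
    )
    print(features)
```

In this toy input the interfacial atoms are the two most rigid ones, so the first two features come out negative. Which atoms count as interfacial would have to be decided separately, for example by a distance cutoff to the partner chain.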

CONCLUSIONS

The proposed B-factor-related features are more effective than interface area at distinguishing crystal packing from biological binding interfaces. Our computational methods have potential for large-scale, accurate identification of biological interactions in the experimentally determined structures stored in the PDB, whose interfaces may have diverse sizes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f58a/4290652/3e5f1d6010e1/1471-2105-15-S16-S3-1.jpg
