
COCαDA - a fast and scalable algorithm for interatomic contact detection in proteins using Cα distance matrices.

Author information

Lemos Rafael Pereira, Mariano Diego, Silveira Sabrina De Azevedo, de Melo-Minardi Raquel C

Affiliations

Laboratory of Bioinformatics and Systems, Department of Computer Science, Federal University of Minas Gerais, Belo Horizonte, Brazil.

Laboratory of Bioinformatics, Visualization and Systems, Department of Informatics, Federal University of Viçosa, Viçosa, Brazil.

Publication information

Front Bioinform. 2025 Sep 1;5:1630078. doi: 10.3389/fbinf.2025.1630078. eCollection 2025.

Abstract

Protein interatomic contacts, defined by spatial proximity and physicochemical complementarity at atomic resolution, are fundamental to characterizing molecular interactions and bonding. Methods for calculating contacts are generally categorized as cutoff-dependent, which rely on Euclidean distances, or cutoff-independent, which utilize Delaunay and Voronoi tessellations. While cutoff-dependent methods are recognized for their simplicity, completeness, and reliability, traditional implementations remain computationally expensive, posing significant scalability challenges in the current Big Data era of bioinformatics. Here, we introduce COCαDA (COntact search pruning by Cα Distance Analysis), a Python-based command-line tool for improving search pruning in large-scale interatomic protein contact analysis using alpha-carbon (Cα) distance matrices. COCαDA detects intra- and inter-chain contacts, and classifies them into seven different types: hydrogen and disulfide bonds; hydrophobic effects; attractive, repulsive, and salt-bridge interactions; and aromatic stackings. To evaluate our tool, we compared it with three traditional approaches in the literature: all-against-all atom distance calculation ("brute-force"), static Cα distance cutoff (SC), and Biopython's NeighborSearch class (NS). COCαDA demonstrated superior performance compared to the other methods, achieving on average 6x faster computation times than advanced data structures like k-d trees from NS, in addition to being simpler to implement and fully customizable. The presented tool facilitates exploratory and large-scale analyses of interatomic contacts in proteins in a simple and efficient manner, also enabling the integration of results with other tools and pipelines. The COCαDA tool is freely available at https://github.com/LBS-UFMG/COCaDA.
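The pruning idea in the abstract can be illustrated with a small sketch. It is not the COCαDA implementation: the abstract does not say how the tool chooses its Cα cutoffs, so the example uses a single fixed cutoff, which is closer to the static Cα cutoff (SC) baseline the authors compare against. The data layout, the function names `ca_coord` and `find_contacts`, and both cutoff values are assumptions made for illustration.

```python
import math
from itertools import combinations

# Illustrative assumptions, not COCαDA's actual parameters.
CA_CUTOFF = 12.0    # residue pairs with Cα atoms farther apart are skipped
ATOM_CUTOFF = 4.5   # atom pairs at or below this distance count as a contact


def ca_coord(residue):
    """Return the coordinates of the residue's alpha-carbon."""
    for name, xyz in residue:
        if name == "CA":
            return xyz
    raise ValueError("residue has no CA atom")


def find_contacts(residues, ca_cutoff=CA_CUTOFF, atom_cutoff=ATOM_CUTOFF):
    """Return atom pairs within atom_cutoff, examining only residue
    pairs whose Cα-Cα distance passes the ca_cutoff filter."""
    ca = [ca_coord(r) for r in residues]
    contacts = []
    for i, j in combinations(range(len(residues)), 2):
        # Pruning step: a single Cα-Cα distance decides whether the
        # all-atom comparison for this residue pair runs at all.
        if math.dist(ca[i], ca[j]) > ca_cutoff:
            continue
        for name_a, xyz_a in residues[i]:
            for name_b, xyz_b in residues[j]:
                d = math.dist(xyz_a, xyz_b)
                if d <= atom_cutoff:
                    contacts.append((i, name_a, j, name_b, round(d, 2)))
    return contacts


# Hypothetical toy structure: each residue is a list of (atom_name, xyz).
residues = [
    [("N", (0.0, 0.0, 0.0)), ("CA", (1.5, 0.0, 0.0)), ("O", (2.0, 1.2, 0.0))],
    [("N", (4.0, 1.0, 0.0)), ("CA", (5.0, 0.0, 0.0)), ("O", (6.0, 1.0, 0.0))],
    [("N", (40.0, 0.0, 0.0)), ("CA", (41.5, 0.0, 0.0)), ("O", (42.0, 1.2, 0.0))],
]
contacts = find_contacts(residues)
print(len(contacts))  # 6: all between residues 0 and 1; residue 2 is pruned
```

The third residue sits about 40 Å away, so its atoms are never compared against the others. The trade-off in any such filter is that a Cα cutoff set too low can discard residue pairs whose side-chain atoms are in fact in contact.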


