Spatiotemporal multistage consensus clustering in molecular dynamics studies of large proteins.

Author information

Kenn Michael, Ribarics Reiner, Ilieva Nevena, Cibena Michael, Karch Rudolf, Schreiner Wolfgang

Affiliations

Section of Biosimulation and Bioinformatics, Center for Medical Statistics, Informatics and Intelligent Systems (CeMSIIS), Medical University of Vienna, Spitalgasse 23, A-1090 Vienna, Austria.

Institute of Information and Communication Technologies (IICT), Bulgarian Academy of Sciences, 25A, Acad. G. Bonchev Str., Sofia 1113, Bulgaria.

Publication information

Mol Biosyst. 2016 Apr 26;12(5):1600-14. doi: 10.1039/c5mb00879d.

Abstract

The aim of this work is to find semi-rigid domains within large proteins that can serve as reference structures for fitting molecular dynamics trajectories. We propose an algorithm, multistage consensus clustering (MCC), that uses the minimum variation of distances between pairs of Cα-atoms as its target function. The whole dataset (trajectory) is split into sub-segments. For a given sub-segment, spatial clustering is started repeatedly from different random seeds, and we adopt the spatial clustering with the minimum target function; this process is stage 1 of MCC. Then, in stage 2, the results of spatial clustering are consolidated to arrive at domains that are stable over the whole dataset. We found that MCC is robust with regard to the choice of parameters and yields relevant information on functional domains of the major histocompatibility complex (MHC) studied in this paper: the α-helices and β-floor of the MHC proved to be the most flexible parts and did not contribute to clusters of significant size. Three alleles of the MHC, each in complex with the ABCD3 peptide and the LC13 T-cell receptor (TCR), yielded different patterns of motion. The alleles that cause immunological allo-reactions showed distinct correlations of motion between parts of the peptide, the binding cleft and the complementarity-determining region (CDR) loops of the TCR. Multistage consensus clustering reflected functional differences between MHC alleles and provides a methodological basis for increasing the sensitivity of functional analyses of bio-molecules. Owing to the generality of the approach, MCC may also serve as a potent tool for the analysis of other kinds of big data.
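
The two stages described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact target function, clustering procedure or consolidation rule, so everything below is an assumption made for illustration: the variation of a Cα-Cα distance is taken to be its standard deviation over the frames of a sub-segment, the spatial clustering is a simple k-medoids-style loop, and stage 2 keeps atoms together only if they share a cluster in at least a given fraction of sub-segments. All function and parameter names (`distance_std`, `cluster_once`, `stage1`, `stage2`, `mcc`, `k`, `n_restarts`, `threshold`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def distance_std(coords):
    """Std over frames of every pairwise distance; coords is (frames, atoms, 3)."""
    diff = coords[:, :, None, :] - coords[:, None, :, :]
    return np.linalg.norm(diff, axis=-1).std(axis=0)

def target(labels, sd):
    """Assumed target function: summed distance variation within clusters."""
    same = labels[:, None] == labels[None, :]
    return sd[same].sum() / 2.0  # each pair is counted twice; diagonal is 0

def cluster_once(sd, k, rng, n_iter=20):
    """One k-medoids-style clustering run started from random seed atoms."""
    medoids = rng.choice(sd.shape[0], size=k, replace=False)
    for _ in range(n_iter):
        labels = sd[:, medoids].argmin(axis=1)
        new = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                within = sd[np.ix_(members, members)].sum(axis=1)
                new[c] = members[within.argmin()]
        if np.array_equal(new, medoids):
            break
        medoids = new
    return labels

def stage1(coords, k, n_restarts, rng):
    """Stage 1: restart the clustering and keep the run with minimum target."""
    sd = distance_std(coords)
    runs = [cluster_once(sd, k, rng) for _ in range(n_restarts)]
    return min(runs, key=lambda labels: target(labels, sd))

def stage2(label_sets, threshold):
    """Stage 2 (assumed rule): link atoms that co-cluster in at least
    `threshold` of the sub-segments, then take connected components."""
    co = np.mean([l[:, None] == l[None, :] for l in label_sets], axis=0)
    adj = co >= threshold
    domain = -np.ones(adj.shape[0], dtype=int)
    current = 0
    for start in range(adj.shape[0]):
        if domain[start] >= 0:
            continue
        domain[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(adj[i] & (domain < 0)):
                domain[j] = current
                stack.append(j)
        current += 1
    return domain

def mcc(coords, n_segments, k, n_restarts=10, threshold=0.8, seed=0):
    """Split the trajectory, cluster each sub-segment, consolidate."""
    rng = np.random.default_rng(seed)
    segments = np.array_split(coords, n_segments, axis=0)
    label_sets = [stage1(seg, k, n_restarts, rng) for seg in segments]
    return stage2(label_sets, threshold)
```

Calling `mcc(coords, n_segments=4, k=2)` on an array of Cα coordinates with shape (frames, atoms, 3) returns one integer domain label per atom. Atoms whose cluster membership changes between sub-segments end up in small or singleton domains, which is one simple way to express the idea that flexible regions do not contribute to clusters of significant size.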
