Department of Machine Learning, Moffitt Cancer Center, Tampa, Florida.
Department of Computational and Quantitative Medicine, City of Hope, Duarte, California.
Biophys J. 2024 Sep 3;123(17):2910-2920. doi: 10.1016/j.bpj.2024.05.017. Epub 2024 May 18.
Cyclin-dependent kinase 12 (CDK12) is a critical regulatory protein involved in transcription and DNA repair. Dysregulation of CDK12 has been implicated in various diseases, including cancer. Understanding the CDK12 interactome is pivotal for elucidating its functional roles and identifying potential therapeutic targets. Traditional methods for interactome prediction often rely on protein structure information, which limits their applicability to CDK12, whose C-terminal region is partly disordered. In this study, we present a structure-independent machine-learning model that uses protein sequence and functional data to predict the CDK12 interactome. This approach is motivated by the disordered character of the CDK12 C-terminal region, which hinders a structure-driven search for binding partners. Our approach incorporates multiple data sources, including protein-protein interaction networks, functional annotations, and sequence-based features, to construct a comprehensive CDK12 interactome prediction model. The ability to predict CDK12 interactions without relying on structural information is a significant advancement, as many potential interaction partners may lack crystallographic data. In conclusion, our structure-independent machine-learning model is a powerful tool for predicting the CDK12 interactome and holds promise for advancing our understanding of CDK12 biology, identifying potential therapeutic targets, and facilitating precision-medicine approaches for CDK12-associated diseases.