Medvedev Kirill E, Schaeffer R Dustin, Grishin Nick V
Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA.
Present address: Department of Computer Science, University of Central Florida, Orlando, FL, 32816, USA.
bioRxiv. 2025 Jul 7:2025.07.03.663025. doi: 10.1101/2025.07.03.663025.
Proteins carry out essential cellular functions, such as signaling, metabolism, and transport, through specific interactions with small molecules and drugs within their three-dimensional structural domains. Protein domains are conserved folding units that, when combined, drive evolutionary progress. The Evolutionary Classification Of protein Domains (ECOD) places domains into a hierarchy explicitly built around distant evolutionary relationships, enabling the detection of remote homologs across proteomes. Yet no single resource has systematically mapped domain-ligand interactions at the structural level. To fill this gap, we introduce DrugDomain v2.0, the first comprehensive database linking evolutionary domain classifications (ECOD) to ligand binding events across the entire Protein Data Bank (PDB). We also leverage AI-driven predictions from AlphaFold to extend domain-ligand annotations to human drug targets lacking experimental structures. DrugDomain v2.0 catalogs interactions with over 37,000 PDB ligands and 7,560 DrugBank molecules, integrates 6,000+ small-molecule-associated post-translational modifications (PTMs), and provides context for 14,000+ PTM-modified human protein models featuring docked ligands. The database encompasses 43,023 unique UniProt accessions and 174,545 PDB structures. The DrugDomain data are available online at https://drugdomain.cs.ucf.edu/ and https://github.com/kirmedvedev/DrugDomain.