Data mining of PubChem bioassay records reveals diverse OXPHOS inhibitory chemotypes as potential therapeutic agents against ovarian cancer.

Authors

Sharma Sejal, Feng Liping, Boonpattrawong Nicha, Kapur Arvinder, Barroilhet Lisa, Patankar Manish S, Ericksen Spencer S

Affiliations

University of Wisconsin-Madison, Department of Obstetrics and Gynecology, Madison, WI, 53705, USA.

Department of Obstetrics and Gynecology, Qilu Hospital of Shandong University, Jinan, Shandong, 250012, People's Republic of China.

Publication information

J Cheminform. 2024 Oct 7;16(1):112. doi: 10.1186/s13321-024-00906-0.

Abstract

Focused screening on target-prioritized compound sets can be an efficient alternative to high-throughput screening (HTS). For most biomolecular targets, compound prioritization models depend on prior screening data or a target structure. For phenotypic or multi-protein pathway targets, it may not be clear which public assay records provide relevant data. The question also arises as to whether data collected from disparate assays might be usefully consolidated. Here, we report on the development and application of a data mining pipeline to examine these issues. To illustrate, we focus on identifying inhibitors of oxidative phosphorylation (OXPHOS), a druggable metabolic process in epithelial ovarian tumors. The pipeline compiled 8415 available OXPHOS-related bioassays in the PubChem data repository involving 312,093 unique compound records. Application of PubChem assay activity annotations, PAINS (Pan Assay Interference Compounds), and Lipinski-like bioavailability filters yields 1852 putative OXPHOS-active compounds that fall into 464 clusters. These chemotypes are diverse, with relatively high hydrophobicity and molecular weight but lower complexity and drug-likeness. They show a high abundance of bicyclic ring systems and oxygen-containing functional groups including ketones, allylic oxides (alpha/beta unsaturated carbonyls), hydroxyl groups, and ethers. In contrast, amide and primary amine functional groups have a notably lower than random prevalence. UMAP representation of the chemical space shows strong divergence in the regions occupied by OXPHOS-inactive and -active compounds. Of the six compounds selected for biological testing, four showed statistically significant inhibition of electron transport in bioenergetics assays. Two of these four compounds, lacidipine and esbiothrin, increased intracellular oxygen radicals (a major hallmark of most OXPHOS inhibitors) and decreased the viability of two ovarian cancer cell lines, ID8 and OVCAR5. Finally, data from the pipeline were used to train random forest and support vector classifiers that effectively prioritized OXPHOS inhibitory compounds within a held-out test set (ROC AUC 0.962 and 0.927, respectively) and on another set containing 44 documented OXPHOS inhibitors outside of the training set (ROC AUC 0.900 and 0.823). This prototype pipeline is extensible and could be adapted for focused screening on other phenotypic targets for which sufficient public data are available.

Scientific contribution

Here, we describe and apply an assay data mining pipeline to compile, process, filter, and mine public bioassay data. We believe the procedure may be more broadly applied to guide compound selection in early-stage hit finding on novel multi-protein mechanistic or phenotypic targets. To demonstrate the utility of our approach, we apply a data mining strategy on a large set of public assay data to find drug-like molecules that inhibit oxidative phosphorylation (OXPHOS) as candidates for ovarian cancer therapies.
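
The ROC AUC values reported for the classifiers summarize how well a model ranks active compounds above inactive ones: a value of 1.0 means every active outscores every inactive, and 0.5 is no better than random ordering. The following is a minimal sketch of that metric, not the authors' code. The function name `roc_auc`, the labels, and the scores are all hypothetical and chosen only for illustration; the abstract does not state how the authors computed the metric.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (active, inactive) pairs in which the active compound
    receives the higher score; tied pairs count as half a win."""
    active_scores = [s for y, s in zip(labels, scores) if y == 1]
    inactive_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not active_scores or not inactive_scores:
        raise ValueError("need at least one active and one inactive compound")

    wins = 0.0
    for a in active_scores:
        for i in inactive_scores:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(active_scores) * len(inactive_scores))


# Hypothetical toy data (not from the paper):
# 1 = annotated OXPHOS-active, 0 = inactive; scores are made-up model outputs.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"ROC AUC on toy data: {auc:.3f}")
```

The pairwise loop is quadratic in the number of compounds, which is acceptable for a toy example; for a set of several hundred thousand records, a rank-based implementation such as the one in a standard machine learning library would be the more practical choice.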

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba34/11460086/deb79b59ba70/13321_2024_906_Fig1_HTML.jpg
