Drug discovery using very large numbers of patents: general strategy with extensive use of match and edit operations.

Affiliations

St Matthews University School of Medicine, Grand Cayman, Cayman Islands, The University of Wisconsin-Stout, Menomonie, USA.

Publication information

J Comput Aided Mol Des. 2011 May;25(5):427-41. doi: 10.1007/s10822-011-9429-x. Epub 2011 May 3.

Abstract

A patent database of 6.7 million compounds, generated by a very high performance computer (Blue Gene), requires new techniques for exploitation when extensive use of chemical similarity is involved. Such exploitation includes the taxonomic classification of chemical themes, and data mining to assess mutual information between themes and companies. Importantly, we also launch candidates that evolve by "natural selection", where selection rests on failure to partially match the patent database and on the ability to bind appropriately to the protein target, as simulated on Blue Gene. An unusual feature of our method is that algorithms and workflows rely on dynamic interaction between match-and-edit instructions, which in practice are regular expressions. Similarity testing with these instructions uses SMILES strings and, less frequently, graph or connectivity representations. Examining how this performs in high throughput, we note that chemical similarity and novelty are human concepts whose meaning largely depends on their utility in specific contexts. For some purposes, mutual information involving chemical themes might be a better concept.
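
As a rough illustration of what a match-and-edit instruction over SMILES strings could look like, the sketch below applies regular-expression rules to a candidate and keeps only the edited strings that are absent from a toy patent set. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `PATENTED`, `RULES`, `propose`, and `survivors` are hypothetical, the two rules are invented for illustration, and exact string comparison stands in for the partial matching described in the abstract. The binding simulation is not modeled, and plain string matching ignores the fact that one molecule can have many valid SMILES spellings.

```python
import re

# Toy stand-in for the patent database (assumption: plain SMILES strings).
PATENTED = {
    "CC(=O)Oc1ccccc1C(=O)O",   # aspirin
    "CC(=O)Oc1ccccc1C(=O)OC",  # aspirin methyl ester
}

# Match-and-edit rules as (pattern, replacement) pairs; both are illustrative.
RULES = [
    (re.compile(r"C\(=O\)O$"), "C(=O)OC"),  # terminal acid -> methyl ester
    (re.compile(r"C\(=O\)O$"), "C(=O)N"),   # terminal acid -> primary amide
]


def propose(smiles):
    """Apply every rule whose pattern matches and return the edited strings."""
    return [pat.sub(rep, smiles) for pat, rep in RULES if pat.search(smiles)]


def survivors(smiles, patented):
    """Keep only edited candidates that are not already in the patent set."""
    return [c for c in propose(smiles) if c not in patented]


if __name__ == "__main__":
    # Start from aspirin and list the edits that survive the patent filter.
    print(survivors("CC(=O)Oc1ccccc1C(=O)O", PATENTED))
```

Keeping each rule as a compiled pattern paired with a replacement string means the same object serves for both the match test and the edit, which is one simple way to read the "match-and-edit" pairing described above.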

