Recursive median partitioning for virtual screening of large databases.

Author information

Godden Jeffrey W, Furr John R, Bajorath Jürgen

Affiliation

Department of Computer-Aided Drug Discovery, Albany Molecular Research, Inc. (AMRI), 21 Corporate Circle, Albany, New York 12212-5098, USA.

Publication information

J Chem Inf Comput Sci. 2003 Jan-Feb;43(1):182-8. doi: 10.1021/ci0203848.

Abstract

We recently introduced the median partitioning (MP) method for diversity selection and compound classification. MP uses property descriptors with continuous value ranges, transforms them into a binary classification scheme by determining their medians in source databases, and in subsequent steps divides database molecules into populations above or below these medians. Having previously demonstrated the usefulness of MP for classifying molecules according to biological activity, we have now extended the methodology to virtual screening. In these calculations, a series of bait molecules with the desired activity is added to a large compound database, and successive recursions reduce the number of candidate molecules until a small number of compounds remain in partitions enriched with bait molecules. For each recursion step, descriptor combinations are identified that copartition as many active molecules as possible; descriptor selection is facilitated by a genetic algorithm (GA). The recursive MP approach (RMP) was applied to five diverse biological activity classes in virtual screening of a database of approximately 1.34 million molecules to which different types of active compounds were added. RMP analysis produced hit rates of up to 21%, depending on the biological activity class, and gave an average improvement of approximately 3600-fold over random selection for the activity classes used as test cases.
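
The partitioning and recursion steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy descriptor values, and the molecule labels are hypothetical. The sketch also makes three assumptions that the abstract does not state: medians are recomputed on the surviving molecules at each recursion, only the single partition holding the most bait molecules is kept, and the descriptor set for each step is supplied by hand rather than selected by a genetic algorithm.

```python
from collections import defaultdict
from statistics import median


def median_partition(molecules, descriptor_idx):
    """Group molecules by a binary code with one bit per selected descriptor.

    A bit is 1 if the molecule's descriptor value lies above the median of
    that descriptor over `molecules`, and 0 otherwise.
    """
    medians = [median(v[i] for v in molecules.values()) for i in descriptor_idx]
    partitions = defaultdict(list)
    for name, values in molecules.items():
        code = tuple(int(values[i] > m) for i, m in zip(descriptor_idx, medians))
        partitions[code].append(name)
    return dict(partitions)


def recursive_mp(molecules, baits, descriptor_sets):
    """Shrink the candidate pool one recursion step per descriptor set.

    Assumption: after each step only the partition containing the most bait
    molecules survives, and medians are recomputed on the survivors.
    """
    survivors = dict(molecules)
    for descriptor_idx in descriptor_sets:
        partitions = median_partition(survivors, descriptor_idx)
        best = max(
            partitions.values(),
            key=lambda members: sum(name in baits for name in members),
        )
        survivors = {name: survivors[name] for name in best}
    return survivors


# Toy data (hypothetical): name -> (descriptor 0, descriptor 1)
molecules = {
    "bait1": (8.0, 1.0), "bait2": (9.0, 2.0), "hit1": (8.5, 1.5),
    "db1": (7.0, 9.0), "db2": (7.5, 8.0), "db3": (9.5, 7.0),
    "db4": (1.0, 1.0), "db5": (2.0, 2.0), "db6": (3.0, 9.0),
    "db7": (0.5, 5.0), "db8": (1.5, 6.0), "db9": (2.5, 3.0),
}
baits = {"bait1", "bait2"}

# Hand-picked descriptor sets stand in for the GA-selected combinations.
survivors = recursive_mp(molecules, baits, descriptor_sets=[[0], [1]])
candidates = [name for name in survivors if name not in baits]
print(candidates)  # ['hit1']
```

In this toy run the first recursion halves the twelve molecules on descriptor 0, and the second halves the six survivors on descriptor 1. That leaves the two baits and one database compound that shares their partition. The database compound is the screening candidate.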
