
[A retrieval method of drug molecules based on graph collapsing].

Authors

Qu J W, Lv X Q, Liu Z M, Liao Y, Sun P H, Wang B, Tang Z

Affiliations

Institute of Computer Science & Technology, Peking University, Beijing 100080, China.

Institute of Computer Science & Technology, Peking University, Beijing 100080, China; State Key Laboratory of Digital Publishing Technology, Beijing 100080, China.

Publication information

Beijing Da Xue Xue Bao Yi Xue Ban. 2018 Apr 18;50(2):368-374.

PMID: 29643542

Abstract

OBJECTIVE

To establish a compact and efficient hypergraph representation of molecules and a graph-similarity-based method for retrieving them, so that medicine information can be retrieved effectively and efficiently.

METHODS

The chemical structural formula (CSF) is a primary search target in medicine information retrieval because it identifies each compound uniquely and precisely at the molecular level. To retrieve medicine information effectively and efficiently, a complete workflow for a graph-based CSF retrieval system was introduced. The system accepted photos taken with smartphones and sketches drawn on tablet personal computers as CSF inputs, and formalized each CSF as a corresponding graph. After analyzing the factors that directly affect the efficiency of graph matching, the paper proposed a compact and efficient hypergraph representation for molecules. In line with the characteristics of CSFs, a hierarchical collapsing method combining graph isomorphism and frequent subgraph mining was adopted. One fundamental challenge remained: subgraphs can overlap during the collapsing procedure, which prevented the method from building the correct compact hypergraph of an original CSF graph. A graph-isomorphism-based algorithm was therefore proposed to select dominant acyclic subgraphs on the basis of overlapping analysis. Finally, the spatial similarity among graphical CSFs was evaluated with multi-dimensional similarity measures.
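The collapsing step described above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify how subgraphs are mined or how overlaps are resolved, so this example simply takes a node group as given and merges it into one hypernode. The function name `collapse`, the adjacency-dict graph encoding, and the labels `"R"` and `"six-ring"` are all hypothetical choices made for illustration.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]
Labels = Dict[Hashable, str]


def collapse(adj: Graph, labels: Labels, group: Iterable[Hashable],
             new_id: Hashable, new_label: str) -> Tuple[Graph, Labels]:
    """Return a copy of the graph with `group` merged into one node `new_id`.

    Assumes `new_id` is not already used by a node outside the group.
    """
    members = set(group)
    new_adj: Graph = {}
    for node, nbrs in adj.items():
        if node in members:
            continue
        # Edges that pointed into the group now point at the hypernode.
        new_adj[node] = {new_id if n in members else n for n in nbrs}
    # External neighbours of any group member become hypernode neighbours.
    new_adj[new_id] = {n for g in members for n in adj[g] if n not in members}
    new_labels = {n: lab for n, lab in labels.items() if n not in members}
    new_labels[new_id] = new_label
    return new_adj, new_labels


# Toy input: toluene as a graph, a six-carbon ring (0..5) plus a methyl carbon (6).
ring = [0, 1, 2, 3, 4, 5]
adj: Graph = {i: set() for i in range(7)}
for i in ring:
    j = (i + 1) % 6
    adj[i].add(j)
    adj[j].add(i)
adj[0].add(6)
adj[6].add(0)
labels: Labels = {i: "C" for i in range(7)}

# Collapse the ring into a single hypernode named "R".
small_adj, small_labels = collapse(adj, labels, ring, "R", "six-ring")
print(small_adj)     # two nodes joined by one edge
print(small_labels)
```

Collapsing shrinks the seven-node graph to two nodes, which is the kind of size reduction that makes later graph matching cheaper. Choosing which groups to collapse when candidates overlap is the part the paper addresses and is deliberately left out here.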

RESULTS

To evaluate its retrieval accuracy, the proposed system was first compared with the Wikipedia Chemical Structure Explorer (WCSE), the state-of-the-art system for CSF similarity searching within the Wikipedia molecules dataset. The system achieved higher mean average precision, discounted cumulative gain, rank-biased precision, and expected reciprocal rank than WCSE at every cutoff from the top-2 to the top-10 retrieved results. Specifically, for the top-10 retrieved results, the system scored 10%, 1.41, 6.42%, and 1.32% higher than WCSE on these four metrics, respectively. Several retrieval cases were also presented for an intuitive comparison with WCSE. This comparative study demonstrated that the proposed method outperformed the existing method in accuracy and effectiveness.
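For readers unfamiliar with the four ranking metrics named above, the following Python sketch gives their common textbook definitions for a single ranked result list. The abstract does not state which variants the authors used, so the gain function, the RBP persistence `p`, the ERR grade mapping, and the AP normalization (by the number of relevant items found in the list) are assumptions. Mean average precision is the mean of `average_precision` over all queries.

```python
import math
from typing import Sequence


def average_precision(rels: Sequence[int]) -> float:
    """AP over a ranked list of binary relevance flags (1 = relevant)."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


def dcg(gains: Sequence[float]) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def rbp(rels: Sequence[float], p: float = 0.8) -> float:
    """Rank-biased precision; p is the assumed user persistence."""
    return (1 - p) * sum(r * p ** (rank - 1) for rank, r in enumerate(rels, start=1))


def err(grades: Sequence[int], max_grade: int) -> float:
    """Expected reciprocal rank for graded relevance in 0..max_grade."""
    not_stopped, score = 1.0, 0.0
    for rank, g in enumerate(grades, start=1):
        stop = (2 ** g - 1) / 2 ** max_grade  # chance the user is satisfied here
        score += not_stopped * stop / rank
        not_stopped *= 1 - stop
    return score


# Toy ranked list: relevant, not relevant, relevant.
ranked = [1, 0, 1]
print(average_precision(ranked))  # (1/1 + 2/3) / 2
print(dcg(ranked))                # 1 + 0 + 1/2
print(rbp(ranked, p=0.5))         # 0.5 * (1 + 0.25)
print(err(ranked, max_grade=1))   # 0.5 + 0.25 / 3
```

All four functions reward relevant results near the top of the list, but they discount lower ranks differently, which is why they are often reported together.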

CONCLUSION

This paper proposes a graph-similarity-based retrieval approach for medicine information. To obtain satisfactory retrieval results, it introduces an isomorphism-based algorithm that selects dominant subgraphs on the basis of subgraph overlapping analysis, together with an effective and efficient hypergraph representation of molecules. Experimental results demonstrate the effectiveness of the proposed approach.

