InChI in the wild: an assessment of InChIKey searching in Google.

Affiliation

TW2Informatics, Göteborg 42166, Sweden.

Publication information

J Cheminform. 2013 Feb 11;5(1):10. doi: 10.1186/1758-2946-5-10.

DOI:10.1186/1758-2946-5-10
PMID:23399051
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3598674/
Abstract

While chemical databases can be queried using the InChI string and InChIKey (IK), the latter was designed for open-web searching. It is becoming increasingly effective for this purpose as more sources enhance crawling of their websites by the Googlebot, with consequent IK indexing. Searchers who use Google as an adjunct to database access may be less familiar with the advantages of using the IK, as explored in this review. As an example, the IK for atorvastatin retrieves ~200 low-redundancy links from a Google search in 0.3 seconds. These cover most major databases, with a very low false-positive rate. Results encompass less familiar but potentially useful sources and can be extended to isomer capture by using just the skeleton layer of the IK. Google Advanced Search can be used to filter large result sets. Image searching with the IK is also effective and complementary to open-web queries. Results can be particularly useful for less-common structures, as exemplified by a major metabolite of atorvastatin giving only three hits. Testing also demonstrated document-to-document and document-to-database joins via structure matching. The necessary generation of an IK from chemical names can be accomplished using open tools and resources for patents, papers, abstracts or other text sources. Active global sharing of local IK-linked information can be accomplished via surfacing in open laboratory notebooks, blogs, Twitter, figshare and other routes. While information-rich chemistry (e.g. approved drugs) can exhibit swamping and redundancy effects, the much smaller IK result sets for link-poor structures become a transformative first-pass option. The IK indexing has therefore turned Google into a de-facto open global chemical information hub by merging links to most significant sources, including over 50 million PubChem and ChemSpider records. The simplicity, specificity and speed of matching make it a useful option for biologists or others less familiar with chemical searching. However, compared to rigorously maintained major databases, users need to be circumspect about the consistency of Google results and the provenance of retrieved links. In addition, community engagement may be necessary to ameliorate possible future degradation of utility.
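
The exact-match search, the skeleton-layer search for isomer capture, and the document-to-document join described in the abstract can be illustrated with a minimal Python sketch. It is not the author's tooling. The function names are hypothetical, the two document snippets are invented for illustration, and the atorvastatin InChIKey is the one commonly listed in public databases and should be verified before use. The sketch assumes the standard 27-character InChIKey layout (a 14-letter skeleton block, a 10-letter block ending in a flag and a version character, and a one-letter protonation block), the usual Google query-URL pattern, and the `site:` operator as one way to narrow large result sets.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# Layout assumed here: 14 letters, hyphen, 8 letters + flag + version, hyphen, 1 letter.
IK_PARTS = re.compile(r"^([A-Z]{14})-([A-Z]{8})([A-Z])([A-Z])-([A-Z])$")
IK_IN_TEXT = re.compile(r"\b[A-Z]{14}-[A-Z]{10}-[A-Z]\b")


def split_inchikey(ik: str) -> dict:
    """Split an InChIKey into its blocks; raise ValueError if it is malformed."""
    match = IK_PARTS.match(ik.strip())
    if match is None:
        raise ValueError(f"not a well-formed InChIKey: {ik!r}")
    skeleton, other_layers, flag, version, protonation = match.groups()
    return {
        "skeleton": skeleton,          # connectivity hash, shared by isomers
        "other_layers": other_layers,  # hash of stereo and other layers
        "flag": flag,                  # 'S' for a standard InChIKey
        "version": version,
        "protonation": protonation,
    }


def google_url(term: str, site: Optional[str] = None) -> str:
    """Build a Google search URL for a quoted term, optionally limited to one site."""
    query = f'"{term}"'
    if site:
        query += f" site:{site}"
    return "https://www.google.com/search?" + urlencode({"q": query})


def shared_inchikeys(text_a: str, text_b: str) -> set:
    """Join two documents on structure: InChIKeys that appear in both texts."""
    return set(IK_IN_TEXT.findall(text_a)) & set(IK_IN_TEXT.findall(text_b))


# Atorvastatin InChIKey as commonly listed in public databases (verify before use).
ATORVASTATIN_IK = "XUKUURHRXDUEBC-KAYWLYCHSA-N"

parts = split_inchikey(ATORVASTATIN_IK)
exact_url = google_url(ATORVASTATIN_IK)      # exact structure match
isomer_url = google_url(parts["skeleton"])   # skeleton only, for isomer capture

# Two invented snippets standing in for a blog post and a notebook page.
doc_a = "Statin notes (XUKUURHRXDUEBC-KAYWLYCHSA-N); aspirin is BSYNRYMUTXBXSQ-UHFFFAOYSA-N."
doc_b = "Assay record for compound XUKUURHRXDUEBC-KAYWLYCHSA-N, batch 2."
common = shared_inchikeys(doc_a, doc_b)

print(exact_url)
print(isomer_url)
print(common)
```

Quoting the term asks the search engine for the literal string rather than hyphen-separated tokens. Searching only the first block trades specificity for recall, since every structure with the same connectivity shares it.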

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19d3/3598674/3c5fc7b8c0da/1758-2946-5-10-7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19d3/3598674/a0b67f71ba2a/1758-2946-5-10-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19d3/3598674/037e0543cdb6/1758-2946-5-10-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19d3/3598674/fc9174f9ad2d/1758-2946-5-10-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19d3/3598674/3cb62c0491d9/1758-2946-5-10-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19d3/3598674/2504d8215244/1758-2946-5-10-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19d3/3598674/018f7b284711/1758-2946-5-10-6.jpg
