Semantically linking and browsing PubMed abstracts with gene ontology.

Authors

Vanteru Bhanu C, Shaik Jahangheer S, Yeasin Mohammed

Affiliation

Electrical and Computer Engineering Department, University of Memphis, Memphis, Tennessee, USA.

Publication

BMC Genomics. 2008;9 Suppl 1(Suppl 1):S10. doi: 10.1186/1471-2164-9-S1-S10.

DOI: 10.1186/1471-2164-9-S1-S10
PMID: 18366599
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2386052/
Abstract

BACKGROUND

Technological advances over the past decade have led to massive progress in biotechnology, and that progress is documented in research articles. PubMed is currently the most widely used repository for biological literature. As of 2007 it held about 17 million abstracts, which calls for methods that can efficiently retrieve and browse large volumes of relevant information. State-of-the-art tools such as GOPubmed use simple keyword-based techniques to retrieve abstracts from PubMed and link them to the Gene Ontology (GO). This paper changes the paradigm by introducing a semantics-enabled technique for linking PubMed to the Gene Ontology, called SEGOPubmed, for ontology-based browsing. A Latent Semantic Analysis (LSA) framework is used to semantically interface PubMed abstracts with the Gene Ontology.

RESULTS

An empirical analysis compares the performance of SEGOPubmed with that of GOPubmed. The analysis is first performed using a few well-referenced query words. A statistical analysis is then performed using a GO-curated dataset as ground truth. The results suggest that SEGOPubmed performs better than the classic GOPubmed because it incorporates semantics.

CONCLUSIONS

The LSA technique is applied to the PubMed abstracts retrieved for the user query and is used to measure the semantic similarity between the query and the abstracts. The analyses using well-referenced keywords show that the proposed semantics-sensitive technique outperformed string-comparison-based techniques in associating relevant abstracts with GO terms. SEGOPubmed also extracted abstracts in which the keywords do not appear in isolation (i.e., they appear in combination with other terms) and which could not be retrieved by simple term-matching techniques.
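
The contrast between keyword matching and LSA can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify term weighting, the SVD rank, or how GO terms are represented, so the raw term counts, the rank `k = 2`, the four toy "abstracts", and the helper names (`term_document_matrix`, `to_term_vector`, `keyword_match`, `lsa_similarity`) are all assumptions made for illustration.

```python
import numpy as np

def tokenize(text):
    # Naive whitespace tokenizer; a real system would also stem and drop stop words.
    return text.lower().split()

def term_document_matrix(docs):
    # Rows are vocabulary terms, columns are documents, entries are raw counts.
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    index = {tok: i for i, tok in enumerate(vocab)}
    X = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for tok in tokenize(doc):
            X[index[tok], j] += 1.0
    return X, index

def to_term_vector(text, index):
    # Count vector over the corpus vocabulary; unknown words are ignored.
    v = np.zeros(len(index))
    for tok in tokenize(text):
        if tok in index:
            v[index[tok]] += 1.0
    return v

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def keyword_match(abstract, label):
    # Baseline: True only if the two texts share at least one literal token.
    return bool(set(tokenize(abstract)) & set(tokenize(label)))

# Toy corpus invented for this sketch (not real PubMed abstracts).
abstracts = [
    "apoptosis caspase",
    "caspase death",
    "kinase phosphorylation",
    "phosphorylation signaling",
]

X, index = term_document_matrix(abstracts)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2                # assumed latent dimensionality
U_k = U[:, :k]       # top-k left singular vectors (term space -> latent space)

def lsa_similarity(text_a, text_b):
    # Project both texts into the k-dimensional latent space, then compare.
    a = U_k.T @ to_term_vector(text_a, index)
    b = U_k.T @ to_term_vector(text_b, index)
    return cosine(a, b)

label = "cell death"  # GO-style term name used as the query text
for abstract in abstracts:
    print(abstract, keyword_match(abstract, label),
          round(lsa_similarity(abstract, label), 3))
```

In this toy corpus, "apoptosis caspase" shares no token with "cell death", so the keyword baseline rejects it. The latent space still places the two close together, because "caspase" co-occurs with "death" in another document. The two kinase-related documents stay essentially orthogonal to the label.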

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d528/2386052/5bd65bfd4bc2/1471-2164-9-S1-S10-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d528/2386052/864badd6066c/1471-2164-9-S1-S10-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d528/2386052/ffd0ef1d667c/1471-2164-9-S1-S10-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d528/2386052/bef940fb65a3/1471-2164-9-S1-S10-4.jpg
