Using Ontology Fingerprints to disambiguate gene name entities in the biomedical literature.

Authors

Chen Guocai, Zhao Jieyi, Cohen Trevor, Tao Cui, Sun Jingchun, Xu Hua, Bernstam Elmer V, Lawson Andrew, Zeng Jia, Johnson Amber M, Holla Vijaykumar, Bailey Ann M, Lara-Guerra Humberto, Litzenburger Beate, Meric-Bernstam Funda, Jim Zheng W

Affiliations

Center for Computational Biomedicine, School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, USA, Department of Public Health Science, Medical University of South Carolina, 135 Cannon Street, Suite 303, Charleston, SC 29425, USA and Department of Investigational Cancer Therapeutics, Institute for Personalized Cancer Therapy, UT-MD Anderson Cancer Center, 1400 Holcombe Blvd., FC8.3044, Houston, TX 77030, USA.

Publication information

Database (Oxford). 2015 Apr 8;2015:bav034. doi: 10.1093/database/bav034. Print 2015.

DOI:10.1093/database/bav034
PMID:25858285
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4390608/
Abstract

Ambiguous gene names in the biomedical literature are a barrier to accurate information extraction. To overcome this hurdle, we generated Ontology Fingerprints for selected genes that are relevant for personalized cancer therapy. These Ontology Fingerprints were used to evaluate the association between genes and biomedical literature to disambiguate gene names. We obtained 93.6% precision for the test gene set and 80.4% for the area under a receiver-operating characteristics curve for gene and article association. The core algorithm was implemented using a graphics processing unit-based MapReduce framework to handle big data and to improve performance. We conclude that Ontology Fingerprints can help disambiguate gene names mentioned in text and analyse the association between genes and articles. Database URL: http://www.ontologyfingerprint.org
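
The abstract does not spell out how a fingerprint is scored against an article, so the following is only a toy sketch of the general idea, under stated assumptions: each gene's fingerprint is taken to be a mapping from ontology terms to weights, an article is reduced to the set of ontology terms it mentions, and the association score is the sum of the weights of the shared terms. The gene names, terms, weights, and function names are all hypothetical, and the map/reduce split runs in plain Python on the CPU rather than on the GPU-based framework the authors describe. A small rank-based helper shows how an area under the ROC curve can be computed for gene-article scores.

```python
from collections import defaultdict

# Hypothetical fingerprints (assumption): gene -> {ontology term: weight}.
FINGERPRINTS = {
    "gene_A": {"cell cycle": 0.9, "DNA repair": 0.7},
    "gene_B": {"ion transport": 0.8, "membrane": 0.6},
}


def map_phase(article_terms, fingerprints):
    """Emit a (gene, weight) pair for every term shared by article and fingerprint."""
    for gene, fingerprint in fingerprints.items():
        for term in article_terms:
            if term in fingerprint:
                yield gene, fingerprint[term]


def reduce_phase(pairs):
    """Sum the emitted weights per gene to get an association score."""
    scores = defaultdict(float)
    for gene, weight in pairs:
        scores[gene] += weight
    return dict(scores)


def disambiguate(article_terms, candidates, fingerprints):
    """Return the candidate gene with the highest score, or None if nothing matches."""
    subset = {g: fingerprints[g] for g in candidates if g in fingerprints}
    scores = reduce_phase(map_phase(article_terms, subset))
    if not scores:
        return None
    return max(scores, key=scores.get)


def roc_auc(scores, labels):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    article = {"cell cycle", "membrane"}
    print(disambiguate(article, ["gene_A", "gene_B"], FINGERPRINTS))
    print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

In this sketch an ambiguous symbol is resolved by scoring only its candidate genes against the article; an article that shares no terms with any candidate is left unresolved rather than forced to a guess.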
