Using Ontology Fingerprints to disambiguate gene name entities in the biomedical literature.

Author information

Chen Guocai, Zhao Jieyi, Cohen Trevor, Tao Cui, Sun Jingchun, Xu Hua, Bernstam Elmer V, Lawson Andrew, Zeng Jia, Johnson Amber M, Holla Vijaykumar, Bailey Ann M, Lara-Guerra Humberto, Litzenburger Beate, Meric-Bernstam Funda, Jim Zheng W

Affiliations

Center for Computational Biomedicine, School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, USA, Department of Public Health Science, Medical University of South Carolina, 135 Cannon Street, Suite 303, Charleston, SC 29425, USA and Department of Investigational Cancer Therapeutics, Institute for Personalized Cancer Therapy, UT-MD Anderson Cancer Center, 1400 Holcombe Blvd., FC8.3044, Houston, TX 77030, USA.

Publication information

Database (Oxford). 2015 Apr 8;2015:bav034. doi: 10.1093/database/bav034. Print 2015.

Abstract

Ambiguous gene names in the biomedical literature are a barrier to accurate information extraction. To overcome this hurdle, we generated Ontology Fingerprints for selected genes that are relevant for personalized cancer therapy. These Ontology Fingerprints were used to evaluate the association between genes and biomedical literature to disambiguate gene names. We obtained 93.6% precision for the test gene set and 80.4% for the area under a receiver-operating characteristics curve for gene and article association. The core algorithm was implemented using a graphics processing unit-based MapReduce framework to handle big data and to improve performance. We conclude that Ontology Fingerprints can help disambiguate gene names mentioned in text and analyse the association between genes and articles. Database URL: http://www.ontologyfingerprint.org
