BRONCO: Biomedical entity Relation ONcology COrpus for extracting gene-variant-disease-drug relations.

Author information

Lee Kyubum, Lee Sunwon, Park Sungjoon, Kim Sunkyu, Kim Suhkyung, Choi Kwanghun, Tan Aik Choon, Kang Jaewoo

Affiliations

Department of Computer Science and Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul, 02841, Korea

Translational Bioinformatics and Cancer Systems Biology Laboratory, Division of Medical Oncology, Department of Medicine, University of Colorado Anschutz Medical Campus, 12801 East 17th Avenue Aurora, CO 80045, USA

Publication information

Database (Oxford). 2016 Apr 13;2016. doi: 10.1093/database/baw043. Print 2016.

Abstract

Comprehensive knowledge of genomic variants in a biological context is key for precision medicine. As next-generation sequencing technologies improve, the amount of literature containing genomic variant data, such as new functions or related phenotypes, rapidly increases. Because numerous articles are published every day, it is almost impossible to manually curate all the variant information from the literature. Many researchers focus on creating an improved automated biomedical natural language processing (BioNLP) method that extracts useful variants and their functional information from the literature. However, there is no gold-standard data set that contains texts annotated with variants and their related functions. To overcome these limitations, we introduce a Biomedical entity Relation ONcology COrpus (BRONCO) that contains more than 400 variants and their relations with genes, diseases, drugs and cell lines in the context of cancer and anti-tumor drug screening research. The variants and their relations were manually extracted from 108 full-text articles. BRONCO can be utilized to evaluate and train new methods used for extracting biomedical entity relations from full-text publications, and thus be a valuable resource to the biomedical text mining research community. Using BRONCO, we quantitatively and qualitatively evaluated the performance of three state-of-the-art BioNLP methods. We also identified their shortcomings, and suggested remedies for each method. We implemented post-processing modules for the three BioNLP methods, which improved their performance. Database URL: http://infos.korea.ac.kr/bronco
