Recognition of protein/gene names from text using an ensemble of classifiers.

Author information

Zhou GuoDong, Shen Dan, Zhang Jie, Su Jian, Tan SoonHeng

Affiliation

Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore.

Publication information

BMC Bioinformatics. 2005;6(Suppl 1):S7. doi: 10.1186/1471-2105-6-S1-S7. Epub 2005 May 24.

Abstract

This paper proposes an ensemble of classifiers for biomedical name recognition in which three classifiers, one Support Vector Machine and two discriminative Hidden Markov Models, are combined effectively using a simple majority voting strategy. In addition, we incorporate three post-processing modules, including an abbreviation resolution module, a protein/gene name refinement module and a simple dictionary matching module, into the system to further improve the performance. Evaluation shows that our system achieves the best performance from among 10 systems with a balanced F-measure of 82.58 on the closed evaluation of the BioCreative protein/gene name recognition task (Task 1A).
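The majority voting step described in the abstract can be illustrated with a minimal sketch. The abstract does not say at what granularity the votes are taken or how ties are broken, so everything below is an assumption for illustration only: the function name `majority_vote`, the token-level B/I/O tags, the `fallback` parameter, and the example sentence are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]], fallback: int = 0) -> List[str]:
    """Combine per-token tag sequences from several classifiers by simple majority.

    Assumption: when no tag wins a strict majority for a token, the tag from
    the classifier at index `fallback` is used. The paper may resolve ties
    differently.
    """
    if not predictions:
        raise ValueError("need at least one classifier output")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all classifiers must tag the same tokens")

    combined = []
    for token_tags in zip(*predictions):
        # Most frequent tag proposed for this token, with its vote count.
        tag, count = Counter(token_tags).most_common(1)[0]
        if count * 2 > len(token_tags):
            combined.append(tag)
        else:
            combined.append(token_tags[fallback])
    return combined


# Hypothetical outputs of one SVM and two HMM taggers on the same tokens.
tokens = ["The", "p53", "protein", "binds", "DNA"]
svm_tags = ["O", "B", "I", "O", "O"]
hmm1_tags = ["O", "B", "O", "O", "O"]
hmm2_tags = ["O", "B", "I", "O", "B"]

voted = majority_vote([svm_tags, hmm1_tags, hmm2_tags])
print(list(zip(tokens, voted)))
```

Voting token by token is the simplest choice, but it can produce tag sequences that no single classifier proposed, such as an I tag without a preceding B. A real system would likely vote on whole name spans or repair such sequences afterwards, which is one place where post-processing modules like those mentioned in the abstract could apply.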


