MeSH Now: automatic MeSH indexing at PubMed scale via learning to rank.

Authors

Mao Yuqing, Lu Zhiyong

Affiliations

Nanjing University of Chinese Medicine, 138 Xianlin Avenue, Nanjing, Jiangsu, 210023, China.

National Center for Biotechnology Information (NCBI), 8600 Rockville Pike, Bethesda, MD, 20894, USA.

Publication information

J Biomed Semantics. 2017 Apr 17;8(1):15. doi: 10.1186/s13326-017-0123-3.

DOI:10.1186/s13326-017-0123-3

PMID:28412964

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5392968/

Abstract

BACKGROUND

MeSH indexing is the task of assigning relevant MeSH terms to scholarly publications, based on a manual reading by human indexers. The task is highly important for improving literature retrieval and many other scientific investigations in biomedical research. Unfortunately, given its manual nature, the process of MeSH indexing is both time-consuming (new articles are not indexed until 2 or 3 months later) and costly (approximately ten dollars per article). In response, automatic indexing by computers has been proposed and attempted but remains challenging. To advance the state of the art in automatic MeSH indexing, a community-wide shared task called BioASQ was recently organized.

METHODS

We propose MeSH Now, an integrated approach that first uses multiple strategies to generate a combined list of candidate MeSH terms for a target article. Through a novel learning-to-rank framework, MeSH Now then ranks the list of candidate terms based on their relevance to the target article. Finally, MeSH Now selects the highest-ranked MeSH terms via a post-processing module.
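The three stages described above (candidate generation, ranking, selection) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration: the two candidate sources, the two features, the weights, and the fixed cutoff are hypothetical, and a plain linear scoring function stands in for the learning-to-rank model, which the abstract does not specify.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical types: a source proposes terms, a feature scores (text, term).
CandidateSource = Callable[[str], List[str]]
Feature = Callable[[str, str], float]


def generate_candidates(text: str, sources: Sequence[CandidateSource]) -> List[str]:
    """Merge candidates from all sources, dropping duplicates but keeping order."""
    seen = set()
    merged: List[str] = []
    for source in sources:
        for term in source(text):
            if term not in seen:
                seen.add(term)
                merged.append(term)
    return merged


def rank_candidates(text: str, candidates: Sequence[str],
                    features: Sequence[Feature],
                    weights: Sequence[float]) -> List[str]:
    """Sort candidates by a weighted sum of feature values (stand-in ranker)."""
    def score(term: str) -> float:
        return sum(w * f(text, term) for f, w in zip(features, weights))
    return sorted(candidates, key=score, reverse=True)


def select_top(ranked: Sequence[str], k: int) -> List[str]:
    """Post-processing stand-in: keep the k highest-ranked terms."""
    return list(ranked[:k])


# --- Toy data, invented for this example only ---
VOCAB = ["Humans", "Neoplasms", "Algorithms", "MEDLINE"]
NEIGHBOR_FREQ: Dict[str, float] = {
    "MEDLINE": 0.9,
    "Abstracting and Indexing": 0.6,
    "Humans": 0.2,
}


def dictionary_match(text: str) -> List[str]:
    """Propose vocabulary terms that literally occur in the text."""
    return [t for t in VOCAB if t.lower() in text.lower()]


def neighbor_terms(text: str) -> List[str]:
    """Pretend similar, already-indexed articles carried these terms."""
    return list(NEIGHBOR_FREQ)


def in_text(text: str, term: str) -> float:
    return 1.0 if term.lower() in text.lower() else 0.0


def neighbor_freq(text: str, term: str) -> float:
    return NEIGHBOR_FREQ.get(term, 0.0)


article = "Algorithms for indexing MEDLINE citations"
candidates = generate_candidates(article, [dictionary_match, neighbor_terms])
ranked = rank_candidates(article, candidates, [in_text, neighbor_freq], [1.0, 2.0])
selected = select_top(ranked, k=3)
print(selected)
```

In a trained system the weights would be learned from previously indexed articles rather than set by hand, and the cutoff would likely vary per article instead of being a fixed k.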

RESULTS

We assessed MeSH Now on two separate benchmarking datasets using traditional precision, recall and F-score metrics. In both evaluations, MeSH Now consistently achieved over 0.60 in F-score, ranging from 0.610 to 0.612. Furthermore, additional experiments show that MeSH Now can be optimized by parallel computing in order to process MEDLINE documents on a large scale.
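For readers unfamiliar with the metrics, the following sketch computes precision, recall and F-score for a single article by comparing predicted MeSH terms with the terms assigned by human indexers. The term lists are invented for the example, and the abstract does not state how scores are averaged across articles, so only the per-article case is shown.

```python
from typing import Iterable, Tuple


def precision_recall_f1(predicted: Iterable[str],
                        gold: Iterable[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall and F1 for one article's MeSH terms."""
    predicted_set, gold_set = set(predicted), set(gold)
    true_positives = len(predicted_set & gold_set)
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented example: 2 of 4 predicted terms are among the 3 gold terms.
predicted_terms = ["Humans", "Algorithms", "MEDLINE", "Neoplasms"]
gold_terms = ["Humans", "MEDLINE", "Abstracting and Indexing"]
p, r, f = precision_recall_f1(predicted_terms, gold_terms)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

The F-score is the harmonic mean of precision and recall, so it rewards a system only when both are reasonably high.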

CONCLUSIONS

We conclude that MeSH Now is a robust approach with state-of-the-art performance for automatic MeSH indexing and that MeSH Now is capable of processing PubMed scale documents within a reasonable time frame.

AVAILABILITY

http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/MeSHNow/
