Author disambiguation using multi-aspect similarity indicators.

Author information

Gurney Thomas, Horlings Edwin, van den Besselaar Peter

Publication information

Scientometrics. 2012 May;91(2):435-449. doi: 10.1007/s11192-011-0589-1. Epub 2011 Dec 30.

Abstract

Key to accurate bibliometric analyses is the ability to correctly link individuals to their corpus of work, with an optimal balance between precision and recall. We have developed an algorithm that does this disambiguation task with a very high recall and precision. The method addresses the issues of discarded records due to null data fields and their resultant effect on recall, precision and F-measure results. We have implemented a dynamic approach to similarity calculations based on all available data fields. We have also included differences in author contribution and age difference between publications, both of which have meaningful effects on overall similarity measurements, resulting in significantly higher recall and precision of returned records. The results are presented from a test dataset of heterogeneous catalysis publications. Results demonstrate significantly high average F-measure scores and substantial improvements on previous and stand-alone techniques.
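The abstract does not give the actual indicators, weights, or thresholds, so the following is only a minimal Python sketch of the general idea, not the authors' algorithm. It assumes Jaccard overlap as the per-field similarity, an unweighted mean over whichever fields are populated in both records (so a null field shrinks the comparison instead of discarding the record), and hypothetical field names such as `coauthors` and `keywords`. The contribution and publication-age adjustments mentioned in the abstract are not modelled. It also shows how precision, recall and F-measure are conventionally computed for a returned set of records.

```python
from typing import Optional


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two non-empty sets."""
    return len(a & b) / len(a | b)


def record_similarity(rec_a: dict, rec_b: dict, fields: list) -> Optional[float]:
    """Mean Jaccard similarity over the fields populated in BOTH records.

    Assumption: fields that are missing or empty in either record are
    skipped, so the pair is still compared on whatever data is available.
    Returns None when no field can be compared at all.
    """
    scores = []
    for field in fields:
        a, b = rec_a.get(field), rec_b.get(field)
        if a and b:  # skip null or empty fields
            scores.append(jaccard(set(a), set(b)))
    return sum(scores) / len(scores) if scores else None


def precision_recall_f(predicted: set, actual: set) -> tuple:
    """Precision, recall and F-measure (harmonic mean) of a returned set."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical records: the address field is null in rec_a and is ignored.
rec_a = {"coauthors": {"A", "B"}, "keywords": {"catalysis", "zeolite"}, "address": None}
rec_b = {"coauthors": {"B", "C"}, "keywords": {"catalysis", "zeolite"}, "address": {"Delft"}}
similarity = record_similarity(rec_a, rec_b, ["coauthors", "keywords", "address"])
```

In this sketch the pair is scored on two of the three fields; a fixed-field approach would have had to drop the pair because `address` is null in one record, which is the loss of recall the abstract refers to.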

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71a7/3319899/6b59f811735e/11192_2011_589_Fig1_HTML.jpg
