Author disambiguation using multi-aspect similarity indicators.

Author information

Gurney Thomas, Horlings Edwin, van den Besselaar Peter

Publication information

Scientometrics. 2012 May;91(2):435-449. doi: 10.1007/s11192-011-0589-1. Epub 2011 Dec 30.

DOI:10.1007/s11192-011-0589-1

PMID:22485059

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3319899/

Abstract

Key to accurate bibliometric analyses is the ability to correctly link individuals to their corpus of work, with an optimal balance between precision and recall. We have developed an algorithm that does this disambiguation task with a very high recall and precision. The method addresses the issues of discarded records due to null data fields and their resultant effect on recall, precision and F-measure results. We have implemented a dynamic approach to similarity calculations based on all available data fields. We have also included differences in author contribution and age difference between publications, both of which have meaningful effects on overall similarity measurements, resulting in significantly higher recall and precision of returned records. The results are presented from a test dataset of heterogeneous catalysis publications. Results demonstrate significantly high average F-measure scores and substantial improvements on previous and stand-alone techniques.
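
The abstract describes computing similarity from whichever data fields are available, rather than discarding records with null fields, and evaluating the result by precision, recall and F-measure. The sketch below illustrates that general idea only; it is not the authors' algorithm. The field names, the use of Jaccard overlap, and the unweighted mean are all assumptions made for illustration, and the paper's author-contribution and publication-age factors are not modeled.

```python
from typing import Optional

# Hypothetical record layout: field name -> set of tokens, or None when the field is empty.
Record = dict[str, Optional[set[str]]]


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def record_similarity(r1: Record, r2: Record) -> Optional[float]:
    """Mean Jaccard similarity over the fields populated in BOTH records.

    Null or empty fields are skipped, so a record with missing data is
    still compared on whatever it does contain. Returns None when the
    two records share no populated field.
    """
    scores = []
    for field in r1.keys() & r2.keys():
        v1, v2 = r1[field], r2[field]
        if v1 and v2:
            scores.append(jaccard(v1, v2))
    return sum(scores) / len(scores) if scores else None


def precision_recall_f1(predicted: set[str], actual: set[str]) -> tuple[float, float, float]:
    """Precision, recall and F1 of a predicted publication set against the true set."""
    tp = len(predicted & actual)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

For example, two records that share one of three co-authors and the same journal, with keywords missing from one of them, are scored on the two populated fields only, so the null keyword field neither penalizes the pair nor causes it to be dropped.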

Similar articles

Automatic vs. manual curation of a multi-source chemical dictionary: the impact on text mining.

J Cheminform. 2010 Mar 23;2(1):3. doi: 10.1186/1758-2946-2-3.

Author Disambiguation in PubMed: Evidence on the Precision and Recall of Author-ity among NIH-Funded Scientists.

PLoS One. 2016 Jul 1;11(7):e0158731. doi: 10.1371/journal.pone.0158731. eCollection 2016.

'Seed + expand': a general methodology for detecting publication oeuvres of individual researchers.

Scientometrics. 2014;101(2):1403-1417. doi: 10.1007/s11192-014-1256-0. Epub 2014 Mar 5.

A new approach and gold standard toward author disambiguation in MEDLINE.

J Am Med Inform Assoc. 2019 Oct 1;26(10):1037-1045. doi: 10.1093/jamia/ocz028.

Author Name Disambiguation for PubMed.

J Assoc Inf Sci Technol. 2014 Apr;65(4):765-781. doi: 10.1002/asi.23063. Epub 2013 Nov 21.

Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

Improving chemical entity recognition through h-index based semantic similarity.

J Cheminform. 2015 Jan 19;7(Suppl 1 Text mining for chemistry and the CHEMDNER track):S13. doi: 10.1186/1758-2946-7-S1-S13. eCollection 2015.

Implementation and evaluation of a multivariate abstraction-based, interval-based dynamic time-warping method as a similarity measure for longitudinal medical records.

J Biomed Inform. 2021 Nov;123:103919. doi: 10.1016/j.jbi.2021.103919. Epub 2021 Oct 8.

Challenges in clinical natural language processing for automated disorder normalization.

J Biomed Inform. 2015 Oct;57:28-37. doi: 10.1016/j.jbi.2015.07.010. Epub 2015 Jul 14.

Cited by

Democratic governance and global science: A longitudinal analysis of the international research collaboration network.

PLoS One. 2023 Jun 13;18(6):e0287058. doi: 10.1371/journal.pone.0287058. eCollection 2023.

Vicious circles of gender bias, lower positions, and lower performance: Gender differences in scholarly productivity and impact.

PLoS One. 2017 Aug 25;12(8):e0183301. doi: 10.1371/journal.pone.0183301. eCollection 2017.

Quantity and/or Quality? The Importance of Publishing Many Papers.

PLoS One. 2016 Nov 21;11(11):e0166149. doi: 10.1371/journal.pone.0166149. eCollection 2016.

Multiple Citation Indicators and Their Composite across Scientific Disciplines.

PLoS Biol. 2016 Jul 1;14(7):e1002501. doi: 10.1371/journal.pbio.1002501. eCollection 2016 Jul.

Do Nobel Laureates Create Prize-Winning Networks? An Analysis of Collaborative Research in Physiology or Medicine.

PLoS One. 2015 Jul 31;10(7):e0134164. doi: 10.1371/journal.pone.0134164. eCollection 2015.

Tackling the "so what" problem in scientific research: a systems-based approach to resource and publication tracking.

Acad Med. 2015 Aug;90(8):1043-50. doi: 10.1097/ACM.0000000000000732.

References

Authorship criteria and disclosure of contributions: comparison of 3 general medical journals with different author contribution forms.

JAMA. 2004 Jul 7;292(1):86-8. doi: 10.1001/jama.292.1.86.

Disclosure of researcher contributions: a study of original research articles in The Lancet.

Ann Intern Med. 1999 Apr 20;130(8):661-70. doi: 10.7326/0003-4819-130-8-199904200-00013.
