Lerchenmueller Marc J, Sorenson Olav
Yale School of Management, Yale University, New Haven, CT, United States of America.
PLoS One. 2016 Jul 1;11(7):e0158731. doi: 10.1371/journal.pone.0158731. eCollection 2016.
We examined the usefulness (precision) and completeness (recall) of the Author-ity author disambiguation for PubMed articles by associating articles with scientists funded by the National Institutes of Health (NIH). In doing so, we exploited established unique identifiers, Principal Investigator (PI) IDs, that the NIH assigns to funded scientists. Analyzing a set of 36,987 NIH scientists who received their first R01 grant between 1985 and 2009, we identified 355,921 articles appearing in PubMed that would allow us to evaluate the precision and recall of the Author-ity disambiguation. We found that Author-ity identified the NIH scientists with 99.51% precision across the articles. It had a corresponding recall of 99.64%. Precision and recall, moreover, appeared stable across common and uncommon last names, across ethnic backgrounds, and across levels of scientist productivity.
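As an illustration of the two measures reported above, the following minimal sketch computes precision and recall from the standard definitions, treating the PI-ID-linked articles as ground truth. The abstract does not describe the study's exact procedure, so the function name `precision_recall`, the dictionary layout, and the toy identifiers are all hypothetical assumptions rather than the authors' implementation.

```python
from typing import Dict, Set, Tuple


def precision_recall(
    predicted: Dict[str, Set[str]],
    truth: Dict[str, Set[str]],
) -> Tuple[float, float]:
    """Pooled precision and recall over all scientists.

    predicted: articles the disambiguation attributes to each scientist
    truth:     articles linked to each scientist via a trusted identifier
    (Both mappings are hypothetical stand-ins keyed by a PI-style ID.)
    """
    tp = fp = fn = 0
    for pi_id, true_articles in truth.items():
        pred = predicted.get(pi_id, set())
        tp += len(pred & true_articles)   # correctly attributed articles
        fp += len(pred - true_articles)   # attributed, but not the scientist's
        fn += len(true_articles - pred)   # the scientist's, but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data (invented for illustration; not taken from the study).
truth = {"PI-1": {"a1", "a2", "a3"}, "PI-2": {"b1", "b2"}}
predicted = {"PI-1": {"a1", "a2", "x9"}, "PI-2": {"b1", "b2"}}

p, r = precision_recall(predicted, truth)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here precision falls when articles by other people are wrongly attributed to a scientist, while recall falls when a scientist's own articles are missed. Pooling counts across scientists is one design choice among several; averaging per-scientist scores would weight prolific and less prolific authors differently.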