Cited by

1
Bridging the gap in author names: building an enhanced author name dataset for biomedical literature system.
J Am Med Inform Assoc. 2024 Aug 1;31(8):1648-1656. doi: 10.1093/jamia/ocae127.

Aggregating large-scale databases for PubMed author name disambiguation.

Affiliation

School of Information Management, Wuhan University, Wuhan, China.

Publication information

J Am Med Inform Assoc. 2021 Aug 13;28(9):1919-1927. doi: 10.1093/jamia/ocab095.

DOI: 10.1093/jamia/ocab095
PMID: 34180522
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8363810/
Abstract

OBJECTIVE

PubMed has suffered from the author ambiguity problem for many years. Existing studies on author name disambiguation (AND) for PubMed have relied only on internal metadata. However, some of this metadata is incomplete (eg, a large number of names are only abbreviated and their full names are not available) or only weakly discriminative. To this end, we present a new disambiguation method, namely AggAND, which aggregates information from external databases.

MATERIALS AND METHODS

We address this issue by exploring Microsoft Academic Graph, Semantic Scholar, and PubMed Knowledge Graph to enhance the built-in name metadata, and extend the internal metadata with some external and more discriminative metadata.
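The name-enhancement idea can be illustrated with a minimal Python sketch. It is not the authors' implementation (their code is in the repository linked in the Conclusions). The `AuthorRecord` type, the `(pmid, position)` lookup key, and the matching rule are all assumptions made for illustration: an abbreviated PubMed fore name is replaced by a full fore name from an external source only when the last name and first initial agree.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class AuthorRecord:
    """Hypothetical author mention on one paper (not the paper's schema)."""
    pmid: str
    position: int   # 1-based position in the author list
    last_name: str
    fore_name: str  # may hold initials only, e.g. "L" or "J M"


def is_abbreviated(fore_name: str) -> bool:
    """True if the fore name is empty or consists only of single-letter initials."""
    tokens = fore_name.replace(".", " ").split()
    return not tokens or all(len(t) == 1 for t in tokens)


def compatible(initials: str, full_name: str) -> bool:
    """True if the full fore name starts with the same letter as the first initial."""
    a = initials.replace(".", " ").split()
    b = full_name.replace(".", " ").split()
    if not b:
        return False
    if not a:
        return True  # nothing to contradict
    return a[0][0].lower() == b[0][0].lower()


# Assumed shape of an external source: (pmid, position) -> AuthorRecord
Source = Dict[Tuple[str, int], AuthorRecord]


def enhance(record: AuthorRecord, sources: Iterable[Source]) -> AuthorRecord:
    """Fill in a full fore name from the first external source that agrees."""
    if not is_abbreviated(record.fore_name):
        return record
    for source in sources:
        cand = source.get((record.pmid, record.position))
        if (
            cand is not None
            and cand.last_name.lower() == record.last_name.lower()
            and not is_abbreviated(cand.fore_name)
            and compatible(record.fore_name, cand.fore_name)
        ):
            return replace(record, fore_name=cand.fore_name)
    return record  # no trustworthy match: keep the original metadata
```

In this sketch the sources are tried in order, so listing the most trusted database first acts as a simple priority rule. A conflicting candidate (different initial or last name) is ignored rather than merged.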

RESULTS

Experiments show that the enhanced name metadata performs comparably to 3 author identifier systems and outperforms the original name metadata. More importantly, our method, AggAND, which incorporates both the enhanced names and the extended metadata, yields F1 scores of 95.80% and 93.71% on 2 datasets and outperforms the state-of-the-art method by a large margin (3.61% and 6.55%, respectively).
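The abstract does not say which F1 variant is reported. One common choice in author name disambiguation is pairwise F1, sketched below as an assumption rather than as the paper's evaluation code. Each mention receives a cluster label, and precision and recall are computed over the pairs of mentions placed in the same cluster. The function names and the toy labels are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Hashable, Set


def same_author_pairs(labels: Dict[str, Hashable]) -> Set[FrozenSet[str]]:
    """All unordered pairs of mentions that share a cluster label."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(labels), 2)
        if labels[a] == labels[b]
    }


def pairwise_f1(gold: Dict[str, Hashable], predicted: Dict[str, Hashable]) -> float:
    """Pairwise F1 between gold and predicted clusterings of the same mentions."""
    if gold.keys() != predicted.keys():
        raise ValueError("gold and predicted must label the same mentions")
    gold_pairs = same_author_pairs(gold)
    pred_pairs = same_author_pairs(predicted)
    true_pos = len(gold_pairs & pred_pairs)
    precision = true_pos / len(pred_pairs) if pred_pairs else 0.0
    recall = true_pos / len(gold_pairs) if gold_pairs else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: four papers by two real authors, with one paper merged wrongly.
gold = {"p1": "A", "p2": "A", "p3": "B", "p4": "B"}
pred = {"p1": "x", "p2": "x", "p3": "x", "p4": "y"}
score = pairwise_f1(gold, pred)  # precision 1/3, recall 1/2
```

In the toy example, merging `p3` into the wrong cluster creates two false pairs and loses one true pair, so a single error lowers both precision and recall.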

CONCLUSIONS

The feasibility and good performance of our methods not only help better understand the importance of external databases for disambiguation, but also point to a promising direction for future AND studies in which information aggregated from multiple bibliographic databases can be effective in improving disambiguation performance. The methodology shown here can be generalized to broader bibliographic databases beyond PubMed. Our code and data are available online (https://github.com/carmanzhang/PubMed-AND-method).
