
A Graph-Based Author Name Disambiguation Method and Analysis via Information Theory.

Authors

Ma Yingying, Wu Youlong, Lu Chengqiang

Affiliations

School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China.

Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China.

Publication details

Entropy (Basel). 2020 Apr 7;22(4):416. doi: 10.3390/e22040416.

DOI: 10.3390/e22040416
PMID: 33286190
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7516896/
Abstract

Name ambiguity, due to the fact that many people share an identical name, often deteriorates the performance of information integration, document retrieval and web search. In academic data analysis, author name ambiguity usually decreases the analysis performance. To solve this problem, an author name disambiguation task is designed to divide documents related to an author name reference into several parts and each part is associated with a real-life person. Existing methods usually use either attributes of documents or relationships between documents and co-authors. However, methods of feature extraction using attributes cause inflexibility of models while solutions based on relationship graph network ignore the information contained in the features. In this paper, we propose a novel name disambiguation model based on representation learning which incorporates attributes and relationships. Experiments on a public real dataset demonstrate the effectiveness of our model and experimental results demonstrate that our solution is superior to several state-of-the-art graph-based methods. We also increase the interpretability of our method through information theory and show that the analysis could be helpful for model selection and training progress.
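To make the task concrete, the following is a minimal sketch of disambiguation as clustering: documents that share one author name are grouped so that each group stands for one real person. It is not the representation-learning model proposed in the paper. It only illustrates the idea of combining attribute information with co-author relationships, using plain Jaccard similarity and connected components. The field names (`keywords`, `coauthors`), the weight `alpha`, the `threshold`, and the toy records are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def disambiguate(docs, alpha=0.5, threshold=0.3):
    """Group documents sharing one author name into per-person clusters.

    alpha weights attribute similarity against co-author similarity.
    Both alpha and threshold are hypothetical tuning parameters.
    """
    parent = list(range(len(docs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(docs)), 2):
        attr = jaccard(docs[i]["keywords"], docs[j]["keywords"])
        rel = jaccard(docs[i]["coauthors"], docs[j]["coauthors"])
        # Link two documents when the combined score is high enough.
        if alpha * attr + (1 - alpha) * rel >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(docs)):
        clusters.setdefault(find(i), []).append(docs[i]["id"])
    return sorted(sorted(ids) for ids in clusters.values())


# Toy records that all carry the same ambiguous author name (invented data).
docs = [
    {"id": "p1", "keywords": {"graph", "embedding", "network"},
     "coauthors": {"coauthor_a", "coauthor_b"}},
    {"id": "p2", "keywords": {"graph", "embedding", "clustering"},
     "coauthors": {"coauthor_a"}},
    {"id": "p3", "keywords": {"protein", "folding"},
     "coauthors": {"coauthor_c"}},
    {"id": "p4", "keywords": {"protein", "structure"},
     "coauthors": {"coauthor_c", "coauthor_d"}},
]

print(disambiguate(docs))
```

With these toy records, `p1` and `p2` end up in one cluster and `p3` and `p4` in another, because each pair overlaps in both keywords and co-authors. Setting `alpha` to 1 or 0 reduces the sketch to an attribute-only or relationship-only approach, the two families the abstract contrasts.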


Similar articles

1
A Graph-Based Author Name Disambiguation Method and Analysis via Information Theory.
Entropy (Basel). 2020 Apr 7;22(4):416. doi: 10.3390/e22040416.
2
Author Name Disambiguation for PubMed.
J Assoc Inf Sci Technol. 2014 Apr;65(4):765-781. doi: 10.1002/asi.23063. Epub 2013 Nov 21.
3
Aggregating large-scale databases for PubMed author name disambiguation.
J Am Med Inform Assoc. 2021 Aug 13;28(9):1919-1927. doi: 10.1093/jamia/ocab095.
4
The strength of co-authorship in gene name disambiguation.
BMC Bioinformatics. 2008 Jan 29;9:69. doi: 10.1186/1471-2105-9-69.
5
A new approach and gold standard toward author disambiguation in MEDLINE.
J Am Med Inform Assoc. 2019 Oct 1;26(10):1037-1045. doi: 10.1093/jamia/ocz028.
6
Graph-based methods for Author Name Disambiguation: a survey.
PeerJ Comput Sci. 2023 Sep 11;9:e1536. doi: 10.7717/peerj-cs.1536. eCollection 2023.
7
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
8
Author Name Disambiguation in MEDLINE.
ACM Trans Knowl Discov Data. 2009 Jul 1;3(3). doi: 10.1145/1552303.1552304.
9
A Knowledge Graph Entity Disambiguation Method Based on Entity-Relationship Embedding and Graph Structure Embedding.
Comput Intell Neurosci. 2021 Sep 23;2021:2878189. doi: 10.1155/2021/2878189. eCollection 2021.
10
Data sets for author name disambiguation: an empirical analysis and a new resource.
Scientometrics. 2017;111(3):1467-1500. doi: 10.1007/s11192-017-2363-5. Epub 2017 Mar 27.
