Article-level classification of scientific publications: A comparison of deep learning, direct citation and bibliographic coupling.

Affiliations

Science-Metrix Inc., Montréal, Québec, Canada.

Elsevier B.V., Amsterdam, Netherlands.

Publication information

PLoS One. 2021 May 11;16(5):e0251493. doi: 10.1371/journal.pone.0251493. eCollection 2021.

DOI: 10.1371/journal.pone.0251493
PMID: 33974653
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8112690/
Abstract

Classification schemes for scientific activity and publications underpin a large swath of research evaluation practices at the organizational, governmental, and national levels. Several research classifications are currently in use, and they require continuous work as new classification techniques become available and as new research topics emerge. Convolutional neural networks, a subset of "deep learning" approaches, have recently offered novel and highly performant methods for classifying voluminous corpora of text. This article benchmarks a deep learning classification technique on more than 40 million scientific articles and on tens of thousands of scholarly journals. The comparison is performed against bibliographic coupling-, direct citation-, and manual-based classifications, which are the established and most widely used approaches in the field of bibliometrics and, by extension, in many science and innovation policy activities such as grant competition management. The results reveal that the performance of this first iteration of a deep learning approach is equivalent to the graph-based bibliometric approaches. All methods presented are also on par with manual classification. Somewhat surprisingly, no machine learning approaches were found to clearly outperform the simple label propagation approach that is direct citation. In conclusion, deep learning is promising because it performed just as well as the other approaches but has more flexibility to be further improved. For example, a deep neural network incorporating information from the citation network is likely to hold the key to an even better classification algorithm.
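
The abstract names two graph-based baselines without spelling them out. As a rough illustration only, and not the authors' pipeline, the sketch below assumes that direct-citation classification means taking the most frequent label among the already-labeled articles an article is linked to, and that bibliographic coupling strength is the number of references two articles share. All function names, article IDs, and labels are hypothetical.

```python
from collections import Counter


def bibliographic_coupling(refs_a, refs_b):
    """Coupling strength (assumed definition): number of shared references."""
    return len(set(refs_a) & set(refs_b))


def classify_by_direct_citation(article, links, labels):
    """Label propagation over direct citations (toy version).

    links  : dict mapping an article ID to the IDs it cites or is cited by
    labels : dict mapping already-classified article IDs to a field label
    Returns the most frequent label among labeled neighbours, or None.
    """
    votes = Counter(labels[n] for n in links.get(article, []) if n in labels)
    if not votes:
        return None  # no labeled neighbour to propagate from
    return votes.most_common(1)[0][0]


# Hypothetical toy data: article "X" is linked to three labeled articles.
links = {"X": ["A", "B", "C"]}
labels = {"A": "Oncology", "B": "Oncology", "C": "Physics"}

predicted = classify_by_direct_citation("X", links, labels)
shared = bibliographic_coupling(["r1", "r2", "r3"], ["r2", "r3", "r4"])
print(predicted, shared)
```

In this toy setup a majority vote is the simplest possible propagation rule; a real system would also need a policy for ties and for articles with no labeled neighbours.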

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/28b1/8112690/6e67c7cab312/pone.0251493.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/28b1/8112690/65b69f59deee/pone.0251493.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/28b1/8112690/32e07dd2de80/pone.0251493.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/28b1/8112690/2625e397e166/pone.0251493.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/28b1/8112690/6652599b07be/pone.0251493.g005.jpg
