Important citation identification by exploiting content and section-wise in-text citation count.

Author affiliations

Department of Computer Science, National Textile University, Faisalabad, Pakistan.

Punjab University College of Information Technology (PUCIT), University of the Punjab (PU), Lahore, Pakistan.

Publication information

PLoS One. 2020 Mar 5;15(3):e0228885. doi: 10.1371/journal.pone.0228885. eCollection 2020.

DOI:10.1371/journal.pone.0228885
PMID:32134940
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7058319/
Abstract

A citation is regarded as a useful parameter for determining the linkage between research articles. It has been employed extensively in academic assessment, for example to calculate the impact factor of journals and the h-index of researchers, to allocate research grants, and to find the latest research trends. The current state of the art contends that not all citations are of equal importance. Based on this argument, the citation classification community now categorizes citations as important or non-important. The community has proposed different approaches to identifying important citations, such as citation-count-based, context-based, metadata-based, and text-based approaches. The contemporary state of the art, however, ignores potentially significant features that could play a vital role in citation classification. This research presents a novel approach to binary citation classification that exploits section-wise in-text citation frequencies, similarity scores, and overall citation-count-based features. The study also introduces a novel machine-learning-based approach for assigning appropriate weights to the logical sections of research papers; citations are then weighted according to the sections in which they appear. To perform the classification, we used three techniques: Support Vector Machine, Kernel Linear Regression, and Random Forest. The experiments were performed on two annotated benchmark datasets containing 465 and 311 citation pairs of research articles, respectively. The results show that the proposed approach attains a higher precision than the contemporary state-of-the-art approach (0.84 vs 0.72).
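The feature construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the section names, the weight values, and the helper names (`weighted_citation_score`, `feature_vector`, `precision`) are all assumptions made for illustration, and the paper learns its section weights with machine learning rather than fixing them by hand. The sketch only shows how section-wise in-text citation counts, an overall count, and a similarity score might be combined into one feature vector for a binary classifier, and how precision, the metric reported above, is computed.

```python
from typing import Dict, List

# Hypothetical logical sections and weights (assumptions, not values from the paper).
SECTIONS: List[str] = [
    "introduction", "related_work", "methodology", "results", "discussion",
]
ASSUMED_WEIGHTS: Dict[str, float] = {
    "introduction": 0.5,
    "related_work": 0.25,
    "methodology": 1.0,
    "results": 1.0,
    "discussion": 0.75,
}


def weighted_citation_score(counts: Dict[str, int],
                            weights: Dict[str, float]) -> float:
    """Sum the in-text citation counts, each scaled by its section's weight."""
    return sum(weights.get(section, 0.0) * n for section, n in counts.items())


def feature_vector(counts: Dict[str, int], similarity: float) -> List[float]:
    """Build [per-section counts..., overall count, similarity score]."""
    per_section = [float(counts.get(section, 0)) for section in SECTIONS]
    total = float(sum(counts.values()))
    return per_section + [total, similarity]


def precision(predicted: List[bool], actual: List[bool]) -> float:
    """Precision = true positives / (true positives + false positives)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    return tp / (tp + fp) if (tp + fp) else 0.0


if __name__ == "__main__":
    # One citing/cited pair: cited twice in the introduction, three times in the methodology.
    counts = {"introduction": 2, "methodology": 3}
    print(weighted_citation_score(counts, ASSUMED_WEIGHTS))
    print(feature_vector(counts, similarity=0.4))
    print(precision([True, True, False, True], [True, False, False, True]))
```

In a fuller pipeline, vectors like these would be passed to a classifier such as a Support Vector Machine or Random Forest; that step is omitted here to keep the sketch dependency-free.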

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c606/7058319/071b6c4be16d/pone.0228885.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c606/7058319/7677b47fcc8a/pone.0228885.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c606/7058319/eb82966ccb68/pone.0228885.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c606/7058319/ef02ec1d5a47/pone.0228885.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c606/7058319/61a6f3cf82f1/pone.0228885.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c606/7058319/bcd03a2e4a29/pone.0228885.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c606/7058319/c927f278f8cc/pone.0228885.g007.jpg
