A systematic literature review on the applications of recurrent neural networks in code clone research.

Affiliations

Department of Computer Science, University of Peshawar, Peshawar, Pakistan.

Department of Computer Science, Aden Community College, Aden, Yemen.

Publication information

PLoS One. 2024 Feb 2;19(2):e0296858. doi: 10.1371/journal.pone.0296858. eCollection 2024.

DOI:10.1371/journal.pone.0296858
PMID:38306372
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10836701/
Abstract

Code clones, referring to code fragments that are either similar or identical and are copied and pasted within software systems, have negative effects on both software quality and maintenance. The objective of this work is to systematically review and analyze recurrent neural network techniques used to detect code clones to shed light on the current techniques and offer valuable knowledge to the research community. Upon applying the review protocol, we have successfully identified 20 primary studies within this field from a total of 2099 studies. A deep investigation of these studies reveals that nine recurrent neural network techniques have been utilized for code clone detection, with a notable preference for LSTM techniques. These techniques have demonstrated their efficacy in detecting both syntactic and semantic clones, often utilizing abstract syntax trees for source code representation. Moreover, we observed that most studies applied evaluation metrics like F-score, precision, and recall. Additionally, these studies frequently utilized datasets extracted from open-source systems coded in Java and C programming languages. Notably, the Graph-LSTM technique exhibited superior performance. PyTorch and TensorFlow emerged as popular tools for implementing RNN models. To advance code clone detection research, further exploration of techniques like parallel LSTM, sentence-level LSTM, and Tree-Structured GRU is imperative. In addition, more research is needed to investigate the capabilities of the recurrent neural network techniques for identifying semantic clones across different programming languages and binary codes. The development of standardized benchmarks for languages like Python, Scratch, and C#, along with cross-language comparisons, is essential. Therefore, the utilization of recurrent neural network techniques for clone identification is a promising area that demands further research.
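Two ideas from the abstract, abstract syntax trees as a source code representation and precision/recall/F-score as evaluation metrics, can be illustrated with a minimal sketch. It is not taken from any of the reviewed studies: it uses Python's standard `ast` module purely for illustration (the reviewed datasets are mostly Java and C), and the function names, code fragments, and fragment IDs are hypothetical. A node-type sequence like the one below is the kind of input that could be embedded and fed to an LSTM; the sketch stops short of the neural model itself.

```python
import ast


def node_type_sequence(source: str) -> list[str]:
    """Flatten a Python AST into a list of node-type names (breadth-first order).

    Identifier names are discarded, so fragments that differ only in
    naming map to the same sequence.
    """
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def precision_recall_f1(predicted: set, actual: set) -> tuple[float, float, float]:
    """Score a set of predicted clone pairs against a set of true clone pairs."""
    tp = len(predicted & actual)   # pairs correctly reported as clones
    fp = len(predicted - actual)   # pairs reported but not true clones
    fn = len(actual - predicted)   # true clones that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical fragments: `renamed` differs from `original` only in identifiers.
original = "def add(a, b):\n    return a + b\n"
renamed = "def total(x, y):\n    return x + y\n"
different = "def scale(a, b):\n    return a * b\n"

same_structure = node_type_sequence(original) == node_type_sequence(renamed)
other_structure = node_type_sequence(original) == node_type_sequence(different)

# Hypothetical fragment-ID pairs: 2 true positives, 1 false positive, 2 misses.
predicted_pairs = {("f1", "f2"), ("f3", "f4"), ("f5", "f6")}
actual_pairs = {("f1", "f2"), ("f3", "f4"), ("f7", "f8"), ("f9", "f10")}
precision, recall, f1 = precision_recall_f1(predicted_pairs, actual_pairs)
```

Here `same_structure` is true because renaming identifiers does not change the node types, while `other_structure` is false because `+` and `*` produce different operator nodes. Matching on node types alone captures only syntactic similarity; semantic clones with different structure would need a learned representation of the kind the review surveys.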


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/ac1b46121210/pone.0296858.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/d2e97d7abd98/pone.0296858.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/e54e73cfa5e8/pone.0296858.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/d97328b889e0/pone.0296858.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/3ec1e3cff75c/pone.0296858.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/8d94bc678b07/pone.0296858.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/2cedbef5c16d/pone.0296858.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/4d32a10df209/pone.0296858.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/c36f5b87e710/pone.0296858.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/9284f68d0e18/pone.0296858.g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/8f3f6e906342/pone.0296858.g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/976aef96d5d1/pone.0296858.g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/fc9eaac7d30b/pone.0296858.g013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/517cea689134/pone.0296858.g014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/f15540055ae5/pone.0296858.g015.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/7549add2fdf0/pone.0296858.g016.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/54b3/10836701/d4f19995e40d/pone.0296858.g017.jpg

Similar articles

1
A systematic literature review on the applications of recurrent neural networks in code clone research.
PLoS One. 2024 Feb 2;19(2):e0296858. doi: 10.1371/journal.pone.0296858. eCollection 2024.
2
A novel code representation for detecting Java code clones using high-level and abstract compiled code representations.
PLoS One. 2024 May 10;19(5):e0302333. doi: 10.1371/journal.pone.0302333. eCollection 2024.
3
Software defect prediction using hybrid model (CBIL) of convolutional neural network (CNN) and bidirectional long short-term memory (Bi-LSTM).
PeerJ Comput Sci. 2021 Nov 16;7:e739. doi: 10.7717/peerj-cs.739. eCollection 2021.
4
Character gated recurrent neural networks for Arabic sentiment analysis.
Sci Rep. 2022 Jun 13;12(1):9779. doi: 10.1038/s41598-022-13153-w.
5
SFN: A Novel Scalable Feature Network for Vulnerability Representation of Open-Source Codes.
Comput Intell Neurosci. 2022 Aug 12;2022:2998448. doi: 10.1155/2022/2998448. eCollection 2022.
6
Application of Dual-Channel Convolutional Neural Network Algorithm in Semantic Feature Analysis of English Text Big Data.
Comput Intell Neurosci. 2021 Nov 6;2021:7085412. doi: 10.1155/2021/7085412. eCollection 2021.
7
Mining e-cigarette adverse events in social media using Bi-LSTM recurrent neural network with word embedding representation.
J Am Med Inform Assoc. 2018 Jan 1;25(1):72-80. doi: 10.1093/jamia/ocx045.
8
Code generation: a strategy for neural network simulators.
Neuroinformatics. 2010 Oct;8(3):183-96. doi: 10.1007/s12021-010-9082-x.
9
Attention based GRU-LSTM for software defect prediction.
PLoS One. 2021 Mar 4;16(3):e0247444. doi: 10.1371/journal.pone.0247444. eCollection 2021.
10
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

Cited by

1
A novel code representation for detecting Java code clones using high-level and abstract compiled code representations.
PLoS One. 2024 May 10;19(5):e0302333. doi: 10.1371/journal.pone.0302333. eCollection 2024.
