Hammad Muhammad, Babur Önder, Abdul Basit Hamid, van den Brand Mark
Eindhoven University of Technology, Eindhoven, Netherlands.
Wageningen University and Research, Wageningen, Netherlands.
PeerJ Comput Sci. 2021 Nov 9;7:e737. doi: 10.7717/peerj-cs.737. eCollection 2021.
Software developers frequently reuse source code from repositories because it saves development time and effort. Code clones (similar code fragments) accumulated in these repositories often represent repeated functionality and are candidates for reuse in exploratory or rapid development. To facilitate code clone reuse, we previously presented DeepClone, a novel deep learning approach that models code clones along with non-cloned code to predict the next set of tokens (possibly a complete clone method body) based on the code written so far. The probabilistic nature of language modeling, however, can lead to code output with minor syntax or logic errors. To resolve this, we propose a novel approach called Clone-Advisor. We apply an information retrieval technique on top of the DeepClone output to recommend real clone methods that closely match the predicted clone method, thus improving on DeepClone's original output. In this paper, we discuss and refine our previous work on DeepClone in much more detail. Moreover, we quantitatively evaluate the performance and effectiveness of Clone-Advisor in clone method recommendation.
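The retrieval step described above can be illustrated with a minimal sketch. The abstract does not say which information retrieval technique is used, so the TF-IDF weighting with cosine similarity below is an assumption chosen for illustration, not the authors' method. The function names (`tokenize`, `recommend`), the tiny Java corpus, and the `predicted` string are all hypothetical. The idea shown is only the general one: a predicted method that may contain small errors is used as a query, and the closest real clone methods from a corpus are returned instead.

```python
import math
import re
from collections import Counter


def tokenize(code):
    # Hypothetical lexer: identifiers, integer literals, or single symbols.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def tfidf_vectors(docs):
    # Build smoothed IDF weights and one sparse TF-IDF vector per document.
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()} for tokens in docs
    ]
    return vectors, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(predicted, corpus, top_k=3):
    # Rank real clone methods by similarity to the predicted method.
    docs = [tokenize(method) for method in corpus]
    vectors, idf = tfidf_vectors(docs)
    # Tokens unseen in the corpus get weight 0 and do not affect the ranking.
    query = {
        t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(predicted)).items()
    }
    scored = sorted(
        ((cosine(query, vec), i) for i, vec in enumerate(vectors)), reverse=True
    )
    return [(corpus[i], score) for score, i in scored[:top_k]]


# Hypothetical corpus of real clone methods (assumption, not from the paper).
corpus = [
    "public void close(InputStream in) { try { in.close(); } "
    "catch (IOException e) { log(e); } }",
    "public int sum(int[] xs) { int total = 0; "
    "for (int x : xs) { total += x; } return total; }",
    "public String read(File f) throws IOException { "
    "return new String(Files.readAllBytes(f.toPath())); }",
]

# Imitates model output with a minor syntax error (missing semicolon).
predicted = (
    "public void close ( InputStream in ) { try { in . close ( ) } "
    "catch ( IOException e ) { } }"
)

best_method, best_score = recommend(predicted, corpus, top_k=1)[0]
print(best_method)
```

In this sketch the recommendation is always a method that actually exists in the corpus, which is the property the abstract relies on: the syntactically imperfect prediction serves only as a query, and the user receives real, previously written code.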