IoTSim: Internet of Things-Oriented Binary Code Similarity Detection with Multiple Block Relations.

Authors

Luo Zhenhao, Wang Pengfei, Xie Wei, Zhou Xu, Wang Baosheng

Affiliation

College of Computer, National University of Defense Technology, Changsha 410073, China.

Publication information

Sensors (Basel). 2023 Sep 11;23(18):7789. doi: 10.3390/s23187789.

DOI:10.3390/s23187789
PMID:37765846
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10535887/
Abstract

Binary code similarity detection (BCSD) plays a crucial role in various computer security applications, including vulnerability detection, malware detection, and software component analysis. With the development of the Internet of Things (IoT), binaries are compiled for many different instruction set architectures, which requires BCSD approaches that are robust across architectures. In this study, we propose a novel IoT-oriented binary code similarity detection approach. Our approach leverages a customized transformer-based language model with disentangled attention to capture relative position information. To mitigate out-of-vocabulary (OOV) challenges in the language model, we introduce a base-token prediction pre-training task aimed at capturing basic semantics for unseen tokens. During function embedding generation, we integrate directed jumps, data dependency, and address adjacency to capture multiple block relations. We then assign different weights to different relations and use multi-layer Graph Convolutional Networks (GCN) to generate function embeddings. We implemented a prototype of IoTSim. Our experimental results show that the proposed block relation matrix improves IoTSim by a large margin. With a pool size of 10³, IoTSim achieves a recall@1 of 0.903 across architectures, outperforming the state-of-the-art approaches Trex, SAFE, and PalmTree.
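
The abstract describes combining three block relations with different weights, passing the result through multi-layer GCNs, and evaluating with recall@1 over a pool of candidate functions. The NumPy sketch below illustrates that general idea only; it is not the authors' implementation. The function names, the relation weights (1.0, 0.5, 0.25), the row normalization, the ReLU activation, the mean pooling, and the random layer weights are all assumptions made for illustration.

```python
import numpy as np

def combine_relations(jump, data_dep, adjacent, weights=(1.0, 0.5, 0.25)):
    """Weighted sum of three block-relation matrices (weights are illustrative)."""
    w_jump, w_data, w_adj = weights
    return w_jump * jump + w_data * data_dep + w_adj * adjacent

def normalize(adj):
    """Add self-loops and row-normalize (a simplification of GCN normalization)."""
    a_hat = adj + np.eye(adj.shape[0])
    return a_hat / a_hat.sum(axis=1, keepdims=True)

def gcn_embed(block_feats, adj, layer_weights):
    """Apply stacked GCN layers with ReLU, then mean-pool blocks into one vector."""
    a_norm = normalize(adj)
    h = block_feats
    for w in layer_weights:
        h = np.maximum(a_norm @ h @ w, 0.0)
    return h.mean(axis=0)

def recall_at_1(queries, pool):
    """Fraction of queries whose most cosine-similar pool row has the same index."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    p = pool / np.linalg.norm(pool, axis=1, keepdims=True)
    best = (q @ p.T).argmax(axis=1)
    return float((best == np.arange(len(queries))).mean())

# Toy function with three basic blocks: 0 -> 1 -> 2.
jump = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=float)      # directed jumps
data_dep = np.array([[0, 0, 1], [0, 0, 0], [0, 0, 0]], dtype=float)  # data dependency
adjacent = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=float)  # address adjacency

rng = np.random.default_rng(0)
relation = combine_relations(jump, data_dep, adjacent)
embedding = gcn_embed(
    rng.random((3, 4)),                        # stand-in block embeddings
    relation,
    [rng.random((4, 8)), rng.random((8, 2))],  # two untrained GCN layers
)
print(embedding.shape)
```

In this sketch, `recall_at_1` assumes that query `i` and pool entry `i` are the same function compiled for different architectures, so a pool of 10³ entries would correspond to a `pool` array with 1000 rows.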

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbfd/10535887/be3501ecff70/sensors-23-07789-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbfd/10535887/681efe5bd5f5/sensors-23-07789-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbfd/10535887/18cbd2f45c69/sensors-23-07789-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbfd/10535887/7686e267af63/sensors-23-07789-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbfd/10535887/5c4bf11f2351/sensors-23-07789-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbfd/10535887/a2e8fcd778cc/sensors-23-07789-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbfd/10535887/0f34b3799411/sensors-23-07789-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbfd/10535887/4247976226cb/sensors-23-07789-g008a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbfd/10535887/c383d73eea2c/sensors-23-07789-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbfd/10535887/9d19cf0acd85/sensors-23-07789-g010a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbfd/10535887/75bfeaf52471/sensors-23-07789-g011a.jpg

Similar articles

1
IoTSim: Internet of Things-Oriented Binary Code Similarity Detection with Multiple Block Relations.
Sensors (Basel). 2023 Sep 11;23(18):7789. doi: 10.3390/s23187789.
2
Semantic aware-based instruction embedding for binary code similarity detection.
PLoS One. 2024 Jun 11;19(6):e0305299. doi: 10.1371/journal.pone.0305299. eCollection 2024.
3
Cross-platform binary code similarity detection based on NMT and graph embedding.
Math Biosci Eng. 2021 May 25;18(4):4528-4551. doi: 10.3934/mbe.2021230.
4
IoT malware detection architecture using a novel channel boosted and squeezed CNN.
Sci Rep. 2022 Sep 15;12(1):15498. doi: 10.1038/s41598-022-18936-9.
5
Multi-semantic feature fusion attention network for binary code similarity detection.
Sci Rep. 2023 Mar 12;13(1):4096. doi: 10.1038/s41598-023-31280-w.
6
A Novel Detection and Multi-Classification Approach for IoT-Malware Using Random Forest Voting of Fine-Tuning Convolutional Neural Networks.
Sensors (Basel). 2022 Jun 6;22(11):4302. doi: 10.3390/s22114302.
7
MAMF-GCN: Multi-scale adaptive multi-channel fusion deep graph convolutional network for predicting mental disorder.
Comput Biol Med. 2022 Sep;148:105823. doi: 10.1016/j.compbiomed.2022.105823. Epub 2022 Jul 6.
8
iDetect for vulnerability detection in internet of things operating systems using machine learning.
Sci Rep. 2022 Oct 12;12(1):17086. doi: 10.1038/s41598-022-21325-x.
9
MDABP: A Novel Approach to Detect Cross-Architecture IoT Malware Based on PaaS.
Sensors (Basel). 2023 Mar 13;23(6):3060. doi: 10.3390/s23063060.
10
Artificial intelligence-driven malware detection framework for internet of things environment.
PeerJ Comput Sci. 2023 May 29;9:e1366. doi: 10.7717/peerj-cs.1366. eCollection 2023.

Cited by

1
MSSA: multi-stage semantic-aware neural network for binary code similarity detection.
PeerJ Comput Sci. 2025 Jan 17;11:e2504. doi: 10.7717/peerj-cs.2504. eCollection 2025.