
Phishing URL detection with neural networks: an empirical study.

Authors

Ghalechyan Hayk, Israyelyan Elina, Arakelyan Avag, Hovhannisyan Gerasim, Davtyan Arman

Affiliation

EasyDMARC, Data Science, 0014, Yerevan, Armenia.

Publication

Sci Rep. 2024 Oct 24;14(1):25134. doi: 10.1038/s41598-024-74725-6.

DOI:10.1038/s41598-024-74725-6
PMID:39448673
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11502860/
Abstract

Cybercriminals create phishing websites that mimic legitimate websites to obtain sensitive information from companies, individuals, or governments. Therefore, using state-of-the-art artificial intelligence and machine learning technologies to correctly classify phishing and legitimate URLs is imperative. We report the results of applying deterministic and probabilistic neural network models to URL classification. Key achievements of this work are: (1) the development of a unique approach based on probabilistic neural networks that improves classification accuracy; (2) the first demonstration in URL phishing research that a machine learning model trained on a combination of open-source and private datasets is successful in production. The dataset is constructed from open sources like Alexa, PhishTank, or OpenPhish and, most importantly, real-world production data from EasyDMARC. Daily validation of the model, using daily reported URL data and corresponding labels from both open-source platforms and private production, reaches on average 97% accuracy on the validation dataset. That dataset is labeled by PhishTank, OpenPhish, and EasyDMARC; mislabeled data cannot be excluded, and the labels could not be checked because of the large number of URLs. Feature engineering was done without third-party dependencies. Lastly, the evaluation of both deterministic and probabilistic models shows high accuracy on short and long URLs, where short URLs are defined as having fewer than 50 characters.
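To make the abstract's evaluation setup concrete, here is a minimal Python sketch of two ideas it mentions: extracting URL features using only the standard library, and reporting accuracy separately for short and long URLs, with "short" meaning fewer than 50 characters as stated above. The specific features and the function names (`lexical_features`, `accuracy_by_length`) are illustrative assumptions; the abstract does not list the features or the code the authors used.

```python
from urllib.parse import urlsplit

# From the abstract: short URLs have fewer than 50 characters.
SHORT_URL_MAX_LEN = 50


def is_short_url(url: str) -> bool:
    """Return True when the URL falls into the 'short' bucket."""
    return len(url) < SHORT_URL_MAX_LEN


def lexical_features(url: str) -> dict:
    """Hypothetical lexical features computed with the standard library only.

    These are assumptions for illustration, not the paper's feature set.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "length": len(url),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_dots_in_host": host.count("."),
        "num_hyphens": url.count("-"),
        "has_at_sign": "@" in url,
        "path_depth": len([seg for seg in parts.path.split("/") if seg]),
        "uses_https": parts.scheme == "https",
    }


def accuracy_by_length(urls, y_true, y_pred) -> dict:
    """Accuracy for short and long URLs; None when a bucket is empty."""
    hits = {"short": 0, "long": 0}
    totals = {"short": 0, "long": 0}
    for url, truth, pred in zip(urls, y_true, y_pred):
        bucket = "short" if is_short_url(url) else "long"
        totals[bucket] += 1
        hits[bucket] += int(truth == pred)
    return {
        bucket: (hits[bucket] / totals[bucket] if totals[bucket] else None)
        for bucket in totals
    }
```

Feeding `accuracy_by_length` the daily URLs, their reported labels, and a model's predictions would give one accuracy figure per length bucket, which is the kind of comparison the abstract describes for the deterministic and probabilistic models.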
