A synthetic data set to benchmark anti-money laundering methods.

Affiliations

Department of Electrical and Computer Engineering, Aarhus University, Aarhus, 8200, Denmark.

Spar Nord Bank, 9100, Aalborg, Denmark.

Publication information

Sci Data. 2023 Sep 28;10(1):661. doi: 10.1038/s41597-023-02569-2.

DOI: 10.1038/s41597-023-02569-2
PMID: 37770445
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10539331/
Abstract

Bank transactions are highly confidential. As a result, there are no real public data sets that can be used to investigate and compare anti-money laundering (AML) methods in banks. This severely limits research on important AML problems such as efficiency, effectiveness, class imbalance, concept drift, and interpretability. To address the issue, we present SynthAML: a synthetic data set to benchmark statistical and machine learning methods for AML. The data set builds on real data from Spar Nord, a systemically important Danish bank, and contains 20,000 AML alerts and over 16 million transactions. Experimental results indicate that performance on SynthAML can be transferred to the real world. As use cases, we present and discuss open problems in the AML literature.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf8/10539331/6151b477d453/41597_2023_2569_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf8/10539331/331471e6c777/41597_2023_2569_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf8/10539331/2598b0b0117e/41597_2023_2569_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf8/10539331/5d658b72cf80/41597_2023_2569_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf8/10539331/6763f745ac14/41597_2023_2569_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf8/10539331/628636131f7e/41597_2023_2569_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf8/10539331/03f689d91a41/41597_2023_2569_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf8/10539331/05e45e1f9a6d/41597_2023_2569_Fig8_HTML.jpg

Similar articles

1
A synthetic data set to benchmark anti-money laundering methods.
Sci Data. 2023 Sep 28;10(1):661. doi: 10.1038/s41597-023-02569-2.
2
The regulatory technology "RegTech" and money laundering prevention in Islamic and conventional banking industry.
Heliyon. 2020 Oct 8;6(10):e04949. doi: 10.1016/j.heliyon.2020.e04949. eCollection 2020 Oct.
3
Self-Organising Map Based Framework for Investigating Accounts Suspected of Money Laundering.
Front Artif Intell. 2021 Dec 14;4:761925. doi: 10.3389/frai.2021.761925. eCollection 2021.
4
Implementation of organization and end-user computing-anti-money laundering monitoring and analysis system security control.
PLoS One. 2021 Dec 9;16(12):e0258627. doi: 10.1371/journal.pone.0258627. eCollection 2021.
5
Combining Benford's Law and machine learning to detect money laundering. An actual Spanish court case.
Forensic Sci Int. 2018 Jan;282:24-34. doi: 10.1016/j.forsciint.2017.11.008. Epub 2017 Nov 11.
6
Leveraging machine learning in the global fight against money laundering and terrorism financing: An affordances perspective.
J Bus Res. 2021 Jul;131:441-452. doi: 10.1016/j.jbusres.2020.10.012. Epub 2020 Oct 17.
7
Estimating money laundering flows with a gravity model-based simulation.
Sci Rep. 2020 Oct 29;10(1):18552. doi: 10.1038/s41598-020-75653-x.
8
Is money laundering a true problem in China?
Int J Offender Ther Comp Criminol. 2006 Feb;50(1):101-16. doi: 10.1177/0306624X05277663.
9
Identifying highly-valued bank customers with current accounts based on the frequency and amount of transactions.
Heliyon. 2024 Jun 28;10(13):e33490. doi: 10.1016/j.heliyon.2024.e33490. eCollection 2024 Jul 15.
10
Evaluating the Control of Money Laundering and Its Underlying Offences: the Search for Meaningful Data.
Asian J Criminol. 2020;15(4):301-320. doi: 10.1007/s11417-020-09319-y. Epub 2020 May 20.

Cited by

1
SynthSoM: A synthetic intelligent multi-modal sensing-communication dataset for Synesthesia of Machines (SoM).
Sci Data. 2025 May 20;12(1):819. doi: 10.1038/s41597-025-05065-x.
