Oronowicz-Jaśkowiak Wojciech, Wasilewski Piotr
Faculty of Information Technology, Polish-Japanese Academy of Information Technology, Warsaw, Poland.
Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Poland.
Postep Psychiatr Neurol. 2022 Dec;31(4):161-166. doi: 10.5114/ppn.2022.124356. Epub 2023 Jan 20.
Neural networks might be an appropriate solution for the categorization of child sexual abuse materials (CSAM) in forensic cases. The aim of this study was to present a neural network model that may be able to categorize the objects and behaviors visible in CSAM. The model was developed using pictures that are visually similar to CSAM (AB/DL material), which appeals to persons with a paraphilic preference for watching adult women or men who are dressed like children or engaged in activities typical of children, such as playing.
The dataset consisted of 2251 photos divided into five classes. Of these, 1914 photos were randomly assigned to the training of the neural network, while the remaining 337 photos were used for its later validation. The network was trained with the Fast.ai and PyTorch libraries using the ResNet152 model. Two of the five classes were imported from the sexACT dataset, and three were collected for this study.
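The split described above can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not say how the random assignment was implemented, so the function name `split_indices`, the fixed seed, and the use of plain integer indices in place of photo files are all assumptions made for illustration. Only the counts (2251 photos, 337 of them held out) come from the text.

```python
import random

TOTAL_PHOTOS = 2251  # dataset size reported in the abstract
VALID_COUNT = 337    # photos held out for validation, as reported


def split_indices(n_items, n_valid, seed=0):
    """Shuffle item indices and hold out n_valid of them (hypothetical helper)."""
    rng = random.Random(seed)  # the seed is an assumption, used only for reproducibility
    indices = list(range(n_items))
    rng.shuffle(indices)
    # The first n_valid shuffled indices form the validation set, the rest the training set.
    return indices[n_valid:], indices[:n_valid]


train_idx, valid_idx = split_indices(TOTAL_PHOTOS, VALID_COUNT)
valid_fraction = len(valid_idx) / TOTAL_PHOTOS

print(len(train_idx), len(valid_idx), round(valid_fraction, 3))
```

The held-out share works out to roughly 15% of the dataset, and the two index sets do not overlap, so no photo is used for both training and validation.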
The model classified the selected classes with relatively high accuracy (95%); however, further improvement of the network is needed, as the final validation loss was moderate (0.17).
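Accuracy and validation loss measure different things, which is why a high accuracy can coexist with a loss that still leaves room for improvement. The sketch below assumes the loss is mean cross-entropy, a common default for classification; the abstract does not name the loss function. The helper names and the toy probabilities are hypothetical and are not taken from the study.

```python
import math


def accuracy(probs, labels):
    """Share of samples whose highest-probability class matches the true label."""
    correct = 0
    for p, y in zip(probs, labels):
        predicted = max(range(len(p)), key=p.__getitem__)
        correct += int(predicted == y)
    return correct / len(labels)


def mean_cross_entropy(probs, labels):
    """Average negative log-probability assigned to the true class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


labels = [0, 1]
confident = [[0.9, 0.1], [0.1, 0.9]]  # correct and confident predictions
hesitant = [[0.6, 0.4], [0.4, 0.6]]   # correct but less confident predictions

acc_confident = accuracy(confident, labels)
acc_hesitant = accuracy(hesitant, labels)
loss_confident = mean_cross_entropy(confident, labels)
loss_hesitant = mean_cross_entropy(hesitant, labels)

print(acc_confident, acc_hesitant)
print(loss_confident, loss_hesitant)
```

Both toy prediction sets reach the same accuracy, but the less confident one has the higher loss. Under the cross-entropy assumption, a loss above zero alongside a high accuracy indicates predictions that are mostly correct but not uniformly confident, or a few confidently wrong predictions.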
The presented model might be effective in classifying several objects and behaviors depicted in a number of pornography categories ("paraphilic infantilism", "sexual activity", "nude women", "dressed women", "sexual activity - spanking"). As the results are promising, further research on real CSAM is justified.