Okunev Michael, Handelman Doron, Handelman Amir
Faculty of Electrical Engineering, Holon Institute of Technology, Holon, Israel.
Sci Rep. 2025 Aug 11;15(1):29305. doi: 10.1038/s41598-025-15268-2.
Medicine is one of the most sensitive fields in which artificial intelligence (AI) is extensively used, with applications ranging from medical image analysis to clinical support. In medicine, where every decision may severely affect human lives, ensuring that AI systems operate ethically and produce results that align with ethical considerations is of great importance. In this work, we investigate how the combination of several key parameters affects the performance of artificial neural networks (ANNs) used for medical image analysis in the presence of data corruption or errors. For this purpose, we examined five ANN architectures (AlexNet, LeNet-5, VGG16, ResNet-50, and Vision Transformer, ViT), and for each architecture we measured performance under varying combinations of training dataset size and percentage of images corrupted through mislabeling. The image mislabeling simulates deliberate or nondeliberate changes to the dataset, which may cause the AI system to produce unreliable results. We found that the five ANN architectures produce different results for the same task, both with and without dataset modification, which implies that the choice of ANN architecture may have ethical aspects that need to be considered. We also found that label corruption produced mixed trends across performance metrics, indicating that it is difficult to conclude whether label corruption has occurred. Our findings demonstrate the relation between ethics in AI, the choice of ANN architecture, and the AI computational parameters used with it, and raise awareness of the need to find appropriate ways to determine whether label corruption has occurred.
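The label-corruption setup described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how mislabeled images were selected or which wrong label was assigned, so the choices here are assumptions: images are picked uniformly at random without replacement, and each picked label is replaced by a different class chosen uniformly. The names `corrupt_labels` and `experiment_grid` are hypothetical and are not taken from the paper.

```python
import itertools
import numpy as np


def corrupt_labels(labels, fraction, num_classes, seed=0):
    """Return a copy of `labels` with a given fraction flipped to a wrong class.

    Assumption: corrupted samples are chosen uniformly at random, and each
    one receives a different class drawn uniformly from the remaining classes.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    if num_classes < 2:
        raise ValueError("need at least two classes to mislabel")

    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    corrupted = labels.copy()

    n_corrupt = int(round(fraction * labels.size))
    idx = rng.choice(labels.size, size=n_corrupt, replace=False)

    # Adding an offset in [1, num_classes - 1] modulo num_classes
    # guarantees that the new label differs from the original one.
    offsets = rng.integers(1, num_classes, size=n_corrupt)
    corrupted[idx] = (labels[idx] + offsets) % num_classes
    return corrupted, idx


def experiment_grid(train_sizes, corruption_fractions):
    """Enumerate every (training size, corruption fraction) combination."""
    return list(itertools.product(train_sizes, corruption_fractions))
```

Each pair returned by `experiment_grid` would correspond to one training run per architecture; the sizes and fractions passed in are placeholders, since the abstract does not list the values that were used.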