Inconsistency between Human Observation and Deep Learning Models: Assessing Validity of Postmortem Computed Tomography Diagnosis of Drowning.

Affiliations

Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan.

National Institute of Technology, Sendai College, Sendai, Japan.

Publication information

J Imaging Inform Med. 2024 Jun;37(3):1-10. doi: 10.1007/s10278-024-00974-6. Epub 2024 Feb 9.

DOI:10.1007/s10278-024-00974-6
PMID:38336949
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11169324/
Abstract

Drowning diagnosis is a complicated process in autopsy, even with the assistance of autopsy imaging and on-site information from where the body was found. Previous studies have developed well-performing deep learning (DL) models for drowning diagnosis. However, the validity of these DL models was not assessed, raising doubts about whether the learned features accurately represent the medical findings observed by human experts. In this paper, we assessed the medical validity of DL models that had achieved high classification performance for drowning diagnosis. This retrospective study included autopsy cases aged 8-91 years who underwent postmortem computed tomography between 2012 and 2021 (153 drowning and 160 non-drowning cases). We first trained three DL models from a previous work and generated saliency maps that highlight important features in the input. To assess the validity of the models, pixel-level annotations were created by four radiological technologists and quantitatively compared with the saliency maps. All three models demonstrated high classification performance, with areas under the receiver operating characteristic curves of 0.94, 0.97, and 0.98, respectively. However, the assessment revealed unexpected inconsistency between the annotations and the models' saliency maps: around 30%, 40%, and 80% of the areas in the respective models' saliency maps were irrelevant, suggesting that the predictions of the DL models might be unreliable. These results call for careful assessment of DL tools, even those with high classification performance.
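The abstract does not state how the "irrelevant areas" of a saliency map were measured. As a rough illustration only, the sketch below assumes one plausible definition: min-max normalize the saliency map, threshold it, and report the share of salient pixels that fall outside the expert's pixel-level annotation. The function name `irrelevant_fraction`, the 0.5 threshold, and the toy arrays are all hypothetical and are not taken from the paper.

```python
import numpy as np


def irrelevant_fraction(saliency, annotation, threshold=0.5):
    """Share of salient pixels lying outside the annotated region.

    Assumed definition (not from the paper): a pixel is "salient" if its
    min-max normalized saliency value is >= threshold.
    """
    saliency = np.asarray(saliency, dtype=float)
    annotation = np.asarray(annotation, dtype=bool)
    if saliency.shape != annotation.shape:
        raise ValueError("saliency and annotation must have the same shape")

    lo, hi = saliency.min(), saliency.max()
    if hi == lo:
        # A flat map highlights nothing in particular; the ratio is undefined.
        return float("nan")

    salient = (saliency - lo) / (hi - lo) >= threshold
    n_salient = int(salient.sum())
    if n_salient == 0:
        return float("nan")

    # Salient pixels that the annotators did not mark as relevant.
    outside = int((salient & ~annotation).sum())
    return outside / n_salient


# Toy 4x4 example: four salient pixels, two of them inside the annotation.
toy_saliency = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])
toy_annotation = np.zeros((4, 4), dtype=bool)
toy_annotation[1, 1:3] = True

toy_result = irrelevant_fraction(toy_saliency, toy_annotation)
print(toy_result)  # 0.5
```

Under this assumed definition, a value near 0.8 would mean that most of what a model highlights lies outside the regions human readers consider diagnostic, even if the model's classification AUC is high.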


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf6/11169324/15542a40625f/10278_2024_974_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf6/11169324/fb72f987660a/10278_2024_974_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf6/11169324/e7dd9b9b0325/10278_2024_974_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf6/11169324/6900d281e623/10278_2024_974_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf6/11169324/6e35e4c8ff61/10278_2024_974_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf6/11169324/541d9fb1311b/10278_2024_974_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf6/11169324/5ef3ff8bc606/10278_2024_974_Fig7_HTML.jpg
