Prioritizing cases from a multi-institutional cohort for a dataset of pathologist annotations.

Author information

Garcia Victor, Gardecki Emma, Jou Stephanie, Li Xiaoxian, Shroyer Kenneth R, Saltz Joel, Acs Balazs, Elfer Katherine, Lennerz Jochen, Salgado Roberto, Gallas Brandon D

Affiliations

U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, MD, United States of America.

Department of Pathology and Laboratory Medicine, Emory University, Atlanta, GA, United States of America.

Publication information

J Pathol Inform. 2024 Nov 16;16:100411. doi: 10.1016/j.jpi.2024.100411. eCollection 2025 Jan.

Abstract

OBJECTIVE

With growing interest in developing artificial intelligence and machine learning (AI/ML) models, having different developers use the same external validation dataset allows model performance to be compared directly. Through our High Throughput Truthing project, we are creating a validation dataset for AI/ML models that assess stromal tumor-infiltrating lymphocytes (sTILs) in triple-negative breast cancer (TNBC).

MATERIALS AND METHODS

We obtained clinical metadata for hematoxylin and eosin-stained glass slides and corresponding scanned whole slide images (WSIs) of TNBC core biopsies from two US academic medical centers. We selected regions of interest (ROIs) from the WSIs to target regions with various tissue morphologies and sTILs densities. Given the selected ROIs, we implemented a hierarchical rank-sort method for case prioritization.
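The abstract does not spell out how the hierarchical rank-sort works, so the following is only a minimal sketch of one plausible reading: each case is ranked by how rare its value is for each metadata field, and the fields are compared in a fixed priority order so that earlier fields dominate and later fields break ties. The function name `prioritize`, the field names `stils_bin` and `race`, and the example cases are all hypothetical and are not taken from the study.

```python
from collections import Counter


def prioritize(cases, keys):
    """Sort cases so that those with rarer metadata values come first.

    keys are compared hierarchically: the first key dominates and
    later keys only break ties. (Assumed scheme, not the authors'.)
    """
    # How often each value occurs, per metadata field.
    counts = {k: Counter(c[k] for c in cases) for k in keys}

    def rank(case):
        # Lower count = rarer value = higher priority.
        return tuple(counts[k][case[k]] for k in keys)

    # sorted() is stable, so fully tied cases keep their input order.
    return sorted(cases, key=rank)


# Hypothetical cases for illustration only.
cases = [
    {"id": "A", "stils_bin": "low", "race": "White"},
    {"id": "B", "stils_bin": "low", "race": "Black"},
    {"id": "C", "stils_bin": "high", "race": "White"},
    {"id": "D", "stils_bin": "low", "race": "White"},
]
ordered = prioritize(cases, ["stils_bin", "race"])
print([c["id"] for c in ordered])  # ['C', 'B', 'A', 'D']
```

In this toy example the single high-density case is ranked first, and among the low-density cases the one with the less prevalent race value comes next.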

RESULTS

We received 122 glass slides and clinical metadata on 105 unique patients with TNBC. All patients were female, and the mean age was 63.44 years. Of all cases, 60% were White patients and 38.1% were Black or African American. After case prioritization, the skewness of the sTILs density distribution decreased from 0.60 to 0.46 (closer to symmetric), and the entropy of the sTILs density bins increased from 1.20 to 1.24. We retained cases with less prevalent metadata elements.
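As an illustration of the two summary statistics reported above, the sketch below computes the skewness of a set of density values and the Shannon entropy of their binned counts. The abstract does not state the skewness estimator, the logarithm base, or the bin edges, so the population (Fisher-Pearson) skewness, the natural logarithm, and the example edges and densities used here are assumptions.

```python
import math
from collections import Counter


def skewness(values):
    """Population (Fisher-Pearson) skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def bin_entropy(values, edges):
    """Shannon entropy (natural log) of the proportion of values per bin."""

    def bin_index(v):
        # Number of edges at or below v gives the bin number.
        return sum(v >= e for e in edges)

    counts = Counter(bin_index(v) for v in values)
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


# Hypothetical sTILs densities (%) and bin edges, for illustration only.
densities = [5, 5, 10, 10, 15, 20, 40, 80]
edges = [10, 40]

print(skewness(densities))
print(bin_entropy(densities, edges))
```

A skewness closer to zero indicates a more symmetric distribution, and a higher entropy indicates that cases are spread more evenly across the bins, which is the direction of change the results describe.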

CONCLUSION

This method allows us to prioritize underrepresented subgroups based on important clinical factors. In this manuscript, we discuss how we sourced the clinical metadata, selected ROIs, and developed our approach to prioritizing cases for inclusion in our pivotal study.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a85/11667696/85b2c8244c3b/gr1.jpg