Huang Zhi, Bianchi Federico, Yuksekgonul Mert, Montine Thomas J, Zou James
Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA.
Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA.
Nat Med. 2023 Sep;29(9):2307-2316. doi: 10.1038/s41591-023-02504-3. Epub 2023 Aug 17.
The lack of annotated publicly available medical images is a major barrier for computational research and education innovations. At the same time, many de-identified images and much knowledge are shared by clinicians on public forums such as medical Twitter. Here we harness these crowd platforms to curate OpenPath, a large dataset of 208,414 pathology images paired with natural language descriptions. We demonstrate the value of this resource by developing pathology language-image pretraining (PLIP), a multimodal artificial intelligence with both image and text understanding, which is trained on OpenPath. PLIP achieves state-of-the-art performance for classifying new pathology images across four external datasets: for zero-shot classification, PLIP achieves F1 scores of 0.565-0.832, compared to F1 scores of 0.030-0.481 for the previous contrastive language-image pretrained model. Training a simple supervised classifier on top of PLIP embeddings also achieves a 2.5% improvement in F1 scores compared to using other supervised model embeddings. Moreover, PLIP enables users to retrieve similar cases by either image or natural language search, greatly facilitating knowledge sharing. Our approach demonstrates that publicly shared medical information is a tremendous resource that can be harnessed to develop medical artificial intelligence for enhancing diagnosis, knowledge sharing and education.
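As a hedged illustration (not the authors' released code), zero-shot classification with a contrastive language-image model of this kind works by embedding the image and one text prompt per class into a shared space, then choosing the class whose prompt embedding is most similar to the image embedding. The sketch below uses random toy vectors as stand-ins for the trained encoders; the prompt wordings and the 512-dimensional embedding size are assumptions for illustration only:

```python
import numpy as np

def normalize(v):
    """Scale vectors to unit length so dot product equals cosine similarity."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Stand-in embeddings: in a PLIP/CLIP-style model these would come from
# trained text and image encoders, not random vectors.
rng = np.random.default_rng(0)
class_prompts = [
    "an H&E image of benign tissue",
    "an H&E image of malignant tissue",
]
text_emb = normalize(rng.normal(size=(len(class_prompts), 512)))

# Simulate an image whose embedding lies close to the second prompt.
image_emb = normalize(text_emb[1] + 0.01 * rng.normal(size=512))

# Zero-shot prediction: pick the class prompt with the highest cosine similarity.
similarities = text_emb @ image_emb
pred = class_prompts[int(np.argmax(similarities))]
print(pred)  # the prompt nearest the image embedding, here the second one
```

The same shared embedding space supports the retrieval use case described above: ranking stored case embeddings by similarity to either an image query or a free-text query.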