Polar labeling: silver standard algorithm for training disease classifiers.

Affiliations

Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA 02114, USA.

Partners Healthcare, Somerville, MA 02145, USA.

Publication information

Bioinformatics. 2020 May 1;36(10):3200-3206. doi: 10.1093/bioinformatics/btaa088.

Abstract

MOTIVATION

Expert-labeled data are essential to train phenotyping algorithms for cohort identification. However, expert labeling is time- and labor-intensive, and the costs remain prohibitive for scaling phenotyping to wider use cases.

RESULTS

We present an approach, referred to as polar labeling (PL), for creating a silver standard to train machine learning (ML) models for disease classification. We test the hypothesis that ML models trained on the silver standard, created by applying PL to unlabeled patient records, are comparable in performance to ML models trained on the gold standard, created by clinical experts through manual review of patient records. We perform experimental validation using the health records of 38 023 patients spanning six diseases. Our results demonstrate the superior performance of the proposed approach.

AVAILABILITY AND IMPLEMENTATION

We provide a Python implementation of the algorithm and the Python code developed for this study on GitHub.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f223/7214041/2192c7c4867c/btaa088f2.jpg
