Polar labeling: silver standard algorithm for training disease classifiers.

Affiliations

Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA 02114, USA.

Partners Healthcare, Somerville, MA 02145, USA.

Publication information

Bioinformatics. 2020 May 1;36(10):3200-3206. doi: 10.1093/bioinformatics/btaa088.

Abstract

MOTIVATION

Expert-labeled data are essential for training phenotyping algorithms for cohort identification. However, expert labeling is time- and labor-intensive, and its cost remains prohibitive for scaling phenotyping to wider use cases.

RESULTS

We present an approach, referred to as polar labeling (PL), for creating a silver standard to train machine learning (ML) models for disease classification. We test the hypothesis that ML models trained on the silver standard, created by applying PL to unlabeled patient records, are comparable in performance to ML models trained on the gold standard, created by clinical experts through manual review of patient records. We perform experimental validation using health records of 38 023 patients spanning six diseases. Our results demonstrate the superior performance of the proposed approach.

AVAILABILITY AND IMPLEMENTATION

We provide a Python implementation of the algorithm, along with the Python code developed for this study, on GitHub.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
