School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, Hubei 430073, China.
Graduate School of Public Health, St Luke's International University, OMURA Susumu & Mieko Memorial St Luke's Center for Clinical Academia, Chuo-ku, Tokyo 104-0045, Japan.
J Am Med Inform Assoc. 2024 Nov 1;31(11):2632-2640. doi: 10.1093/jamia/ocae197.
Studies of active learning (AL) have rarely integrated diversity-based and uncertainty-based strategies into a dynamic sampling framework for clinical named entity recognition (NER). Machine-assisted annotation is becoming popular for creating gold-standard labels. This study investigated the effectiveness of dynamic AL strategies under simulated machine-assisted annotation scenarios for clinical NER.
We proposed 3 new AL strategies: a diversity-based strategy (CLUSTER) built on Sentence-BERT embeddings and 2 dynamic strategies (CLC and CNBSE) capable of switching from a diversity-based to an uncertainty-based strategy. Using BioClinicalBERT as the foundational NER model, we conducted simulation experiments independently on 3 medication-related clinical NER datasets: i2b2 2009, n2c2 2018 (Track 2), and MADE 1.0. We compared the proposed strategies with uncertainty-based (LC and NBSE) and passive-learning (RANDOM) strategies. Performance was primarily measured by the number of edits annotators had to make to reach a desired target effectiveness, evaluated on independent test sets.
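To make the sampling logic concrete, the sketch below illustrates one plausible reading of these strategies, assuming details the abstract does not specify: that CLUSTER performs k-means clustering over Sentence-BERT embeddings and picks the sentence nearest each centroid, that the uncertainty score is token-averaged least confidence (LC-style), and that the dynamic strategies switch from diversity-based to uncertainty-based sampling once some unspecified criterion is met. Function names and the `switch` flag are illustrative placeholders, not the authors' implementation.

```python
# Minimal sketch of pool-based AL selection for clinical NER (assumptions noted above).
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

def diversity_select(pool_sentences, batch_size, model_name="all-MiniLM-L6-v2"):
    """Diversity-based selection: embed the unlabeled pool with Sentence-BERT and
    pick the sentence closest to each of `batch_size` k-means centroids
    (one hypothetical reading of the CLUSTER strategy)."""
    encoder = SentenceTransformer(model_name)
    emb = encoder.encode(pool_sentences, convert_to_numpy=True)
    km = KMeans(n_clusters=batch_size, n_init=10, random_state=0).fit(emb)
    picked = []
    for c in range(batch_size):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(emb[members] - km.cluster_centers_[c], axis=1)
        picked.append(int(members[np.argmin(dists)]))
    return picked

def uncertainty_select(token_probs_per_sentence, batch_size):
    """Uncertainty-based selection (least-confidence style): score each sentence by
    1 - mean(max class probability) over its tokens and take the top-scoring ones.
    `token_probs_per_sentence` is a list of (num_tokens, num_labels) arrays from the
    current NER model (e.g., BioClinicalBERT with a softmax tagging head)."""
    scores = [1.0 - float(np.mean(np.max(p, axis=-1))) for p in token_probs_per_sentence]
    return list(np.argsort(scores)[::-1][:batch_size])

def dynamic_select(pool_sentences, token_probs_per_sentence, batch_size, switch):
    """Dynamic strategy (CLC/CNBSE-style): start with diversity-based sampling and
    move to uncertainty-based sampling once `switch` is True; the actual switching
    criterion is not given in the abstract."""
    if not switch:
        return diversity_select(pool_sentences, batch_size)
    return uncertainty_select(token_probs_per_sentence, batch_size)
```

In a machine-assisted annotation simulation, the selected sentences would be pre-labeled by the current model and the annotation cost counted as the edits needed to correct those labels before retraining.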
When aiming for 98% overall target effectiveness, CLUSTER required the fewest edits on average. When aiming for 99% overall target effectiveness, CNBSE required 20.4% fewer edits than NBSE; CLUSTER and RANDOM could not reach this higher target in the pool-based simulation experiment. For high-difficulty entities, CNBSE required 22.5% fewer edits than NBSE to achieve 99% target effectiveness, whereas neither CLUSTER nor RANDOM achieved even 93% target effectiveness.
When the target effectiveness was set high, the proposed dynamic strategy CNBSE exhibited both strong learning capabilities and low annotation costs in machine-assisted annotation. CLUSTER required the fewest edits when the target effectiveness was set low.