Cost-sensitive Active Learning for Phenotyping of Electronic Health Records.

Authors

Ji Zongcheng, Wei Qiang, Franklin Amy, Cohen Trevor, Xu Hua

Affiliations

School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.

Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, USA.

Publication

AMIA Jt Summits Transl Sci Proc. 2019 May 6;2019:829-838. eCollection 2019.

PMID: 31259040
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6568101/
Abstract

Developing high-throughput and high-performance phenotyping algorithms is critical to the secondary use of electronic health records for clinical research. Supervised machine learning-based methods have shown good performance, but often require large annotated datasets that are costly to build. Simulation studies have shown that active learning (AL) could reduce the number of annotated samples while improving model performance, under the assumption that labeling each sample takes the same time (i.e., cost-insensitive). In this study, we proposed a cost-sensitive AL (CostAL) algorithm for clinical phenotyping, using the identification of breast cancer patients as a use case. CostAL implements a linear regression model to estimate the actual time required for annotating each individual sample. We recruited two annotators to manually review the medical records of 766 potential breast cancer patients and recorded the actual time spent annotating each sample. We then compared CostAL, AL, and passive learning (PL, aka random sampling) using this annotated dataset and generated learning curves for each method. Our experimental results showed that CostAL achieved the highest area under the curve (AUC) score among the three algorithms (PL, AL, and CostAL scored 0.784, 0.8501, and 0.8673 for user 1 and 0.8006, 0.8806, and 0.9006 for user 2). To achieve an accuracy of 0.94, AL and CostAL could save 36% and 60% of annotation time for user 1 and 53% and 70% for user 2, compared with PL, indicating the value of cost-sensitive AL approaches.
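The idea of cost-sensitive selection can be illustrated with a minimal sketch. The abstract states only that CostAL uses a linear regression model to estimate annotation time; it does not give the features or the selection rule. The sketch below therefore assumes, purely for illustration, a single feature (record length), binary entropy as the uncertainty measure, and a ranking by uncertainty divided by predicted cost. All function names and numbers are hypothetical and are not taken from the paper.

```python
import math

def fit_cost_model(lengths, seconds):
    """Least-squares fit of seconds ~ intercept + slope * length (one feature, assumed)."""
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(seconds) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, seconds))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict_cost(model, length, floor=1.0):
    """Predicted annotation time in seconds, clipped to a small positive floor."""
    intercept, slope = model
    return max(floor, intercept + slope * length)

def entropy(p):
    """Binary entropy of the classifier's predicted positive probability."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def select_next(pool, model):
    """Pick the sample with the highest uncertainty per predicted second (assumed rule).

    pool is a list of (sample_id, p_positive, record_length) tuples.
    """
    return max(pool, key=lambda s: entropy(s[1]) / predict_cost(model, s[2]))[0]

# Hypothetical timing data: record length vs. seconds spent annotating.
cost_model = fit_cost_model([100, 200, 300, 400], [30, 50, 70, 90])

# Hypothetical unlabeled pool: (id, predicted P(positive), record length).
pool = [("a", 0.50, 2000), ("b", 0.60, 200), ("c", 0.95, 100)]
chosen = select_next(pool, cost_model)
print(chosen)
```

With these made-up numbers, plain uncertainty sampling would pick sample "a" (probability 0.5, the most uncertain), while the cost-weighted rule prefers "b", which is nearly as uncertain but predicted to be much cheaper to annotate.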
