The problem of bias in training data in regression problems in medical decision support.

Authors

Mac Namee B, Cunningham P, Byrne S, Corrigan O I

Affiliation

Department of Computer Science, Trinity College, Dublin 2, Ireland.

Publication

Artif Intell Med. 2002 Jan;24(1):51-70. doi: 10.1016/s0933-3657(01)00092-6.

DOI: 10.1016/s0933-3657(01)00092-6
PMID: 11779685
Abstract

This paper describes a bias problem encountered in a machine learning approach to outcome prediction in anticoagulant drug therapy. The outcome to be predicted is a measure of the clotting time for the patient; this measure is continuous and so the prediction task is a regression problem. Artificial neural networks (ANNs) are a powerful mechanism for learning to predict such outcomes from training data. However, experiments have shown that an ANN is biased towards values more commonly occurring in the training data and is thus less likely to be correct in predicting extreme values. This issue of bias in training data in regression problems is similar to the associated problem with minority classes in classification. However, this bias issue in classification is well documented and is an ongoing area of research. In this paper, we consider stratified sampling and boosting as solutions to this bias problem and evaluate them on this outcome prediction problem and on two other datasets. Both approaches produce some improvements, with boosting showing the most promise.
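
The idea of stratified sampling for a continuous target can be illustrated with a small sketch. This is not the authors' procedure, which the abstract does not specify; it assumes one simple variant in which targets are grouped into equal-width bins and sparse bins are oversampled with replacement until every non-empty bin matches the largest one. The function name `stratified_oversample`, the bin count, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_oversample(samples, n_bins=5, seed=0):
    """Balance a regression training set across equal-width target bins.

    samples: list of (features, target) pairs, where target is a float.
    Returns a new list in which every non-empty bin holds as many samples
    as the most populated bin (extra samples are drawn with replacement).
    """
    rng = random.Random(seed)
    targets = [t for _, t in samples]
    lo, hi = min(targets), max(targets)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all targets are equal

    bins = defaultdict(list)
    for x, t in samples:
        idx = min(int((t - lo) / width), n_bins - 1)  # clamp the maximum into the last bin
        bins[idx].append((x, t))

    largest = max(len(members) for members in bins.values())
    balanced = []
    for idx in sorted(bins):
        members = bins[idx]
        balanced.extend(members)  # keep every original sample
        extra = largest - len(members)
        balanced.extend(rng.choice(members) for _ in range(extra))
    return balanced


# Toy data (illustrative only): 90 samples with common target values
# between 2.0 and 2.9, and 10 samples with an extreme target of 5.0.
train = [((i,), 2.0 + (i % 10) * 0.1) for i in range(90)]
train += [((i,), 5.0) for i in range(90, 100)]

balanced = stratified_oversample(train, n_bins=3)
extreme_before = sum(1 for _, t in train if t == 5.0)
extreme_after = sum(1 for _, t in balanced if t == 5.0)
print(len(train), extreme_before, len(balanced), extreme_after)
```

After resampling, the extreme targets are no longer a small minority of the training set, so a learner fitted to `balanced` sees them as often as the common values. Oversampling with replacement repeats the same rare cases, which may encourage overfitting to them; that trade-off is one reason a reweighting scheme such as boosting might be preferred.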

Similar articles

1
The problem of bias in training data in regression problems in medical decision support.
Artif Intell Med. 2002 Jan;24(1):51-70. doi: 10.1016/s0933-3657(01)00092-6.
2
Biologically inspired intelligent decision making: a commentary on the use of artificial neural networks in bioinformatics.
Bioengineered. 2014 Mar-Apr;5(2):80-95. doi: 10.4161/bioe.26997. Epub 2013 Dec 16.
3
The feature selection bias problem in relation to high-dimensional gene data.
Artif Intell Med. 2016 Jan;66:63-71. doi: 10.1016/j.artmed.2015.11.001. Epub 2015 Nov 14.
4
Intelligent quotient estimation of mental retarded people from different psychometric instruments using artificial neural networks.
Artif Intell Med. 2012 Feb;54(2):135-45. doi: 10.1016/j.artmed.2011.11.002. Epub 2011 Dec 6.
5
Artificial neural networks in neurorehabilitation: A scoping review.
NeuroRehabilitation. 2020;46(3):259-269. doi: 10.3233/NRE-192996.
6
Patient classification and outcome prediction in IgA nephropathy.
Comput Biol Med. 2015 Nov 1;66:278-86. doi: 10.1016/j.compbiomed.2015.09.003. Epub 2015 Sep 25.
7
The effect of data sampling on the performance evaluation of artificial neural networks in medical diagnosis.
Med Decis Making. 1997 Apr-Jun;17(2):186-92. doi: 10.1177/0272989X9701700209.
8
Channel selection and classification of electroencephalogram signals: an artificial neural network and genetic algorithm-based approach.
Artif Intell Med. 2012 Jun;55(2):117-26. doi: 10.1016/j.artmed.2012.02.001. Epub 2012 Apr 12.
9
Maximizing sensitivity in medical diagnosis using biased minimax probability machine.
IEEE Trans Biomed Eng. 2006 May;53(5):821-31. doi: 10.1109/TBME.2006.872819.
10
Experiments with AdaBoost.RT, an improved boosting scheme for regression.
Neural Comput. 2006 Jul;18(7):1678-710. doi: 10.1162/neco.2006.18.7.1678.

Cited by

1
A systematic review and meta-analysis of artificial intelligence ECGs performance in the diagnosis of Brugada Syndrome.
J Interv Card Electrophysiol. 2025 Jun 4. doi: 10.1007/s10840-025-02075-y.
2
Does Cohort Selection Affect Machine Learning from Clinical Data?
AMIA Annu Symp Proc. 2025 May 22;2024:473-482. eCollection 2024.
3
Deep generative modeling of annotated bacterial biofilm images.
NPJ Biofilms Microbiomes. 2025 Jan 14;11(1):16. doi: 10.1038/s41522-025-00647-4.
4
Identifying and handling data bias within primary healthcare data using synthetic data generators.
Heliyon. 2024 Jan 10;10(2):e24164. doi: 10.1016/j.heliyon.2024.e24164. eCollection 2024 Jan 30.
5
Advanced Sampling Technique in Radiology Free-Text Data for Efficiently Building Text Mining Models by Deep Learning in Vertebral Fracture.
Diagnostics (Basel). 2024 Jan 8;14(2):137. doi: 10.3390/diagnostics14020137.
6
Recent Advances in Bioimage Analysis Methods for Detecting Skeletal Deformities in Biomedical and Aquaculture Fish Species.
Biomolecules. 2023 Dec 14;13(12):1797. doi: 10.3390/biom13121797.
7
Evaluating the performance of machine-learning regression models for pharmacokinetic drug-drug interactions.
CPT Pharmacometrics Syst Pharmacol. 2023 Jan;12(1):122-134. doi: 10.1002/psp4.12884. Epub 2022 Nov 17.
8
Inter-electrode correlations measured with EEG predict individual differences in cognitive ability.
Curr Biol. 2021 Nov 22;31(22):4998-5008.e6. doi: 10.1016/j.cub.2021.09.036. Epub 2021 Oct 11.
9
AMR-Diag: Neural network based genotype-to-phenotype prediction of resistance towards β-lactams in Escherichia coli and Klebsiella pneumoniae.
Comput Struct Biotechnol J. 2021 Mar 29;19:1896-1906. doi: 10.1016/j.csbj.2021.03.027. eCollection 2021.
10
The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets.
PLoS One. 2015 Mar 4;10(3):e0118432. doi: 10.1371/journal.pone.0118432. eCollection 2015.