Less Likely Brainstorming: Using Language Models to Generate Alternative Hypotheses.

Author information

Tang Liyan, Peng Yifan, Wang Yanshan, Ding Ying, Durrett Greg, Rousseau Justin F

Affiliations

The University of Texas at Austin.

Weill Cornell Medicine.

Publication information

Proc Conf Assoc Comput Linguist Meet. 2023 Jul;2023:12532-12555. doi: 10.18653/v1/2023.findings-acl.794.

Abstract

A human decision-maker benefits the most from an AI assistant that corrects for their biases. For problems such as generating interpretation of a radiology report given findings, a system predicting only highly likely outcomes may be less useful, where such outcomes are already obvious to the user. To alleviate biases in human decision-making, it is worth considering a broad differential diagnosis, going beyond the most likely options. We introduce a new task, "less likely brainstorming," that asks a model to generate outputs that humans think are relevant but less likely to happen. We explore the task in two settings: a brain MRI interpretation generation setting and an everyday commonsense reasoning setting. We found that a baseline approach of training with less likely hypotheses as targets generates outputs that humans evaluate as either likely or irrelevant nearly half of the time; standard MLE training is not effective. To tackle this problem, we propose a controlled text generation method that uses a novel contrastive learning strategy to encourage models to differentiate between generating likely and less likely outputs according to humans. We compare our method with several state-of-the-art controlled text generation models via automatic and human evaluations and show that our models' capability of generating less likely outputs is improved.
