Lojun Sharon L, Sauper Christina J, Medow Mitchell, Long William J, Mark Roger G, Barzilay Regina
Boston University Medical Center, Boston, MA.
AMIA Annu Symp Proc. 2010 Nov 13;2010:467-71.
This study investigates the feasibility of classifying resuscitation code status from structured data (age, gender, and medical condition) and unstructured medical notes, and measures the effect of each on classification accuracy. Data were extracted from the MIMIC II database. Natural language processing (NLP) was used to evaluate the social section of the nurses' progress notes. BoosTexter was used to predict code status from notes, age, gender, and Simplified Acute Physiology Score (SAPS). The relative impact of each feature was analyzed by feature ablation. Unstructured notes were the greatest single indicator of code status. Adding text to the medical condition features increased classification accuracy significantly (p<0.001). N-gram frequency was also analyzed. Gender differences were noted across all code statuses, with terms referring to women always more frequent (e.g., wife > husband). Logistic regression on structured features was used to test for gender bias or ageism. Evidence of bias was found: both females (OR=1.45) and patients over age 70 (OR=3.72) were more likely to be Do-Not-Resuscitate (DNR).
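To make the reported odds ratios easier to read, the following is a minimal sketch of how an odds ratio relates to a logistic regression coefficient and to a change in probability. It is not the authors' code. The function names are hypothetical, and the 10% baseline DNR probability is an illustrative assumption, not a figure from the study; only the two odds ratios (1.45 and 3.72) come from the abstract.

```python
import math


def odds_ratio_to_coefficient(odds_ratio: float) -> float:
    """A logistic regression coefficient is the natural log of the odds ratio."""
    return math.log(odds_ratio)


def apply_odds_ratio(baseline_probability: float, odds_ratio: float) -> float:
    """Scale the baseline odds by the odds ratio and convert back to a probability."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Odds ratios reported in the abstract.
REPORTED_ODDS_RATIOS = {"female": 1.45, "age_over_70": 3.72}

# ASSUMPTION: illustrative baseline only; the abstract does not report one.
HYPOTHETICAL_BASELINE_DNR_PROBABILITY = 0.10

if __name__ == "__main__":
    for name, odds_ratio in REPORTED_ODDS_RATIOS.items():
        coefficient = odds_ratio_to_coefficient(odds_ratio)
        probability = apply_odds_ratio(HYPOTHETICAL_BASELINE_DNR_PROBABILITY, odds_ratio)
        print(f"{name}: OR={odds_ratio}, coefficient={coefficient:.3f}, "
              f"probability at assumed baseline={probability:.3f}")
```

An odds ratio multiplies the odds, not the probability, so an OR of 3.72 does not mean a patient over 70 is 3.72 times as likely to be DNR. The size of the probability change depends on the baseline, which is why the baseline above is marked as an assumption.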