Sim Jin-Ah, Huang Xiaolei, Webster Rachel T, Srivastava Kumar, Ness Kirsten K, Hudson Melissa M, Baker Justin N, Huang I-Chan
Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, TN 38105, United States.
Department of AI Convergence, Hallym University, Chuncheon, Gangwon 24252, Republic of Korea.
JAMIA Open. 2025 Mar 26;8(2):ooaf018. doi: 10.1093/jamiaopen/ooaf018. eCollection 2025 Apr.
To determine if natural language processing (NLP) and machine learning (ML) techniques accurately identify interview-based psychological stress and meaning/purpose data in child/adolescent cancer survivors.
Interviews were conducted with 51 survivors (aged 8-17.9 years; ≥5 years post-therapy) from St Jude Children's Research Hospital. Two content experts coded 244 and 513 semantic units, focusing on attributes of psychological stress (anger, controllability/manageability, fear/anxiety) and attributes of meaning/purpose (goal, optimism, purpose). The attributes that the content experts extracted from the interviews were designated as the gold standard. Two NLP/ML methods, Word2Vec with Extreme Gradient Boosting (XGBoost) and Bidirectional Encoder Representations from Transformers Large (BERT), were validated using accuracy, area under the receiver operating characteristic curve (AUROCC), and area under the precision-recall curve (AUPRC).
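The abstract does not describe the evaluation code, so the following is only a minimal sketch of how the three validation metrics could be computed for a single attribute (for example, "anger") treated as a binary label. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration; they are not taken from the study. AUPRC is approximated here by average precision, which is one common estimator.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of units whose thresholded score matches the expert label.

    The 0.5 threshold is an illustrative assumption.
    """
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve via its rank interpretation.

    Equals the probability that a randomly chosen positive unit scores
    higher than a randomly chosen negative unit (ties count as 0.5).
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("Need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Average precision, used here as an estimate of AUPRC.

    Mean of the precision values at the rank of each positive unit.
    Assumes scores are not tied; ties would make the result order-dependent.
    """
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("Need at least one positive label")
    return total / hits


# Hypothetical data: expert-coded labels (gold standard) and model scores
# for six semantic units. These values are made up for the example.
gold = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

acc = accuracy(gold, scores)
roc = auroc(gold, scores)
prc = average_precision(gold, scores)
```

In a multi-attribute setting like the one described, such metrics would typically be computed once per attribute and per model, so that the two methods can be compared attribute by attribute.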
BERT demonstrated higher accuracy, AUROCC, and AUPRC than Word2Vec/XGBoost in identifying all attributes of psychological stress and meaning/purpose. BERT significantly outperformed Word2Vec/XGBoost in characterizing all attributes (P < .05) except for the purpose attribute of meaning/purpose.
These findings suggest that AI tools can help healthcare providers efficiently assess emotional well-being of childhood cancer survivors, supporting future clinical interventions.
NLP/ML effectively identifies interview-based data for child/adolescent cancer survivors.