Development of a phenotype ontology for autism spectrum disorder by natural language processing on electronic health records.

Affiliations

Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.

School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, 19104, USA.

Publication information

J Neurodev Disord. 2022 May 23;14(1):32. doi: 10.1186/s11689-022-09442-0.

DOI:10.1186/s11689-022-09442-0
PMID:35606697
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9128253/
Abstract

BACKGROUND

Autism spectrum disorder (ASD) is a complex neurodevelopmental condition characterized by restricted, repetitive behavior and impaired social communication and interaction. However, significant challenges remain in diagnosing and subtyping ASD, due in part to the lack of a validated, standardized vocabulary for characterizing its clinical phenotypic presentation. Although the Human Phenotype Ontology (HPO) plays an important role in delineating nuanced phenotypes for rare genetic diseases, it does not adequately capture the characteristics of behavioral and psychiatric phenotypes in individuals with ASD. There is therefore a clear need for a well-established phenotype terminology set that can assist in characterizing ASD phenotypes from patients' clinical narratives.

METHODS

To address this challenge, we used natural language processing (NLP) techniques to identify and curate ASD phenotypic terms from high-quality unstructured clinical notes in the electronic health records (EHR) of 8499 individuals with ASD, 8177 individuals with non-ASD psychiatric disorders, and 8482 individuals without a documented psychiatric disorder. We further performed a dimensionality-reduction clustering analysis, using the nonnegative matrix factorization method, to subgroup individuals with ASD.
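As an illustration of the subgrouping step, the following is a minimal sketch of nonnegative matrix factorization on an individual-by-term count matrix. It is not the authors' pipeline: the toy matrix, the term names, the choice of two components, the Lee-Seung multiplicative update rule, and the rule of assigning each individual to the component with the largest weight are all assumptions made for this example.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def nmf(v, k, iters=500, seed=0, eps=1e-9):
    """Approximate nonnegative v (n x m) as w (n x k) times h (k x m),
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so the updates never divide by zero.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h

def assign_clusters(w):
    """Label each row (individual) with the index of its largest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]

# Invented toy data: rows are individuals, columns are phenotype-term counts.
terms = ["hand flapping", "echolalia", "poor eye contact", "limited peer interaction"]
v = [
    [3, 2, 0, 0],
    [4, 1, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 3, 2],
    [0, 0, 1, 4],
    [0, 0, 2, 2],
]

w, h = nmf(v, k=2)
labels = assign_clusters(w)
print(labels)  # one subgroup index per individual
```

Each row of h can be read as a weighted set of terms that tends to co-occur, and each row of w says how strongly an individual expresses each of those sets. The number of components is a modeling choice that would need to be justified on real data.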

RESULTS

Through a note-processing pipeline comprising several state-of-the-art NLP steps, we identified 3336 ASD terms linked to 1943 unique medical concepts, which represents one of the largest ASD terminology sets to date. The extracted ASD terms were further organized into a formal ontology structure similar to that of the HPO. Clustering analysis showed that these terms could be used in a diagnostic pipeline to differentiate individuals with ASD from individuals with other psychiatric disorders.
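To make the idea of many terms linking to fewer concepts inside an HPO-like hierarchy concrete, here is a small hypothetical sketch. The identifiers (EX:0001 and so on), labels, surface terms, and is-a links are invented for illustration and are not taken from the published ontology.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    concept_id: str
    label: str
    terms: set = field(default_factory=set)    # surface terms seen in notes
    parents: set = field(default_factory=set)  # is-a links to broader concepts

def build_index(concepts):
    """Map every lower-cased surface term or label to its concept id."""
    index = {}
    for c in concepts.values():
        for t in c.terms | {c.label}:
            index[t.lower()] = c.concept_id
    return index

def ancestors(concepts, concept_id):
    """Return all broader concepts reachable through is-a links."""
    seen, stack = set(), list(concepts[concept_id].parents)
    while stack:
        cid = stack.pop()
        if cid not in seen:
            seen.add(cid)
            stack.extend(concepts[cid].parents)
    return seen

# Hypothetical three-level fragment of a phenotype hierarchy.
concepts = {
    "EX:0001": Concept("EX:0001", "behavioral phenotype"),
    "EX:0002": Concept("EX:0002", "repetitive behavior", parents={"EX:0001"}),
    "EX:0003": Concept("EX:0003", "motor stereotypy",
                       {"hand flapping", "body rocking"}, {"EX:0002"}),
}

index = build_index(concepts)
print(index["hand flapping"])                  # EX:0003
print(sorted(ancestors(concepts, "EX:0003")))  # ['EX:0001', 'EX:0002']
```

Several surface terms resolve to one concept, and the is-a links let a query for a broad concept also retrieve individuals annotated with its narrower descendants.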

CONCLUSION

Our ASD phenotype ontology can assist clinicians and researchers in characterizing individuals with ASD, facilitating automated diagnosis, and subtyping individuals with ASD to support personalized therapeutic decision-making.

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b6f1/9128253/b8c97b95e684/11689_2022_9442_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b6f1/9128253/6b2990c103cd/11689_2022_9442_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b6f1/9128253/0cb06472d5c0/11689_2022_9442_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b6f1/9128253/9ba881832a63/11689_2022_9442_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b6f1/9128253/2f5aef86a008/11689_2022_9442_Fig5_HTML.jpg
