Using #ActuallyAutistic on Twitter for Precision Diagnosis of Autism Spectrum Disorder: Machine Learning Study.

Authors

Jaiswal Aditi, Washington Peter

Affiliation

Department of Information and Computer Sciences, University of Hawaii at Manoa, Honolulu, HI, United States.

Publication details

JMIR Form Res. 2024 Feb 14;8:e52660. doi: 10.2196/52660.

DOI:10.2196/52660
PMID:38354045
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10902768/
Abstract

BACKGROUND

The increasing use of social media platforms has given rise to an unprecedented surge in user-generated content, with millions of individuals publicly sharing their thoughts, experiences, and health-related information. Social media can serve as a useful means to study and understand public health. Twitter (subsequently rebranded as "X") is one such social media platform that has proven to be a valuable source of rich information for both the general public and health officials. We conducted the first study applying Twitter data mining to autism screening.

OBJECTIVE

This study used Twitter as the primary source of data to study the behavioral characteristics and real-time emotional projections of individuals identifying with autism spectrum disorder (ASD). We aimed to improve the rigor of ASD analytics research by using the digital footprint of an individual to study the linguistic patterns of individuals with ASD.

METHODS

We developed a machine learning model to distinguish individuals with autism from their neurotypical peers based on textual patterns in their public communications on Twitter. We collected 6,515,470 tweets from users who self-identified with autism using "#ActuallyAutistic" and a separate control group to identify linguistic markers associated with ASD traits. To construct the data set, we targeted English-language tweets matching the search query "#ActuallyAutistic" that were posted from January 1, 2014, to December 31, 2022. From these tweets, we identified unique users whose profile descriptions contained keywords such as "autism" OR "autistic" OR "neurodiverse" and collected all the tweets from their timelines. To build the control group data set, we formulated a search query excluding the hashtag, "-#ActuallyAutistic," and collected 1000 tweets per day during the same time period. We trained a word2vec model and an attention-based, bidirectional long short-term memory model to validate the performance of per-tweet and per-profile classification models. We also illustrated the utility of the data set through common natural language processing tasks such as sentiment analysis and topic modeling.
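The cohort-selection step and the per-tweet representation described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline. It assumes whole-word, case-insensitive keyword matching and that a tweet is represented by the mean of its word vectors, neither of which the abstract specifies. The dictionary keys `id` and `description`, the function names, and the toy embedding table are all hypothetical.

```python
import re

# Keywords the abstract lists for profile descriptions.
PROFILE_KEYWORDS = ("autism", "autistic", "neurodiverse")

# Assumption: whole-word, case-insensitive matching (not stated in the abstract).
_KEYWORD_RE = re.compile(r"\b(" + "|".join(PROFILE_KEYWORDS) + r")\b", re.IGNORECASE)


def self_identifies(profile_description):
    """Return True if the profile text contains any of the keywords."""
    return bool(_KEYWORD_RE.search(profile_description or ""))


def select_asd_users(users):
    """Keep the ids of users whose profile description matches.

    `users` is an iterable of dicts with hypothetical keys 'id' and 'description'.
    """
    return [u["id"] for u in users if self_identifies(u.get("description"))]


def mean_pool(tokens, embeddings, dim):
    """Average the vectors of in-vocabulary tokens into one tweet vector.

    `embeddings` maps token -> list of floats (for example, vectors exported
    from a trained word2vec model). Returns a zero vector of length `dim`
    when no token is in the vocabulary.
    """
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Toy data, invented for illustration only.
example_users = [
    {"id": 1, "description": "Autistic and proud"},
    {"id": 2, "description": "Coffee lover"},
    {"id": 3, "description": None},
]
toy_embeddings = {"a": [1.0, 3.0], "b": [3.0, 5.0]}

selected_ids = select_asd_users(example_users)
tweet_vector = mean_pool(["a", "b", "zzz"], toy_embeddings, dim=2)
```

A fixed-length vector of this kind is what a logistic regression classifier would take as input for per-tweet classification.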

RESULTS

Our tweet classifier reached 73% accuracy, an area under the receiver operating characteristic curve of 0.728, and an F-score of 0.71 using word2vec representations fed into a logistic regression model, while the user profile classifier achieved an area under the receiver operating characteristic curve of 0.78 and an F-score of 0.805 using an attention-based, bidirectional long short-term memory model. This is a promising start, demonstrating the potential for effective digital phenotyping studies and large-scale intervention using text data mined from social media.
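For readers less familiar with the three reported metrics, the sketch below shows how each is computed from binary labels and classifier scores. It assumes that "F-score" means the F1 score (the harmonic mean of precision and recall), which the abstract does not state. The labels and scores are toy values invented for illustration, not study data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auroc(y_true, scores):
    """Area under the ROC curve as the probability that a randomly chosen
    positive example scores higher than a randomly chosen negative one
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented for illustration only.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

acc = accuracy(y_true, y_pred)   # 0.5
f1 = f1_score(y_true, y_pred)    # 0.5
auc = auroc(y_true, scores)      # 0.75
```

Accuracy and the F-score depend on the chosen decision threshold, whereas the area under the curve summarizes ranking quality across all thresholds, which is why the two can differ for the same classifier.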

CONCLUSIONS

Textual differences in social media communications can help researchers and clinicians conduct symptomatology studies in natural settings.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41f5/10902768/91f68e6721c2/formative_v8i1e52660_fig1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41f5/10902768/ef3e974b3033/formative_v8i1e52660_fig2.jpg
