Similar articles

1
An annotated data set for identifying women reporting adverse pregnancy outcomes on Twitter.
Data Brief. 2020 Aug 31;32:106249. doi: 10.1016/j.dib.2020.106249. eCollection 2020 Oct.
2
A natural language processing pipeline to advance the use of Twitter data for digital epidemiology of adverse pregnancy outcomes.
J Biomed Inform. 2020;112S:100076. doi: 10.1016/j.yjbinx.2020.100076. Epub 2020 Aug 8.
3
Social media mining for birth defects research: A rule-based, bootstrapping approach to collecting data for rare health-related events on Twitter.
J Biomed Inform. 2018 Nov;87:68-78. doi: 10.1016/j.jbi.2018.10.001. Epub 2018 Oct 4.
4
Using Twitter Data for Cohort Studies of Drug Safety in Pregnancy: Proof-of-concept With β-Blockers.
JMIR Form Res. 2022 Jun 30;6(6):e36771. doi: 10.2196/36771.
5
Automatically Identifying Comparator Groups on Twitter for Digital Epidemiology of Pregnancy Outcomes.
AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:317-325. eCollection 2020.
6
Towards scaling Twitter for digital epidemiology of birth defects.
NPJ Digit Med. 2019 Oct 1;2:96. doi: 10.1038/s41746-019-0170-5. eCollection 2019.
7
Automatically Identifying Twitter Users for Interventions to Support Dementia Family Caregivers: Annotated Data Set and Benchmark Classification Models.
JMIR Aging. 2022 Sep 16;5(3):e39547. doi: 10.2196/39547.
8
Toward Using Twitter for Tracking COVID-19: A Natural Language Processing Pipeline and Exploratory Data Set.
J Med Internet Res. 2021 Jan 22;23(1):e25314. doi: 10.2196/25314.
9
ReportAGE: Automatically extracting the exact age of Twitter users based on self-reports in tweets.
PLoS One. 2022 Jan 25;17(1):e0262087. doi: 10.1371/journal.pone.0262087. eCollection 2022.
10
Using Longitudinal Twitter Data for Digital Epidemiology of Childhood Health Outcomes: An Annotated Data Set and Deep Neural Network Classifiers.
J Med Internet Res. 2024 Mar 25;26:e50652. doi: 10.2196/50652.

Cited by

1
Generalizable Natural Language Processing Framework for Migraine Reporting from Social Media.
AMIA Jt Summits Transl Sci Proc. 2023 Jun 16;2023:261-270. eCollection 2023.
2
MonkeyPox2022Tweets: A Large-Scale Twitter Dataset on the 2022 Monkeypox Outbreak, Findings from Analysis of Tweets, and Open Research Questions.
Infect Dis Rep. 2022 Nov 14;14(6):855-883. doi: 10.3390/idr14060087.
3
Comparison of Pretraining Models and Strategies for Health-Related Social Media Text Classification.
Healthcare (Basel). 2022 Aug 5;10(8):1478. doi: 10.3390/healthcare10081478.
4
A natural language processing pipeline to advance the use of Twitter data for digital epidemiology of adverse pregnancy outcomes.
J Biomed Inform. 2020;112S:100076. doi: 10.1016/j.yjbinx.2020.100076. Epub 2020 Aug 8.

References

1
A natural language processing pipeline to advance the use of Twitter data for digital epidemiology of adverse pregnancy outcomes.
J Biomed Inform. 2020;112S:100076. doi: 10.1016/j.yjbinx.2020.100076. Epub 2020 Aug 8.
2
An unsupervised and customizable misspelling generator for mining noisy health-related text sources.
J Biomed Inform. 2018 Dec;88:98-107. doi: 10.1016/j.jbi.2018.11.007. Epub 2018 Nov 13.
3
Social media mining for birth defects research: A rule-based, bootstrapping approach to collecting data for rare health-related events on Twitter.
J Biomed Inform. 2018 Nov;87:68-78. doi: 10.1016/j.jbi.2018.10.001. Epub 2018 Oct 4.
4
Pharmacoepidemiologic Evaluation of Birth Defects from Health-Related Postings in Social Media During Pregnancy.
Drug Saf. 2019 Mar;42(3):389-400. doi: 10.1007/s40264-018-0731-6.
5
Deaths: Final Data for 2016.
Natl Vital Stat Rep. 2018 Jul;67(5):1-76.
6
Discovering Cohorts of Pregnant Women From Social Media for Safety Surveillance and Analysis.
J Med Internet Res. 2017 Oct 30;19(10):e361. doi: 10.2196/jmir.8164.
7
Fetal and Perinatal Mortality: United States, 2013.
Natl Vital Stat Rep. 2015 Jul 23;64(8):1-24.
8
Comparison of the aetiology of stillbirth over five decades in a single centre: a retrospective study.
BMJ Open. 2014 Jun 5;4(6):e004635. doi: 10.1136/bmjopen-2013-004635.
9
A systematic review to calculate background miscarriage rates using life table analysis.
Birth Defects Res A Clin Mol Teratol. 2012 Jun;94(6):417-23. doi: 10.1002/bdra.23014. Epub 2012 Apr 18.
10
Spontaneous preterm birth, a clinical dilemma: etiologic, pathophysiologic and genetic heterogeneities and racial disparity.
Acta Obstet Gynecol Scand. 2008;87(6):590-600. doi: 10.1080/00016340802005126.

An annotated data set for identifying women reporting adverse pregnancy outcomes on Twitter.

Authors

Klein Ari Z, Gonzalez-Hernandez Graciela

Affiliation

University of Pennsylvania, Philadelphia, PA, USA.

Publication

Data Brief. 2020 Aug 31;32:106249. doi: 10.1016/j.dib.2020.106249. eCollection 2020 Oct.

DOI: 10.1016/j.dib.2020.106249
PMID: 32944604
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7481818/
Abstract

Despite the prevalence in the United States of miscarriage [1], stillbirth [2], and infant mortality associated with preterm birth and low birthweight [3], their causes remain largely unknown [4], [5], [6]. To advance the use of social media data as a complementary resource for epidemiology of adverse pregnancy outcomes, we present a data set of 6487 tweets that mention miscarriage, stillbirth, preterm birth or premature labor, low birthweight, neonatal intensive care, or fetal/infant loss in general. These tweets are a subset of 22,912 tweets retrieved by applying hand-written regular expressions to a database containing more than 400 million public tweets posted by more than 100,000 women who have announced their pregnancy on Twitter [7]. Two professional annotators labeled the 6487 tweets in a binary fashion, distinguishing those potentially reporting that the user has personally experienced the outcome ("outcome" tweets) from those that merely mention the outcome ("non-outcome" tweets). Inter-annotator agreement was κ = 0.90 (Cohen's kappa). The tweets annotated as "outcome" include 1318 women reporting miscarriage, 94 stillbirth, 591 preterm birth or premature labor, 171 low birthweight, 453 neonatal intensive care, and 356 fetal/infant loss in general. These "outcome" tweets can be used to explore patient experiences and perceptions of adverse pregnancy outcomes, and can direct researchers to the users' broader timelines (tweets posted by a user over time) for observational studies. Our past work demonstrates the analysis of timelines for selecting a study population [8] and conducting a case-control study [9] of users reporting that their child has a birth defect. For larger-scale studies, the full annotated corpus can be used to train supervised machine learning algorithms to automatically identify additional users reporting adverse pregnancy outcomes on Twitter.
We used the annotated corpus to train feature-engineered and deep learning-based classifiers presented in "A natural language processing pipeline to advance the use of Twitter data for digital epidemiology of adverse pregnancy outcomes" [10].
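
The abstract reports inter-annotator agreement as Cohen's kappa, which corrects the raw agreement rate between two annotators for the agreement expected by chance. As a minimal sketch of how that statistic is computed for binary "outcome" / "non-outcome" labels, the Python below implements the standard formula κ = (p_o − p_e) / (1 − p_e). The function name `cohen_kappa` and the ten toy labels are illustrative assumptions; they are not the authors' code or data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)

    # Observed agreement p_o: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement p_e: for each class, the product of the two
    # annotators' marginal label frequencies, summed over classes.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    classes = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in classes) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy labels (made up): the annotators disagree on one tweet out of ten.
annotator_1 = ["outcome"] * 4 + ["non-outcome"] * 6
annotator_2 = ["outcome"] * 3 + ["non-outcome"] * 7

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

In this toy case the raw agreement is 0.90, yet kappa is lower because a share of that agreement would be expected by chance given how often each annotator uses each label. This is why a kappa of 0.90, as reported for the 6487 tweets, indicates stronger agreement than a raw agreement rate of 0.90 would.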
