OBJECTIVES

Synthetic datasets are artificially manufactured based on real health systems data but do not contain real patient information. We sought to validate the use of synthetic data in stroke and cancer research by conducting a comparison study of cancer patients with ischemic stroke to non-cancer patients with ischemic stroke.

DESIGN

Retrospective cohort study.

SETTING

We used synthetic data generated by MDClone and compared it to its original source data (i.e. real patient data from the Ottawa Hospital Data Warehouse).

OUTCOME MEASURES

We compared key differences in demographics, treatment characteristics, length of stay, and costs between cancer patients with ischemic stroke and non-cancer patients with ischemic stroke. We used a binary, multivariable logistic regression model to identify risk factors for recurrent stroke in the cancer population.
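The abstract does not say which statistical tests produced the reported p-values. As a minimal sketch, assuming a pooled two-proportion z-test is an acceptable way to compare the prevalence of a comorbidity between the two cohorts, the comparison could look like the following. The function name and the counts in the usage lines are hypothetical, chosen for illustration only; they are not the study's cohort sizes.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for a difference between two independent proportions.

    x1, x2: number of patients with the condition in each cohort
    n1, n2: cohort sizes
    Returns (z statistic, two-sided p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # prevalence under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts (NOT from the study): 520 of 1000 vs 577 of 1000
z, p = two_proportion_z_test(520, 1000, 577, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With large cohorts this test gives results close to a chi-square test on the 2x2 table; which test the authors actually used is not stated in the abstract.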

RESULTS

Using synthetic data, we found that cancer patients with ischemic stroke had a lower prevalence of hypertension (52.0% in the cancer cohort vs 57.7% in the non-cancer cohort, p<0.0001) and a higher prevalence of chronic obstructive pulmonary disease (COPD: 8.5% vs 4.7%, p<0.0001), prior ischemic stroke (1.7% vs 0.1%, p<0.001), and prior venous thromboembolism (VTE: 8.2% vs 1.5%, p<0.0001). They also had a longer length of stay (8 days [IQR 3-16] vs 6 days [IQR 3-13], p = 0.011) and higher costs associated with their stroke encounters: $11,498 (IQR $4,440-$20,668) in the cancer cohort vs $8,084 (IQR $3,947-$16,706) in the non-cancer cohort (p = 0.0061). A multivariable logistic regression model identified 5 predictors of recurrent ischemic stroke in the cancer cohort using synthetic data; 3 of the same predictors were identified using real patient data, with similar effect measures. Summary statistics did not differ significantly between the synthetic and original datasets, other than slight differences in the frequency distributions of numeric data.
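To illustrate what comparing summary statistics between a synthetic and an original dataset can involve, here is a minimal sketch that computes the median and interquartile range of a numeric variable (for example, length of stay) in each dataset and reports the absolute differences. The helper names and the sample values are hypothetical; the abstract does not describe the authors' actual comparison procedure.

```python
import statistics


def median_iqr(values):
    """Return (median, Q1, Q3) for a numeric sample."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q1, q3


def compare_summaries(original, synthetic):
    """Absolute difference in median, Q1, and Q3 between two samples."""
    names = ("median", "q1", "q3")
    orig_stats = median_iqr(original)
    synth_stats = median_iqr(synthetic)
    return {name: abs(o - s) for name, o, s in zip(names, orig_stats, synth_stats)}


# Hypothetical length-of-stay values in days (NOT study data)
original_los = [2, 3, 5, 6, 8, 13, 16, 21]
synthetic_los = [2, 3, 5, 7, 8, 12, 16, 20]
print(compare_summaries(original_los, synthetic_los))
```

Small differences in such summaries would be in line with the abstract's observation that only the frequency distributions of numeric data differed slightly.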

CONCLUSION

We demonstrated the utility of synthetic data in stroke and cancer research and provided key differences between cancer and non-cancer patients with ischemic stroke. Synthetic data is a powerful tool that can allow researchers to easily explore hypothesis generation, enable data sharing without privacy breaches, and ensure broad access to big data in a rapid, safe, and reliable fashion.

Synthetic data in cancer and cerebrovascular disease research: A novel approach to big data.

Affiliations

School of Epidemiology and Public Health, University of Ottawa, Ottawa, Canada.

Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Canada.

Publication information

PLoS One. 2024 Feb 7;19(2):e0295921. doi: 10.1371/journal.pone.0295921. eCollection 2024.
