Confirming SPSS Results With ChatGPT-4 and o3-mini Models.

Authors

Strale Frederick, Riddle Isaac, Geng Bowen, Oxford Blake, Kah Malia, Sherwin Robert

Affiliations

Biostatistics, The Oxford Center, Brighton, USA.

Information Technology, The Oxford Center, Brighton, USA.

Publication

Cureus. 2025 Apr 10;17(4):e82005. doi: 10.7759/cureus.82005. eCollection 2025 Apr.

DOI:10.7759/cureus.82005
PMID:40351918
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12065437/
Abstract

Background This research compared the simple and advanced statistical results of SPSS (IBM Corp., Armonk, NY, USA) with ChatGPT-4 and ChatGPT o3-mini (OpenAI, San Francisco, CA, USA) in statistical data output and interpretation with behavioral healthcare data. It evaluated their methodological approaches, quantitative performance, interpretability, adaptability, ethical considerations, and future trends.

Methods Fourteen statistical analyses were conducted from two real datasets that produced peer-reviewed, published scientific articles in 2024. Descriptive statistics, Pearson r, multiple correlation with Pearson r, Spearman's rho, simple linear regression, one-sample t-test, paired t-test, two-independent sample t-test, multiple linear regression, one-way analysis of variance (ANOVA), repeated measures ANOVA, two-way (factorial) ANOVA, and multivariate ANOVA were computed. The two datasets adhered to a systematically structured timeframe, March 19, 2023, through June 11, 2023, and June 7, 2023, through July 7, 2023, thereby ensuring the integrity and temporal representativeness of the data gathering. The analyses were conducted by inputting the verbal (text) commands into ChatGPT-4 and ChatGPT o3-mini along with the relevant SPSS variables, which were copied and pasted from the SPSS datasets.

Results The study found high concordance between SPSS and ChatGPT-4 in fundamental statistical analyses, such as measures of central tendency, variability, and simple Pearson and Spearman correlation analyses, where the results were nearly identical. ChatGPT-4 also closely matched SPSS in the three t-tests and simple linear regression, with minimal effect size variations. Discrepancies emerged in complex analyses. ChatGPT o3-mini showed inflated correlation values and significant results where none were expected, indicating computational deviations. ChatGPT o3-mini produced inflated coefficients in the multiple correlation and R-squared values in two-way ANOVA and multiple regression, suggesting differing assumptions. ChatGPT-4 and ChatGPT o3-mini produced identical F-statistics with repeated measures ANOVA but reported incorrect degrees of freedom (df) values. While ChatGPT-4 performed well in one-way ANOVA, it miscalculated degrees of freedom in multivariate ANOVA (MANOVA), leading to significant discrepancies. ChatGPT o3-mini also generated erroneous F-statistics in factorial ANOVA, highlighting the need for further optimization in multivariate statistical modeling.

Conclusions This study underscored the rapid advancements in artificial intelligence (AI)-driven statistical analyses while highlighting areas that require further refinement. ChatGPT-4 accurately executed fundamental statistical tests, closely matching SPSS. However, its reliability diminished in more advanced statistical procedures, requiring further validation. ChatGPT o3-mini, while optimized for Science, Technology, Engineering, and Mathematics (STEM) applications, produced inconsistencies in correlation and multivariate analyses, limiting its dependability for complex research applications. Ensuring its alignment with established statistical methodologies will be essential for widespread scientific research adoption as AI evolves.
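
The abstract does not include the authors' prompts, data, or verification procedure, so the following is only a minimal sketch of how a reader might cross-check a chatbot-reported statistic against an independent computation. The data values, the function names (`pearson_r`, `paired_t`, `rm_anova_df`, `agrees`), and the tolerance are all hypothetical and not taken from the study. The degrees-of-freedom helper uses the standard one-way repeated measures ANOVA formulas, k - 1 for the within-subject effect and (n - 1)(k - 1) for the error term, which is the kind of value the abstract says both models misreported.

```python
import math

# Hypothetical illustration data (NOT from the study): five paired scores.
BASELINE = [2, 4, 5, 7, 9]
FOLLOWUP = [3, 5, 7, 8, 12]


def pearson_r(x, y):
    """Pearson product-moment correlation computed from first principles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(before, after):
    """Paired t statistic and its degrees of freedom (n - 1)."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return mean_d / (sd_d / math.sqrt(n)), n - 1


def rm_anova_df(n_subjects, k_conditions):
    """(effect df, error df) for a one-way repeated measures ANOVA."""
    return k_conditions - 1, (n_subjects - 1) * (k_conditions - 1)


def agrees(reference, candidate, tol=1e-3):
    """True when a reported value matches the reference within tol (assumed)."""
    return math.isclose(reference, candidate, abs_tol=tol)


if __name__ == "__main__":
    r = pearson_r(BASELINE, FOLLOWUP)
    t, df = paired_t(BASELINE, FOLLOWUP)
    print(f"Pearson r = {r:.4f}")
    print(f"paired t = {t:.4f}, df = {df}")
    print("RM-ANOVA df for 10 subjects, 3 conditions:", rm_anova_df(10, 3))
    # A hypothetical chatbot-reported r would be compared like this:
    print("reported r=0.990 agrees:", agrees(r, 0.990))
```

Comparing against a tolerance rather than exact equality allows for rounding differences between tools, while a mismatch in an integer quantity such as degrees of freedom should be treated as an error regardless of tolerance.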
