Automated detection of over- and under-dispersion in baseline tables in randomised controlled trials.

Affiliation

Australian Centre for Health Services Innovation & Centre for Healthcare Transformation, Queensland University of Technology, Kelvin Grove, Queensland, 4059, Australia.

Publication information

F1000Res. 2023 May 30;11:783. doi: 10.12688/f1000research.123002.2. eCollection 2022.

Abstract

Background: Papers describing the results of a randomised trial should include a baseline table that compares the characteristics of randomised groups. Researchers who fraudulently generate trials often unwittingly create baseline tables that are implausibly similar (under-dispersed) or have large differences between groups (over-dispersed). I aimed to create an automated algorithm to screen for under- and over-dispersion in the baseline tables of randomised trials.

Methods: Using a cross-sectional study I examined 2,245 randomised controlled trials published in health and medical journals on . I estimated the probability that a trial's baseline summary statistics were under- or over-dispersed using a Bayesian model that examined the distribution of t-statistics for the between-group differences, and compared this with an expected distribution without dispersion. I used a simulation study to test the ability of the model to find under- or over-dispersion and compared its performance with an existing test of dispersion based on a uniform test of p-values. My model combined categorical and continuous summary statistics, whereas the uniform test used only continuous statistics.

Results: The algorithm had a relatively good accuracy for extracting the data from baseline tables, matching well on the size of the tables and sample size. Using t-statistics in the Bayesian model out-performed the uniform test of p-values, which had many false positives for skewed, categorical and rounded data that were not under- or over-dispersed. For trials published on , some tables appeared under- or over-dispersed because they had an atypical presentation or had reporting errors. Some trials flagged as under-dispersed had groups with strikingly similar summary statistics.

Conclusions: Automated screening for fraud of all submitted trials is challenging due to the widely varying presentation of baseline tables. The Bayesian model could be useful in targeted checks of suspected trials or authors.
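
The abstract describes working from t-statistics for the between-group differences in a baseline table. As a rough illustration of that idea only, the Python sketch below computes a pooled-variance two-sample t-statistic from reported means, standard deviations and group sizes, then averages the squared t-statistics across rows. This is not the Bayesian model from the paper: the function names `t_statistic` and `mean_squared_t`, the choice of a pooled-variance t-statistic, the mean-squared summary, and the table values are all assumptions made for illustration.

```python
import math


def t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance two-sample t-statistic from reported summary statistics."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error


def mean_squared_t(t_stats):
    """Average of t^2 across table rows; roughly 1 is expected with large groups."""
    return sum(t * t for t in t_stats) / len(t_stats)


# Hypothetical baseline table (invented values, not taken from any trial):
# each row is (mean, SD, n) for group 1 followed by (mean, SD, n) for group 2.
hypothetical_table = [
    (54.2, 10.1, 100, 54.3, 10.0, 100),    # age in years
    (27.1, 4.2, 100, 27.1, 4.3, 100),      # body mass index
    (132.0, 15.5, 100, 131.9, 15.4, 100),  # systolic blood pressure
]

t_stats = [t_statistic(*row) for row in hypothetical_table]
print("t-statistics:", [round(t, 3) for t in t_stats])
print("mean squared t:", round(mean_squared_t(t_stats), 4))
```

Under proper randomisation with reasonably large groups, each t-statistic is approximately standard normal, so the mean of the squared values should sit near 1. A value far below 1 hints at groups that are more similar than chance would suggest, and a value far above 1 hints at larger differences than expected. With only a handful of rows this summary is very noisy, and rows in the same table are not independent, so it is at most a crude first look.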

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a45/10285430/33a12d3efb7d/f1000research-11-148868-g0000.jpg
