Cahan Amos, Anand Vibha
IBM T.J. Watson Research Center, Yorktown Heights, NY, United States of America.
PLoS One. 2017 Nov 6;12(11):e0185886. doi: 10.1371/journal.pone.0185886. eCollection 2017.
ClinicalTrials.gov is valuable for aggregate-level analysis of trials. The recently published final rule aims to improve reporting of trial results. We aimed to assess variability in how ClinicalTrials.gov records report participants' baseline measures.
The September 2015 edition of the database for Aggregate Analysis of ClinicalTrials.gov (AACT) was used in this study. To date, AACT contains 186,941 trials, of which the 16,660 trials reporting baseline (participant) measures were analyzed. We also analyzed a subset of 13,818 Highly Likely Applicable Clinical Trials (HLACT), for which reporting of results is likely mandatory, and compared a random sample of 30 trial records to their journal articles. We report counts for each mandatory baseline measure and the variability in their reporting formats. The AACT dataset contains 8,161 baseline measures with 1,206 unique measurement units. However, 6,940 (85%) of these variables appear only once in the dataset. Age and Gender are reported using many different formats (178 and 49, respectively). "Age" as the variable name is reported in 60 different formats. The HLACT subset reports measures using 3,931 variables. The most frequent Age format (i.e., mean (years) ± sd) is found in only 45% of trials. Overall, only 4 baseline measures (Region of Enrollment, Age, Number of Participants, and Gender) are reported by > 10% of trials. Discrepancies in both the types and formats of baseline measures are found between ClinicalTrials.gov records and their corresponding journal articles. On average, journal articles include twice the number of baseline measures (13.6±7.1 (sd) vs. 6.6±7.6) when compared to the ClinicalTrials.gov records that report any results.
We found marked variability in baseline measures reporting. This is not addressed by the final rule. To support secondary use of ClinicalTrials.gov, a uniform format for baseline measures reporting is warranted.
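The kind of tally described above (unique baseline variables, variables that appear only once, unique measurement units) can be illustrated with a minimal Python sketch. The records, the tuple layout, and the `summarize` helper are hypothetical and for illustration only; they are not the AACT schema or the authors' code.

```python
from collections import Counter

# Hypothetical toy records: (trial_id, measure_title, units).
# These are illustrative values, not real AACT data.
records = [
    ("T1", "Age", "years"),
    ("T1", "Gender", "participants"),
    ("T2", "Age", "Years"),
    ("T2", "Body Mass Index", "kg/m^2"),
    ("T3", "Age, Continuous", "years"),
    ("T3", "Gender", "participants"),
]


def summarize(records):
    """Count unique measure titles, titles used only once, and unique units."""
    title_counts = Counter(title for _, title, _ in records)
    singletons = [t for t, n in title_counts.items() if n == 1]
    units = {u for _, _, u in records}
    return {
        "unique_titles": len(title_counts),
        "singleton_titles": len(singletons),
        "singleton_share": len(singletons) / len(title_counts),
        "unique_units": len(units),
    }


summary = summarize(records)

# Share of single-use variables reported in the abstract: 6,940 of 8,161.
reported_share = round(6940 / 8161 * 100)
```

In this toy input, "years" and "Years" count as two different units and "Age" and "Age, Continuous" as two different variables, which is the sort of format-level variability a uniform reporting format would remove.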