Count data in biology-Data transformation or model reformation?

Authors

St-Pierre Anne P, Shikon Violaine, Schneider David C

Affiliations

Department of Ocean Sciences, Ocean Sciences Centre, Memorial University of Newfoundland, St. John's, NL, Canada.

Department of Biology, Memorial University of Newfoundland, St. John's, NL, Canada.

Publication information

Ecol Evol. 2018 Feb 16;8(6):3077-3085. doi: 10.1002/ece3.3807. eCollection 2018 Mar.

DOI:10.1002/ece3.3807
PMID:29607007
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5869353/
Abstract

Statistical analyses are an integral component of scientific research, and for decades, biologists have applied transformations to data to meet the normal error assumptions for t and F tests. Over the years, there has been a movement from data transformation toward model reformation, that is, the use of non-normal error structures within the framework of the generalized linear model (GLM). The principal advantage of model reformation is that parameters are estimated on the original, rather than the transformed, scale. However, data transformation has been shown to give better control over type I error for simulated data with known error structures. We conducted a literature review of statistical textbooks directed toward biologists and of journal articles published in the primary literature to determine temporal trends in both the textbook recommendations and the practice in the refereed literature over the past 35 years. In this review, a trend of increasing use of reformation in the primary literature was evident, moving from no use of reformation before 1996 to >50% of the articles reviewed applying GLM after 2006. However, no such trend was observed in the recommendations in statistical textbooks. We then undertook 12 analyses based on published datasets in which we compared the type I error estimates, residual plot diagnostics, and coefficients yielded by analyses using square root transformations, log transformations, and the GLM. All analyses yielded acceptable residual versus fit plots and had similar p-values within each analysis, but as expected, the coefficient estimates differed substantially. Furthermore, no consensus could be found in the literature regarding a procedure to back-transform the coefficient estimates obtained from linear models performed on transformed datasets. This lack of consistency among coefficient estimates constitutes a major argument for model reformation over data transformation in biology.
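
The contrast the abstract draws between transformation and model reformation can be illustrated with a minimal sketch. This is not the authors' code and does not use their datasets: the data are simulated, the true parameters (`TRUE_B0`, `TRUE_B1`), the sample size, the `log(y + 1)` offset, and all function names are assumptions chosen for illustration. It fits the same counts three ways: ordinary least squares on square-root-transformed counts, ordinary least squares on log-transformed counts, and a Poisson GLM with a log link fitted by a hand-written Newton-Raphson loop.

```python
import math
import random


def draw_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def ols(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def poisson_glm(xs, ys, iters=25):
    """Poisson GLM with log link, fitted by Newton-Raphson; returns (b0, b1)."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0  # start at the intercept-only fit
    for _ in range(iters):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Score vector: gradient of the Poisson log-likelihood.
        u0 = sum(y - m for y, m in zip(ys, mus))
        u1 = sum((y - m) * x for x, y, m in zip(xs, ys, mus))
        # Fisher information matrix [[i00, i01], [i01, i11]].
        i00 = sum(mus)
        i01 = sum(m * x for x, m in zip(xs, mus))
        i11 = sum(m * x * x for x, m in zip(xs, mus))
        det = i00 * i11 - i01 * i01
        # Newton step: beta += information^{-1} @ score (2x2 inverse written out).
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1


# Simulated counts with mean exp(TRUE_B0 + TRUE_B1 * x); values are hypothetical.
rng = random.Random(1)
TRUE_B0, TRUE_B1 = 2.0, 0.5
xs = [rng.uniform(0.0, 2.0) for _ in range(500)]
ys = [draw_poisson(rng, math.exp(TRUE_B0 + TRUE_B1 * x)) for x in xs]

sqrt_fit = ols(xs, [math.sqrt(y) for y in ys])    # linear model on sqrt(y)
log_fit = ols(xs, [math.log(y + 1) for y in ys])  # linear model on log(y + 1)
glm_fit = poisson_glm(xs, ys)                     # Poisson GLM, log link

print("sqrt(y)     intercept, slope:", sqrt_fit)
print("log(y + 1)  intercept, slope:", log_fit)
print("Poisson GLM intercept, slope:", glm_fit)
```

Each slope is expressed on a different scale: the square-root fit describes change in the square root of the count per unit of x, the log fit describes change in log(count + 1), and the GLM slope is the change in the log of the expected count. In this sketch the three coefficient sets are therefore not directly comparable, which is the inconsistency the abstract raises when it notes that no agreed back-transformation procedure exists.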

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da60/5869353/745456f7a4e1/ECE3-8-3077-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da60/5869353/612d68ad0dcf/ECE3-8-3077-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da60/5869353/31c791710ef2/ECE3-8-3077-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da60/5869353/d6169585be78/ECE3-8-3077-g004.jpg

Similar articles

1
Count data in biology-Data transformation or model reformation?
Ecol Evol. 2018 Feb 16;8(6):3077-3085. doi: 10.1002/ece3.3807. eCollection 2018 Mar.
2
A comparison of methods to handle skew distributed cost variables in the analysis of the resource consumption in schizophrenia treatment.
J Ment Health Policy Econ. 2002 Mar;5(1):21-31.
3
Evaluating the double Poisson generalized linear model.
Accid Anal Prev. 2013 Oct;59:497-505. doi: 10.1016/j.aap.2013.07.017. Epub 2013 Jul 21.
4
Ecotoxicology is not normal: A comparison of statistical approaches for analysis of count and proportion data in ecotoxicology.
Environ Sci Pollut Res Int. 2015 Sep;22(18):13990-9. doi: 10.1007/s11356-015-4579-3. Epub 2015 May 9.
5
Accounting for Non-Gaussian Sources of Spatial Correlation in Parametric Functional Magnetic Resonance Imaging Paradigms II: A Method to Obtain First-Level Analysis Residuals with Uniform and Gaussian Spatial Autocorrelation Function and Independent and Identically Distributed Time-Series.
Brain Connect. 2018 Feb;8(1):10-21. doi: 10.1089/brain.2017.0522.
6
[Meta-analysis of the Italian studies on short-term effects of air pollution].
Epidemiol Prev. 2001 Mar-Apr;25(2 Suppl):1-71.
7
Transformations of count data for tests of interaction in factorial and split-plot experiments.
J Econ Entomol. 2006 Jun;99(3):1002-6. doi: 10.1603/0022-0493-99.3.1002.
8
Translational Metabolomics of Head Injury: Exploring Dysfunctional Cerebral Metabolism with Ex Vivo NMR Spectroscopy-Based Metabolite Quantification
9
Poisson Counts, Square Root Transformation and Small Area Estimation: Square Root Transformation.
Sankhya B (2008). 2022;84(2):449-471. doi: 10.1007/s13571-021-00269-8. Epub 2021 Oct 11.
10
Impact of the 1990 Hong Kong legislation for restriction on sulfur content in fuel.
Res Rep Health Eff Inst. 2012 Aug(170):5-91.

Cited by

1
Generalized linear modeling of flow cytometry data to analyze immune responses in tuberculosis vaccine research.
NPJ Syst Biol Appl. 2025 Aug 10;11(1):90. doi: 10.1038/s41540-025-00572-4.
2
Closing the multichannel gap through computational reconstruction of interaction in super-resolution microscopy.
Patterns (N Y). 2025 Mar 27;6(5):101181. doi: 10.1016/j.patter.2025.101181. eCollection 2025 May 9.
3
Statistical data transformation in agrarian sciences for variance analysis: a systematic review.
F1000Res. 2024 Jul 12;13:459. doi: 10.12688/f1000research.144805.2. eCollection 2024.
4
A Novel One-Sample Mendelian Randomization Approach for Count-Type Outcomes That Is Robust to Correlated and Uncorrelated Pleiotropic Effects.
Genet Epidemiol. 2025 Jan;49(1):e22602. doi: 10.1002/gepi.22602. Epub 2024 Nov 5.
5
Genetic and phenotypic parameters for sexual precocity and parasite resistance traits in Nellore cattle.
J Appl Genet. 2023 Dec;64(4):797-807. doi: 10.1007/s13353-023-00781-9. Epub 2023 Sep 8.
6
Automated quantification and statistical assessment of proliferating cardiomyocyte rates in embryonic hearts.
Am J Physiol Heart Circ Physiol. 2023 Mar 1;324(3):H288-H292. doi: 10.1152/ajpheart.00483.2022. Epub 2022 Dec 23.
7
Compositional Dynamics of Gastrointestinal Tract Microbiomes Associated with Dietary Transition and Feeding Cessation in Lake Sturgeon Larvae.
Microorganisms. 2022 Sep 19;10(9):1872. doi: 10.3390/microorganisms10091872.
8
Dataset for effects of the transition from dry forest to pasture on diversity and structure of bacterial communities in Northeastern Brazil.
Data Brief. 2022 Jan 19;41:107842. doi: 10.1016/j.dib.2022.107842. eCollection 2022 Apr.
9
Adequate statistical modelling and data selection are essential when analysing abundance and diversity trends.
Nat Ecol Evol. 2021 May;5(5):592-594. doi: 10.1038/s41559-021-01427-x. Epub 2021 Apr 5.
10
Functional Redundancy in bird community decreases with riparian forest width reduction.
Ecol Evol. 2018 Oct 11;8(21):10395-10408. doi: 10.1002/ece3.4448. eCollection 2018 Nov.

References

1
To transform or not to transform: using generalized linear mixed models to analyse reaction time data.
Front Psychol. 2015 Aug 7;6:1171. doi: 10.3389/fpsyg.2015.01171. eCollection 2015.
2
Acute effects of removing large fish from a near-pristine coral reef.
Mar Biol. 2010;157(12):2739-2750. doi: 10.1007/s00227-010-1533-2. Epub 2010 Aug 26.
3
Sociality, density-dependence and microclimates determine the persistence of populations suffering from a novel fungal disease, white-nose syndrome.
Ecol Lett. 2012 Sep;15(9):1050-7. doi: 10.1111/j.1461-0248.2012.01829.x. Epub 2012 Jul 2.
4
The arcsine is asinine: the analysis of proportions in ecology.
Ecology. 2011 Jan;92(1):3-10. doi: 10.1890/10-0340.1.
5
The use of transformations.
Biometrics. 1947 Mar;3(1):39-52.
6
Some consequences when the assumptions for the analysis of variance are not satisfied.
Biometrics. 1947 Mar;3(1):22-38.
7
The assumptions underlying the analysis of variance.
Biometrics. 1947 Mar;3(1):1-21.
8
Generalized linear mixed models: a practical guide for ecology and evolution.
Trends Ecol Evol. 2009 Mar;24(3):127-35. doi: 10.1016/j.tree.2008.10.008.
9
Model selection and logarithmic transformation in allometric analysis.
Physiol Biochem Zool. 2008 Jul-Aug;81(4):496-507. doi: 10.1086/589110.