Pennell Michael L, Wheeler Matthew W, Auerbach Scott S
Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, Ohio, USA.
Biostatistics and Computational Biology Branch, Division of Intramural Research, NIEHS, Research Triangle Park, North Carolina, USA.
Environmetrics. 2024 Nov;35(7). doi: 10.1002/env.2880. Epub 2024 Aug 26.
With the advent of new alternative methods for rapid toxicity screening of chemicals comes the need for new statistical methodologies that appropriately synthesize the large amount of data collected. For example, transcriptomic assays can be used to assess the impact of a chemical on thousands of genes, but current approaches to analyzing the data treat each gene separately and do not allow sharing of information among genes within pathways. Furthermore, the methods employed are fully parametric and do not account for changes in distribution shape that may occur at high exposure levels. To address these limitations, we propose Constrained Logistic Density Regression (COLDER) to model expression data from different genes simultaneously. Under COLDER, the dose-response function for each gene is assigned a prior via a discrete logistic stick-breaking process (LSBP) whose weights depend on gene-level characteristics (e.g., pathway membership) and whose atoms consist of different dose-response functions subject to a shape constraint that ensures biological plausibility. The posterior distribution for the benchmark dose among genes within the same pathways can be estimated directly from the model, which is another advantage over current methods. The ability of COLDER to predict gene-level dose-response is evaluated in a simulation study, and the method is illustrated with data from a National Toxicology Program study of Aflatoxin B1.
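To make the covariate-dependent weights of the LSBP concrete, the following is a minimal sketch of a generic logistic stick-breaking construction truncated to K atoms. The abstract does not give COLDER's exact parameterization, so the function name `lsbp_weights`, the coefficient vectors `alphas`, and the two-element covariate vector (intercept plus a pathway-membership indicator) are all illustrative assumptions rather than the authors' implementation.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def lsbp_weights(x, alphas):
    """Mixture weights for one gene under a truncated logistic stick-breaking prior.

    x      : gene-level covariates (hypothetical: [intercept, pathway indicator])
    alphas : K-1 coefficient vectors, one per stick break (hypothetical values)
    Returns K weights that sum to 1; the last atom takes the leftover stick.
    """
    weights = []
    remaining = 1.0  # length of stick not yet assigned
    for alpha in alphas:
        # Fraction of the remaining stick broken off for this atom
        v = sigmoid(sum(a * xi for a, xi in zip(alpha, x)))
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


# Hypothetical coefficients: pathway membership raises the weight on atom 0
alphas = [[0.0, 2.0], [-1.0, 0.5]]
w_in_pathway = lsbp_weights([1.0, 1.0], alphas)
w_outside = lsbp_weights([1.0, 0.0], alphas)
```

In this sketch, genes sharing a covariate profile share the same weights over the candidate dose-response functions (the atoms), which is one way information could be pooled within a pathway. The shape constraint on the atoms is not modeled here because the abstract does not specify it.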