Nakatsuka Nathan, Adler Drew, Jiang Longda, Hartman Austin, Cheng Evan, Klann Eric, Satija Rahul
New York Genome Center, New York, NY, USA.
Center for Genomics and Systems Biology, New York University, New York, NY, USA.
Nat Commun. 2025 Aug 12;16(1):7436. doi: 10.1038/s41467-025-62579-z.
False-positive claims of differentially expressed genes (DEGs) in single-cell RNA sequencing (scRNA-seq) studies are of substantial concern. We found that DEGs from individual Parkinson's disease (PD), Huntington's disease (HD), and COVID-19 datasets had moderate predictive power for the case-control status of other datasets, whereas DEGs from Alzheimer's disease (AD) and schizophrenia (SCZ) datasets had poor predictive power. We developed a non-parametric meta-analysis method, SumRank, based on the reproducibility of relative differential expression ranks across datasets, and found that the DEGs it identified had improved predictive power. The specificity and sensitivity of these genes were substantially higher than those of genes discovered by dataset merging or by inverse-variance-weighted p-value aggregation. Up-regulated DEGs implicated chaperone-mediated protein processing in PD glia and lipid transport in AD and PD microglia, while down-regulated DEGs were involved in glutamatergic processes in AD astrocytes and excitatory neurons and in synaptic functioning in HD FOXP2 neurons. Lastly, we evaluated factors influencing the reproducibility of individual studies as a prospective guide for experimental design.
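The rank-based aggregation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' SumRank implementation: the function names, the toy p-values, the tie-breaking rule, and the uniform-rank permutation null are all assumptions made for illustration. The sketch ranks genes by p-value within each dataset, scales ranks by the number of genes, sums the relative ranks across datasets, and compares each sum with sums drawn under a simple null.

```python
import random

def relative_ranks(pvalues):
    """Map gene -> (rank of its p-value) / (number of genes); smallest p gets 1/n.
    Ties are broken by gene name (an assumption of this sketch)."""
    ordered = sorted(pvalues, key=lambda g: (pvalues[g], g))
    n = len(ordered)
    return {gene: (i + 1) / n for i, gene in enumerate(ordered)}

def sum_rank_scores(datasets):
    """Sum each gene's relative rank over all datasets.
    Only genes present in every dataset are scored; lower means more consistent."""
    genes = sorted(set.intersection(*(set(d) for d in datasets)))
    ranked = [relative_ranks({g: d[g] for g in genes}) for d in datasets]
    return {g: sum(r[g] for r in ranked) for g in genes}

def permutation_pvalues(datasets, n_perm=1000, seed=0):
    """Empirical p-value per gene: share of null sums at or below the observed sum.
    Hypothetical null: a gene's rank is uniform and independent in each dataset."""
    rng = random.Random(seed)
    scores = sum_rank_scores(datasets)
    n, k = len(scores), len(datasets)
    null = [sum(rng.randint(1, n) / n for _ in range(k)) for _ in range(n_perm)]
    return {g: (1 + sum(x <= s for x in null)) / (n_perm + 1)
            for g, s in scores.items()}

# Toy per-dataset p-values (invented values, hypothetical gene names).
toy = [
    {"GENE_A": 0.001, "GENE_B": 0.20, "GENE_C": 0.50, "GENE_D": 0.90},
    {"GENE_A": 0.010, "GENE_B": 0.60, "GENE_C": 0.03, "GENE_D": 0.70},
    {"GENE_A": 0.002, "GENE_B": 0.40, "GENE_C": 0.80, "GENE_D": 0.30},
]
scores = sum_rank_scores(toy)
pvals = permutation_pvalues(toy)
for gene in sorted(scores, key=scores.get):
    print(gene, scores[gene], round(pvals[gene], 3))
```

In this toy input, GENE_A ranks first in all three datasets and therefore has the lowest summed relative rank, whereas GENE_C is near the top in only one dataset and is not prioritized. Working with ranks rather than raw effect sizes or p-values is what makes this style of aggregation non-parametric: a single dataset with an extreme p-value cannot dominate the combined score.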