Neuroinformatics Department, Donders Centre for Neuroscience, Radboud University Nijmegen, Heyendaalseweg 135, 6525 AJ Nijmegen, the Netherlands.
Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6) and JARA BRAIN Institute I, Jülich Research Centre, Wilhelm-Johnen-Strasse, 52425, Jülich, Germany.
Neuroinformatics. 2020 Oct;18(4):611-626. doi: 10.1007/s12021-020-09471-x.
Reconstructing brain connectivity at a resolution sufficient for computational models that study the biophysical mechanisms underlying cognitive processes is extremely challenging. For this purpose, a mesoconnectome that includes laminar and cell-class specificity would be a major step forward. We analyzed the ability of gene expression patterns to predict cell-class and layer-specific projection patterns and assessed the functional annotations of the most predictive groups of genes. To this end, we used publicly available volumetric gene expression and connectivity data and trained computational models to learn and predict cell-class and layer-specific axonal projections from gene expression. We predicted projection strengths in two ways, from the expression of individual genes and from the co-expression of genes organized in spatial modules, and we also predicted binarized projections. For projection strength, ridge (L2-regularized) regression had the highest cross-validated accuracy, with a median r of 0.54, which for binarized predictions corresponded to a median area under the ROC curve of 0.89. Next, we identified 200 spatial gene modules using a dictionary learning and sparse coding approach. These modules yielded predictions of comparable accuracy, with a median r of 0.51. Finally, a gene ontology enrichment analysis of the most predictive gene groups yielded significant annotations related to postsynaptic function. Taken together, we have demonstrated a prediction workflow that can be used for multimodal data integration to improve the accuracy of the predicted mesoconnectome and to support other neuroscience use cases.
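The two prediction routes described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `cv_ridge_scores` and `spatial_module_features` are hypothetical, the data are synthetic, and the regularization strength, the median binarization threshold, the fold count, and the orientation of the dictionary factorization (sparse codes over genes, dense atoms over voxels) are all assumptions made for illustration. It relies on scikit-learn's `Ridge`, `cross_val_predict`, `roc_auc_score`, and `DictionaryLearning`.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.linear_model import Ridge
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import KFold, cross_val_predict


def cv_ridge_scores(X, y, alpha=1.0, n_splits=5, seed=0):
    """Cross-validated ridge prediction of one projection pattern.

    X: (n_voxels, n_features) predictors (gene expression or module features).
    y: (n_voxels,) projection strength.
    Returns (Pearson r, ROC AUC) computed on out-of-fold predictions.
    """
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    y_pred = cross_val_predict(Ridge(alpha=alpha), X, y, cv=cv)
    r = np.corrcoef(y, y_pred)[0, 1]
    # Assumption: binarize the true strengths at their median and
    # treat the continuous predictions as ranking scores.
    y_bin = (y > np.median(y)).astype(int)
    auc = roc_auc_score(y_bin, y_pred)
    return r, auc


def spatial_module_features(X, n_modules=5, seed=0):
    """Summarize gene expression as a few spatial modules.

    Assumption: each gene's spatial pattern (a row of X.T) is modeled as a
    sparse combination of dictionary atoms defined over voxels, so the atoms
    play the role of spatial modules.
    Returns an (n_voxels, n_modules) feature matrix.
    """
    dl = DictionaryLearning(
        n_components=n_modules,
        alpha=1.0,
        max_iter=50,
        transform_algorithm="lasso_lars",
        random_state=seed,
    )
    dl.fit(X.T)                # samples = genes, features = voxels
    return dl.components_.T    # (n_voxels, n_modules)


if __name__ == "__main__":
    # Synthetic stand-in for volumetric data: 200 voxels, 40 genes.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 40))
    y = X @ rng.normal(size=40) + 0.5 * rng.normal(size=200)

    print("individual genes:", cv_ridge_scores(X, y))
    modules = spatial_module_features(X, n_modules=5)
    print("spatial modules: ", cv_ridge_scores(modules, y))
```

On real data, one such model would be fitted per cell-class and layer-specific projection pattern and the scores summarized by their median; random synthetic genes have no shared spatial structure, so the module route is not expected to perform well in this toy setting.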