Boldina Galina, Fogel Paul, Rocher Corinne, Bettembourg Charles, Luta George, Augé Franck
Sanofi, R&D Translational Sciences France, Bioinformatics, Sanofi, F-91385 Chilly-Mazarin Cedex, France.
Consultant, F-75006 Paris, France.
Bioinformatics. 2022 Jan 27;38(4):1015-1021. doi: 10.1093/bioinformatics/btab773.
Molecular signatures are critical for inferring the proportions of cell types from bulk transcriptomics data. However, identifying these signatures typically relies on prior biological knowledge of the cell types being studied. When working with less well-characterized biological material, a data-driven approach is required to uncover the underlying classes and generate ad hoc signatures from healthy or pathogenic tissue.
We present a new approach, A2Sign: Agnostic Algorithms for Signatures, based on a non-negative tensor factorization (NTF) strategy that allows us to identify cell-type-specific molecular signatures, greatly reduce collinearities and also account for inter-individual variability. We propose a global framework that can be applied to uncover molecular signatures for cell-type deconvolution in arbitrary tissues using bulk transcriptome data. We also present two new molecular signatures for deconvolution of up to 16 immune cell types using microarray or RNA-seq data.
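To make the NTF idea concrete, the following is a minimal sketch of a non-negative CP (PARAFAC) factorization of a three-way tensor using multiplicative updates. It is not the NMTF package's API and not necessarily the exact algorithm used in A2Sign. The function names (`ntf_cp`, `reconstruct`, `rel_error`), the synthetic data, and the assumed tensor layout (genes × cell-type samples × individuals or datasets) are illustrative assumptions only.

```python
import numpy as np


def reconstruct(A, B, C):
    """Rebuild the tensor as a sum of rank-one terms a_r (x) b_r (x) c_r."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def rel_error(X, A, B, C):
    """Relative Frobenius reconstruction error."""
    return np.linalg.norm(X - reconstruct(A, B, C)) / np.linalg.norm(X)


def ntf_cp(X, rank, n_iter=500, eps=1e-12, seed=0):
    """Non-negative CP factorization of a 3-way tensor (hypothetical helper).

    Uses Lee-Seung style multiplicative updates for the Frobenius loss.
    Assumed layout: axis 0 = genes, axis 1 = cell-type samples,
    axis 2 = individuals or datasets.
    """
    rng = np.random.default_rng(seed)
    n_i, n_j, n_k = X.shape
    # Non-negative random initialization keeps every later update non-negative.
    A = rng.random((n_i, rank))
    B = rng.random((n_j, rank))
    C = rng.random((n_k, rank))
    for _ in range(n_iter):
        # Each factor is scaled by (data projected on the other two factors)
        # divided by (current model projected on the other two factors).
        A *= np.einsum("ijk,jr,kr->ir", X, B, C) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum("ijk,ir,kr->jr", X, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum("ijk,ir,jr->kr", X, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
    return A, B, C


# Synthetic example: 20 "genes", 6 "cell-type samples", 4 "individuals",
# generated from 3 hidden non-negative components (made-up data).
rng = np.random.default_rng(1)
X_demo = reconstruct(rng.random((20, 3)), rng.random((6, 3)), rng.random((4, 3)))
A_hat, B_hat, C_hat = ntf_cp(X_demo, rank=3)
print("relative reconstruction error:", rel_error(X_demo, A_hat, B_hat, C_hat))
```

Under this assumed layout, the columns of the gene-mode factor play the role of candidate signatures, while the third-mode factor absorbs variation across individuals or datasets. Multiplicative updates are chosen here only because they are short and preserve non-negativity; other solvers would serve equally well for illustration.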
All steps of our analysis were implemented in annotated Python notebooks (https://github.com/paulfogel/A2SIGN). To perform NTF, we used the NMTF package, which can be installed with pip.
Supplementary data are available at Bioinformatics online.