Chen Ziwei, Zhang Bingwei, Gong Fuzhou, Wan Lin, Ma Liang
Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, United States.
Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China.
Front Genet. 2023 Mar 8;14:1110899. doi: 10.3389/fgene.2023.1110899. eCollection 2023.
Robust Principal Component Analysis (RPCA) is a powerful tool for recovering a low-rank matrix from highly corrupted data, and it has growing applications in computational biology. Biological processes commonly form intrinsic hierarchical structures, such as the tree structures of cell development trajectories and tumor evolutionary history. The rapid development of single-cell sequencing (SCS) technology calls for methods that recover embedded tree structures from noisy and heterogeneous SCS data. In this study, we propose RobustTree, a unified framework for reconstructing the inherent topological structure underlying noisy high-dimensional data. By extending RPCA to handle tree structure optimization, RobustTree combines data denoising, clustering, and tree structure reconstruction. It solves the tree optimization problem with an adaptive parameter selection scheme that we propose. In addition to recovering real datasets, RobustTree can reconstruct both the continuous and the discrete-state topological structures underlying SCS data. We apply RobustTree to multiple synthetic and real datasets and demonstrate its high accuracy and robustness when analyzing high-noise SCS data with embedded complex structures. The code is available at https://github.com/ucasdp/RobustTree.
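For readers unfamiliar with RPCA, the low-rank recovery that the abstract builds on can be illustrated with a minimal sketch. The code below is not RobustTree and is not taken from the authors' repository; it assumes the classical principal component pursuit formulation, which splits an observed matrix M into a low-rank part L and a sparse part S by minimizing ||L||_* + lam * ||S||_1 subject to L + S = M, solved here with a basic augmented Lagrangian iteration. The function names (shrink, svd_threshold, rpca_pcp) and the default values of lam and mu are illustrative assumptions, and the sketch omits the tree structure term that RobustTree adds.

```python
import numpy as np


def shrink(X, tau):
    """Elementwise soft-thresholding (proximal operator of the L1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def svd_threshold(X, tau):
    """Singular value thresholding (proximal operator of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt


def rpca_pcp(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Split M into low-rank L and sparse S (classical RPCA, not RobustTree).

    lam and mu default to commonly used heuristics; both are assumptions
    of this sketch rather than values taken from the paper.
    """
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    if mu is None:
        mu = m * n / (4.0 * np.abs(M).sum())

    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # Lagrange multiplier for the constraint L + S = M
    norm_M = np.linalg.norm(M, "fro")

    for _ in range(max_iter):
        # Update the low-rank part with the sparse part held fixed.
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)
        # Update the sparse part with the low-rank part held fixed.
        S = shrink(M - L + Y / mu, lam / mu)
        # Dual ascent step on the constraint residual.
        R = M - L - S
        Y = Y + mu * R
        if np.linalg.norm(R, "fro") <= tol * norm_M:
            break
    return L, S
```

Calling rpca_pcp on a cells-by-genes matrix would return a denoised low-rank estimate and a sparse residual; a framework such as RobustTree would then add clustering and tree constraints on top of this kind of decomposition, in a way this sketch does not attempt to reproduce.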