Universal prediction of cell-cycle position using transfer learning.

Author affiliations

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, USA.

Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, USA.

Publication information

Genome Biol. 2022 Jan 31;23(1):41. doi: 10.1186/s13059-021-02581-y.

Abstract

BACKGROUND

The cell cycle is a highly conserved, continuous process that controls the faithful replication and division of cells. Single-cell technologies have enabled increasingly precise measurements of the cell cycle, both as a biological process of interest and as a possible confounding factor. Despite its importance and conservation, there is no universally applicable approach to infer position in the cell cycle at high resolution from single-cell RNA-seq data.

RESULTS

Here, we present tricycle, an R/Bioconductor package that addresses this challenge by leveraging key features of cell-cycle biology, the mathematical properties of principal component analysis of periodic functions, and transfer learning. We estimate a cell-cycle embedding using a fixed reference dataset and project new data into this reference embedding, an approach that overcomes key limitations of learning a dataset-dependent embedding. Tricycle then predicts a cell-specific position in the cell cycle based on the data projection. The accuracy of tricycle compares favorably to gold-standard experimental assays, which generally require specialized measurements in specifically constructed in vitro systems. Using internal controls that are available for any dataset, we show that tricycle predictions generalize to datasets with multiple cell types and across tissues, species, and even sequencing assays.
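The projection-then-prediction idea described above can be illustrated with a minimal sketch. This is not the tricycle package's code or API. It assumes a two-dimensional reference embedding given as per-gene loading pairs, assumes each gene is centered across the new dataset before projection, and assumes the position is read off as the polar angle of the projected point, a detail the abstract does not spell out. The function names and toy values are hypothetical.

```python
import math


def project_onto_reference(expr, weights):
    """Project cells onto a fixed 2-D reference embedding.

    expr:    list of cells, each a list of per-gene expression values
    weights: list of per-gene (w1, w2) loadings learned on a reference
             dataset (hypothetical values; not taken from the paper)
    """
    n_genes = len(weights)
    # Assumption: center each gene across the cells of the new dataset.
    means = [sum(cell[g] for cell in expr) / len(expr) for g in range(n_genes)]
    coords = []
    for cell in expr:
        x = sum((cell[g] - means[g]) * weights[g][0] for g in range(n_genes))
        y = sum((cell[g] - means[g]) * weights[g][1] for g in range(n_genes))
        coords.append((x, y))
    return coords


def cycle_position(coords):
    """Map each projected point to an angle in [0, 2*pi).

    Assumption: position in the cycle is taken as the polar angle
    around the origin of the reference embedding.
    """
    return [math.atan2(y, x) % (2 * math.pi) for x, y in coords]


if __name__ == "__main__":
    # Toy data: two "genes" whose loadings span the two embedding axes.
    toy_weights = [(1.0, 0.0), (0.0, 1.0)]
    toy_expr = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
    for theta in cycle_position(project_onto_reference(toy_expr, toy_weights)):
        print(round(theta / math.pi, 2), "* pi")
```

Because the loadings are fixed by the reference dataset, the same angle has the same meaning in every projected dataset, which is the property that a dataset-dependent embedding lacks.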

CONCLUSIONS

Tricycle generalizes across datasets and is highly scalable and applicable to atlas-level single-cell RNA-seq data.


