

Rapid single cell evaluation of human disease and disorder targets using REVEAL: SingleCell™.

Affiliations

Informatics & Predictive Sciences, Bristol Myers Squibb, Princeton, NJ, 08648, USA.

Paradigm4, Inc., Suite 360, 281 Winter Street, Waltham, MA, 02451, USA.

Publication information

BMC Genomics. 2021 Jan 6;22(1):5. doi: 10.1186/s12864-020-07300-8.

Abstract

BACKGROUND

Single-cell (sc) sequencing performs unbiased profiling of individual cells and enables evaluation of less prevalent cellular populations that are often missed by bulk sequencing. However, the scale and complexity of sc datasets pose a great challenge to their utility, and this problem is further exacerbated when working with the larger datasets typically generated by consortium efforts. As the scale of single-cell datasets continues to increase exponentially, there is an unmet technological need for database platforms that can evaluate key biological hypotheses by querying extensive single-cell datasets. Large single-cell datasets such as the Human Cell Atlas and the COVID-19 Cell Atlas (collections of annotated sc datasets from various human organs) are excellent resources for profiling target genes involved in human diseases and disorders ranging from oncology and autoimmunity to infectious diseases such as COVID-19, caused by the SARS-CoV-2 virus. SARS-CoV-2 infections have led to a worldwide pandemic with massive loss of life and more than 7 million cases. The virus uses ACE2 and TMPRSS2, proteins expressed in human cells, as key viral entry-associated proteins for infection. Evaluating the expression profile of key genes in large single-cell datasets can facilitate testing for diagnostics, therapeutics, and vaccine targets as the world struggles to cope with the ongoing spread of COVID-19 infections.

MAIN BODY

In this manuscript we describe REVEAL: SingleCell, which enables storage, retrieval, and rapid querying of single-cell datasets comprising millions of cells. The array-native database described here enables selecting and analyzing cells across multiple studies. Cells can be selected using individual metadata tags, more complex hierarchical ontology filtering, and gene expression threshold ranges, including co-expression of multiple genes. The tags on the selected cells can then be evaluated to test biological hypotheses; one example is identifying the most prevalent cell type annotation tag among the returned cells. We used REVEAL: SingleCell to evaluate the expression of key SARS-CoV-2 entry-associated genes, querying the current database (2.2 million cells, 32 projects) and obtaining results in < 60 s. We highlight that cells expressing COVID-19-associated genes are found in multiple tissue types, which in part explains the multi-organ involvement observed in infected patients worldwide during the ongoing COVID-19 pandemic.
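The selection logic described above (metadata tags, expression threshold ranges, co-expression, and the most prevalent annotation tag) can be illustrated with a minimal sketch in plain Python. This is not the REVEAL: SingleCell API or its array-native storage. The function names `select_cells` and `most_prevalent_tag`, the record layout, and every value in the toy data are assumptions invented for illustration, and hierarchical ontology filtering is left out.

```python
from collections import Counter

# Toy records: every study name, tissue, cell type, and expression value
# below is invented for illustration and is not taken from the paper.
CELLS = [
    {"study": "study_A", "tissue": "lung", "cell_type": "AT2",
     "expr": {"ACE2": 2.1, "TMPRSS2": 3.0}},
    {"study": "study_A", "tissue": "lung", "cell_type": "AT2",
     "expr": {"ACE2": 1.4, "TMPRSS2": 0.0}},
    {"study": "study_B", "tissue": "ileum", "cell_type": "enterocyte",
     "expr": {"ACE2": 4.2, "TMPRSS2": 1.8}},
    {"study": "study_B", "tissue": "ileum", "cell_type": "enterocyte",
     "expr": {"ACE2": 3.3, "TMPRSS2": 2.2}},
    {"study": "study_C", "tissue": "kidney", "cell_type": "proximal tubule",
     "expr": {"ACE2": 2.7, "TMPRSS2": 0.9}},
    {"study": "study_C", "tissue": "blood", "cell_type": "T cell",
     "expr": {"ACE2": 0.0, "TMPRSS2": 0.0}},
]


def select_cells(cells, metadata=None, thresholds=None):
    """Return cells that match every metadata tag and every expression range.

    metadata:   {tag: required value}
    thresholds: {gene: (low, high)}, inclusive bounds; listing several
                genes selects cells that co-express all of them.
    """
    metadata = metadata or {}
    thresholds = thresholds or {}
    selected = []
    for cell in cells:
        # Skip the cell if any metadata tag differs from the requested value.
        if any(cell.get(tag) != value for tag, value in metadata.items()):
            continue
        # Keep the cell only if every gene falls inside its range;
        # a gene missing from the record is treated as expression 0.0.
        if all(low <= cell["expr"].get(gene, 0.0) <= high
               for gene, (low, high) in thresholds.items()):
            selected.append(cell)
    return selected


def most_prevalent_tag(cells, tag):
    """Return (value, count) for the most common value of `tag`, or None."""
    counts = Counter(cell[tag] for cell in cells)
    return counts.most_common(1)[0] if counts else None


# Cells co-expressing both entry-associated genes at or above an
# arbitrary illustrative threshold of 1.0.
coexpressing = select_cells(
    CELLS,
    thresholds={"ACE2": (1.0, float("inf")), "TMPRSS2": (1.0, float("inf"))},
)
print(most_prevalent_tag(coexpressing, "cell_type"))
print(sorted({cell["tissue"] for cell in coexpressing}))
```

A linear scan like this only conveys the query semantics; answering the same question over millions of cells in under a minute is what the database described here is designed for.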

CONCLUSION

In this paper, we introduce the REVEAL: SingleCell database that addresses immediate needs for SARS-CoV-2 research and has the potential to be used more broadly for many precision medicine applications. We used the REVEAL: SingleCell database as a reference to ask questions relevant to drug development and precision medicine regarding cell type and co-expression for genes that encode proteins necessary for SARS-CoV-2 to enter and reproduce in cells.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf50/7786998/b7bdb658d354/12864_2020_7300_Fig1_HTML.jpg
