Building a FAIR data ecosystem for incorporating single-cell transcriptomics data into agricultural genome to phenome research.

Authors

Kapoor Muskan, Ventura Enrique Sapena, Walsh Amy, Sokolov Alexey, George Nancy, Kumari Sunita, Provart Nicholas J, Cole Benjamin, Libault Marc, Tickle Timothy, Warren Wesley C, Koltes James E, Papatheodorou Irene, Ware Doreen, Harrison Peter W, Elsik Christine, Yordanova Galabina, Burdett Tony, Tuggle Christopher K

Affiliations

Department of Animal Science, Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, United States.

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, Cambridgeshire, United Kingdom.

Publication information

Front Genet. 2024 Nov 29;15:1460351. doi: 10.3389/fgene.2024.1460351. eCollection 2024.

DOI: 10.3389/fgene.2024.1460351
PMID: 39678381
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11638175/

Abstract

INTRODUCTION

The agriculture genomics community has numerous data submission standards available, but the standards for describing and storing single-cell (SC, e.g., scRNA-seq) data are comparatively underdeveloped.

METHODS

To bridge this gap, we leveraged recent advancements in human genomics infrastructure, such as the integration of the Human Cell Atlas Data Portal with Terra, a secure, scalable, open-source platform for biomedical researchers to access data, run analysis tools, and collaborate. In parallel, the Single Cell Expression Atlas at EMBL-EBI offers a comprehensive data ingestion portal for high-throughput sequencing datasets, including plants, protists, and animals (including humans). Developing data tools connecting these resources would offer significant advantages to the agricultural genomics community. The FAANG data portal at EMBL-EBI emphasizes delivering rich metadata and highly accurate and reliable annotation of farmed animals but is not computationally linked to either of these resources.

RESULTS

Herein, we describe a pilot-scale project that determines whether the current FAANG metadata standards for livestock can be used to ingest scRNA-seq datasets into Terra in a manner consistent with HCA Data Portal standards. Importantly, rich scRNA-seq metadata can now be brokered through the FAANG data portal using a semi-automated process, thereby avoiding the need for substantial expert curation. We have further extended the functionality of this tool so that validated and ingested SC files within the HCA Data Portal are transferred to Terra for further analysis. In addition, we verified data ingestion into Terra, hosted on Azure, and demonstrated the use of a workflow to analyze the first ingested porcine scRNA-seq dataset. We have also developed prototype tools to visualize the output of scRNA-seq analyses on genome browsers to compare gene expression patterns across tissues and cell populations. This JBrowse tool now features distinct tracks, showcasing PBMC scRNA-seq alongside two bulk RNA-seq experiments.
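
As a rough illustration of the kind of check a semi-automated metadata brokering step might perform, the sketch below separates sample records that have every required field from those that would still need manual curation. It is not the authors' tool: the field names in `REQUIRED_FIELDS` and the function names are hypothetical and are not taken from the FAANG or HCA schemas, which define their own required attributes.

```python
# Hypothetical required fields, used for illustration only.
# The real FAANG and HCA metadata schemas define their own attributes.
REQUIRED_FIELDS = ("sample_id", "species", "tissue", "library_type")


def find_metadata_problems(record):
    """Return a list of readable problems found in one sample record."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        # Treat absent keys, None, and blank strings as missing.
        if value is None or str(value).strip() == "":
            problems.append(f"missing required field: {field}")
    return problems


def split_records(records):
    """Separate records that pass the check from those needing curation."""
    accepted = []
    needs_review = []
    for record in records:
        problems = find_metadata_problems(record)
        if problems:
            needs_review.append((record, problems))
        else:
            accepted.append(record)
    return accepted, needs_review
```

Calling `split_records` on a list of dictionaries returns the records ready for submission together with the flagged records and the reason each was flagged, so that expert attention is needed only for the second group.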

DISCUSSION

We intend to further build upon these existing tools to construct a scientist-friendly data resource and analytical ecosystem based on Findable, Accessible, Interoperable, and Reusable (FAIR) SC principles to facilitate SC-level genomic analysis through data ingestion, storage, retrieval, re-use, visualization, and comparative annotation across agricultural species.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4490/11638175/160008a9a7f6/fgene-15-1460351-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4490/11638175/9f275b8bee70/fgene-15-1460351-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4490/11638175/1cc39a420ef7/fgene-15-1460351-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4490/11638175/a7b44be39733/fgene-15-1460351-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4490/11638175/09250cf5a0d0/fgene-15-1460351-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4490/11638175/621246280426/fgene-15-1460351-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4490/11638175/6a804ed24792/fgene-15-1460351-g007.jpg
