
Systematically linking tranSMART, Galaxy and EGA for reusing human translational research data.

Authors

Zhang Chao, Bijlard Jochem, Staiger Christine, Scollen Serena, van Enckevort David, Hoogstrate Youri, Senf Alexander, Hiltemann Saskia, Repo Susanna, Pipping Wibo, Bierkens Mariska, Payralbe Stefan, Stringer Bas, Heringa Jaap, Stubbs Andrew, Bonino Da Silva Santos Luiz Olavo, Belien Jeroen, Weistra Ward, Azevedo Rita, van Bochove Kees, Meijer Gerrit, Boiten Jan-Willem, Rambla Jordi, Fijneman Remond, Spalding J Dylan, Abeln Sanne

Affiliations

Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, 1081 HV, Netherlands.

The Hyve, Utrecht, 3511 MJ, Netherlands.

Publication information

F1000Res. 2017 Aug 16;6. doi: 10.12688/f1000research.12168.1. eCollection 2017.

DOI:10.12688/f1000research.12168.1
PMID:29123641
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5657030/
Abstract

The availability of high-throughput molecular profiling techniques has provided more accurate and informative data for regular clinical studies. Nevertheless, complex computational workflows are required to interpret these data. Over the past years, the data volume has been growing explosively, requiring robust human data management to organise and integrate the data efficiently. For this reason, we set up an ELIXIR implementation study, together with the Translational research IT (TraIT) programme, to design a data ecosystem that is able to link raw and interpreted data. In this project, the data from the TraIT Cell Line Use Case (TraIT-CLUC) are used as a test case for this system. Within this ecosystem, we use the European Genome-phenome Archive (EGA) to store raw molecular profiling data; tranSMART to collect interpreted molecular profiling data and clinical data for corresponding samples; and Galaxy to store, run and manage the computational workflows. We can integrate these data by linking their repositories systematically. To showcase our design, we have structured the TraIT-CLUC data, which contain a variety of molecular profiling data types, for storage in both tranSMART and EGA. The metadata provided allows referencing between tranSMART and EGA, fulfilling the cycle of data submission and discovery; we have also designed a data flow from EGA to Galaxy, enabling reanalysis of the raw data in Galaxy. In this way, users can select patient cohorts in tranSMART, trace them back to the raw data and perform (re)analysis in Galaxy. Our conclusion is that the majority of metadata does not necessarily need to be stored (redundantly) in both databases, but that instead FAIR persistent identifiers should be available for well-defined data ontology levels: study, data access committee, physical sample, data sample and raw data file. This approach will pave the way for the stable linkage and reuse of data.
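The linkage idea in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that each of the five ontology levels named above (study, data access committee, physical sample, data sample, raw data file) carries one persistent identifier, and that a cohort selected on the tranSMART side is expressed as a set of physical-sample identifiers. The class names, the function `trace_cohort_to_raw_files`, and every identifier string are hypothetical placeholders; no tranSMART, EGA, or Galaxy API is called.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class DataSample:
    """One molecular profiling measurement taken from a physical sample."""
    data_sample_id: str               # persistent ID of the data sample
    physical_sample_id: str           # persistent ID of the physical sample
    raw_file_ids: Tuple[str, ...]     # persistent IDs of the raw data files


@dataclass(frozen=True)
class Study:
    """Top level of the hierarchy: a study governed by one access committee."""
    study_id: str                     # persistent ID of the study
    dac_id: str                       # persistent ID of the data access committee
    data_samples: Tuple[DataSample, ...]


def trace_cohort_to_raw_files(
    study: Study, cohort: Iterable[str]
) -> Dict[str, List[str]]:
    """Map each selected physical sample to the raw files derived from it.

    Only identifiers are followed; no other metadata is copied between the
    two sides, which mirrors the abstract's conclusion about redundancy.
    """
    wanted = set(cohort)
    traced: Dict[str, List[str]] = {}
    for sample in study.data_samples:
        if sample.physical_sample_id in wanted:
            traced.setdefault(sample.physical_sample_id, []).extend(
                sample.raw_file_ids
            )
    return traced


# Hypothetical placeholder identifiers, invented for this example only.
EXAMPLE_STUDY = Study(
    study_id="study:0001",
    dac_id="dac:0001",
    data_samples=(
        DataSample("data-sample:A1", "physical-sample:A", ("file:001", "file:002")),
        DataSample("data-sample:A2", "physical-sample:A", ("file:003",)),
        DataSample("data-sample:B1", "physical-sample:B", ("file:004",)),
    ),
)

if __name__ == "__main__":
    print(trace_cohort_to_raw_files(EXAMPLE_STUDY, ["physical-sample:A"]))
```

In this sketch the list of raw-file identifiers returned for a cohort is what a reanalysis step would request from the archive, subject to approval by the committee named in `dac_id`.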


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e5f8/5657030/bb7d25b087d3/f1000research-6-13170-g0000.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e5f8/5657030/3ed311c0b79d/f1000research-6-13170-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e5f8/5657030/3eeb5ce36761/f1000research-6-13170-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e5f8/5657030/57ad2a4544cf/f1000research-6-13170-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e5f8/5657030/440d36b8b6bb/f1000research-6-13170-g0004.jpg


