The federated database--a basis for biobank-based post-genome studies, integrating phenome and genome data from 600,000 twin pairs in Europe.

Authors

Muilu Juha, Peltonen Leena, Litton Jan-Eric

Affiliation

Finnish Genome Center, University of Helsinki, Helsinki, Finland.

Publication

Eur J Hum Genet. 2007 Jul;15(7):718-23. doi: 10.1038/sj.ejhg.5201850. Epub 2007 May 9.

DOI:10.1038/sj.ejhg.5201850
PMID:17487219
Abstract

Integration of complex data and data management represent major challenges in large-scale biobank-based post-genome era research projects like GenomEUtwin (an international collaboration between eight Twin Registries) with extensive amounts of genotype and phenotype data combined from different data sources located in different countries. The challenge lies not only in data harmonization and constant update of clinical details in various locations, but also in the heterogeneity of data storage and confidentiality of sensitive health-related and genetic data. Solid infrastructure must be built to provide secure, but easily accessible and standardized, data exchange also facilitating statistical analyses of the stored data. Data collection sites desire to have full control of the accumulation of data, and at the same time the integration should facilitate effortless slicing and dicing of the data for different types of data pooling and study designs. Here we describe how we constructed a federated database infrastructure for genotype and phenotype information collected in seven European countries and Australia and connected this database setting via a network called TwinNET to guarantee effortless data exchange and pooled analyses. This federated database system offers a powerful facility for combining different types of information from multiple data sources. The system is transparent to end users and application developers, since it makes the set of federated data sources look like a single system. The user need not be aware of the format or site where the data are stored, the language or programming interface of the data source, how the data are physically stored, whether they are partitioned and/or replicated or what networking protocols are used. The user sees a single standardized interface with the desired data elements for pooled analyses.
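
The idea of a federated view can be illustrated with a minimal Python sketch. It is not the TwinNET implementation, which the abstract does not detail. Everything here is a hypothetical assumption made for illustration: the two sites, their table and column names, the `TwinRecord` schema, and the `FederatedView` class. Each site keeps its own storage format, and a per-site mapping function harmonizes rows so the analyst sees one interface.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable, Iterator, List


@dataclass(frozen=True)
class TwinRecord:
    """Harmonized record seen by the analyst (hypothetical schema)."""
    site: str
    pair_id: str
    zygosity: str      # "MZ" or "DZ"
    height_cm: float


@dataclass
class Site:
    """One data source that keeps control of its own storage."""
    name: str
    conn: sqlite3.Connection
    query: str                                     # site-local SQL
    to_record: Callable[[str, tuple], TwinRecord]  # local row -> harmonized


def make_site_a() -> Site:
    # Hypothetical site A: zygosity as text, height in centimetres.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE twins (pair TEXT, zyg TEXT, height_cm REAL)")
    conn.executemany("INSERT INTO twins VALUES (?, ?, ?)",
                     [("A1", "MZ", 170.0), ("A2", "DZ", 182.5)])
    return Site("site_a", conn, "SELECT pair, zyg, height_cm FROM twins",
                lambda site, r: TwinRecord(site, r[0], r[1], r[2]))


def make_site_b() -> Site:
    # Hypothetical site B: zygosity as integer code, height in millimetres.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pairs (id TEXT, zyg_code INTEGER, height_mm INTEGER)")
    conn.executemany("INSERT INTO pairs VALUES (?, ?, ?)",
                     [("B1", 1, 1650), ("B2", 2, 1800)])
    codes = {1: "MZ", 2: "DZ"}
    return Site("site_b", conn, "SELECT id, zyg_code, height_mm FROM pairs",
                lambda site, r: TwinRecord(site, r[0], codes[r[1]], r[2] / 10))


class FederatedView:
    """Single interface over several sites; the caller never sees local formats."""

    def __init__(self, sites: List[Site]) -> None:
        self.sites = list(sites)

    def records(self) -> Iterator[TwinRecord]:
        # Query each site in its own dialect, then harmonize row by row.
        for site in self.sites:
            for row in site.conn.execute(site.query):
                yield site.to_record(site.name, row)

    def mean_height(self, zygosity: str) -> float:
        # A pooled analysis computed over the harmonized records.
        heights = [r.height_cm for r in self.records() if r.zygosity == zygosity]
        return sum(heights) / len(heights)


view = FederatedView([make_site_a(), make_site_b()])
pooled = list(view.records())
```

Calling `view.mean_height("MZ")` pools the monozygotic pairs from both sites without the caller knowing that one site stores millimetres and coded zygosity. A real deployment would also need access control and network transport between sites, both of which this sketch omits.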


