Lehne Benjamin, Schlitt Thomas
Department of Medical and Molecular Genetics, King's College London, Guy's Campus, London, UK.
Hum Genomics. 2009 Apr;3(3):291-7. doi: 10.1186/1479-7364-3-3-291.
Over the past few years, the number of known protein-protein interactions has increased substantially. To make this information more readily accessible, several public databases have set out to collect and store protein-protein interaction data. Here, protein-protein interactions were retrieved from six major databases and integrated, and the results were compared. The six databases (the Biological General Repository for Interaction Datasets [BioGRID], the Molecular INTeraction database [MINT], the Biomolecular Interaction Network Database [BIND], the Database of Interacting Proteins [DIP], the IntAct molecular interaction database [IntAct] and the Human Protein Reference Database [HPRD]) differ in scope and content; integrating all of the datasets is non-trivial owing to differences in data annotation. For human protein-protein interaction data, HPRD seems to be the most comprehensive. No single database is complete, however: to obtain a complete dataset, interactions from all six databases have to be combined. To overcome this limitation, meta-databases such as the Agile Protein Interaction Database (APID) offer access to integrated protein-protein interaction datasets, although these also currently have certain restrictions.
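The integration step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`normalize_pair`, `integrate`, `coverage`), the database labels and the protein identifiers are all hypothetical. It also assumes that every source already uses one shared identifier namespace, which is precisely the annotation problem the abstract calls non-trivial.

```python
from collections import defaultdict


def normalize_pair(a, b):
    """Return an order-independent key for an undirected interaction.

    Hypothetical normalisation: trims whitespace and upper-cases the
    identifiers. Real data would need mapping to a common namespace first.
    """
    return tuple(sorted((a.strip().upper(), b.strip().upper())))


def integrate(sources):
    """Map each unique interaction to the set of databases reporting it.

    `sources` is a dict: database label -> iterable of (protein_a, protein_b).
    """
    merged = defaultdict(set)
    for db_name, pairs in sources.items():
        for a, b in pairs:
            merged[normalize_pair(a, b)].add(db_name)
    return dict(merged)


def coverage(merged):
    """Fraction of the combined interaction set found in each database."""
    total = len(merged)
    if total == 0:
        return {}
    counts = defaultdict(int)
    for dbs in merged.values():
        for db in dbs:
            counts[db] += 1
    return {db: n / total for db, n in counts.items()}


# Toy input with placeholder identifiers; not real database content.
toy_sources = {
    "db_one": [("P1", "P2"), ("P2", "P3")],
    "db_two": [("p2", "p1"), ("P3", "P4")],
}
merged = integrate(toy_sources)
cov = coverage(merged)
```

In this toy input, the pair reported as `("P1", "P2")` by one source and `("p2", "p1")` by the other collapses into a single interaction, so the combined set is larger than either source alone. This mirrors the abstract's point that a complete dataset requires combining all of the databases.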