Centre de Recherches de Biochimie Macromoléculaire UMR 5237, CNRS, University of Montpellier 1 and 2, Montpellier, France.
Proteomics. 2012 May;12(9):1333-6. doi: 10.1002/pmic.201100534.
Rapidly increasing genomic data present new challenges for scientists: making sense of millions of amino acid sequences requires a systematic approach and information about their 3D structure, function, and evolution. Over the last decade, numerous studies have demonstrated the fundamental importance of protein tandem repeats and their involvement in human diseases. Bioinformatics analysis of these regions requires specialized computer programs and databases, because conventional approaches, developed predominantly for globular domains, have limited success with them. To perform a global comparative analysis of protein tandem repeats, we developed the Protein Tandem Repeat DataBase (PRDB). PRDB is a curated database of the protein tandem repeats found in sequence databanks by the T-REKS program. The database is available at http://bioinfo.montp.cnrs.fr/?r=repeatDB.