Department of Biochemistry and Molecular Biology, University College London, Gower St, London WC1 6BT, UK.
Nucleic Acids Res. 2010 Jan;38(Database issue):D296-300. doi: 10.1093/nar/gkp987. Epub 2009 Nov 11.
Over the last 2 years the Gene3D resource has been significantly improved: it is now more accurate and offers a much richer interactive display via the Gene3D website (http://gene3d.biochem.ucl.ac.uk/). Gene3D provides accurate structural domain family assignments for over 1100 genomes and nearly 10,000,000 proteins. A hidden Markov model library, constructed from the manually curated CATH structural domain hierarchy, is used to search UniProt, RefSeq and Ensembl protein sequences. The resulting matches are refined into simple multi-domain architectures using a recently developed in-house algorithm, DomainFinder 3 (available at: ftp://ftp.biochem.ucl.ac.uk/pub/gene3d_data/DomainFinder3/). The domain assignments are integrated with multiple external protein function descriptions (e.g. Gene Ontology and KEGG), structural annotations (e.g. coiled coils, disordered regions and sequence polymorphisms) and family resources (e.g. Pfam and eggNOG), and are displayed on the Gene3D website. The website allows users to view descriptions of single proteins and genes as well as of large protein sets, such as superfamilies or genomes. Subsets can then be selected for detailed investigation, or the associated functions and interactions can be used to extend the exploration to new proteins. Gene3D also provides a set of services, including an interactive genome coverage graph visualizer, DAS annotation resources, sequence search facilities and SOAP services.
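The refinement of overlapping HMM matches into a single multi-domain architecture can be illustrated with a minimal sketch. The abstract does not describe how DomainFinder 3 works internally, so the code below is not that algorithm; it substitutes plain weighted interval scheduling, which picks the set of non-overlapping matches with the highest total score. The `Hit` type, its fields, the function name `resolve_architecture` and the example scores are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from typing import NamedTuple


class Hit(NamedTuple):
    """One HMM match on a protein (hypothetical record layout)."""
    family: str   # superfamily label assigned to the match
    start: int    # first matched residue, 1-based inclusive
    end: int      # last matched residue, inclusive
    score: float  # assumed match score, higher is better


def resolve_architecture(hits):
    """Return the highest-scoring set of non-overlapping hits, in sequence order.

    Illustrative weighted interval scheduling, not DomainFinder 3 itself.
    """
    ordered = sorted(hits, key=lambda h: h.end)
    ends = [h.end for h in ordered]
    # best[i] holds (total score, chosen hits) using only the first i hits.
    best = [(0.0, [])]
    for i, hit in enumerate(ordered):
        # k = number of earlier hits that end before this hit starts.
        k = bisect_right(ends, hit.start - 1, 0, i)
        take_score = best[k][0] + hit.score
        if take_score > best[i][0]:
            best.append((take_score, best[k][1] + [hit]))
        else:
            best.append(best[i])
    return best[-1][1]


# Made-up matches on one protein; the middle hit overlaps both neighbours.
example_hits = [
    Hit("3.40.50.300", 5, 120, 80.0),
    Hit("1.10.10.10", 100, 180, 40.0),
    Hit("2.60.40.10", 125, 260, 70.0),
]
architecture = resolve_architecture(example_hits)
print([h.family for h in architecture])
```

In this toy input the middle match is dropped because keeping the two flanking matches gives a higher combined score. A production method would also have to handle discontinuous domains and tolerate small boundary overlaps, neither of which this sketch attempts.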