Marchler-Bauer Aron, Lu Shennan, Anderson John B, Chitsaz Farideh, Derbyshire Myra K, DeWeese-Scott Carol, Fong Jessica H, Geer Lewis Y, Geer Renata C, Gonzales Noreen R, Gwadz Marc, Hurwitz David I, Jackson John D, Ke Zhaoxi, Lanczycki Christopher J, Lu Fu, Marchler Gabriele H, Mullokandov Mikhail, Omelchenko Marina V, Robertson Cynthia L, Song James S, Thanki Narmada, Yamashita Roxanne A, Zhang Dachuan, Zhang Naigong, Zheng Chanjuan, Bryant Stephen H
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bldg 38 A, Room 8N805, 8600 Rockville Pike, Bethesda, MD 20894, USA.
Nucleic Acids Res. 2011 Jan;39(Database issue):D225-9. doi: 10.1093/nar/gkq1189. Epub 2010 Nov 24.
NCBI's Conserved Domain Database (CDD) is a resource for annotating protein sequences with the locations of conserved domain footprints and with functional sites inferred from those footprints. CDD includes manually curated domain models, which are refined using protein 3D structure and provide insights into sequence/structure/function relationships. Manually curated models are organized hierarchically if they describe domain families that are clearly related by common descent. Because CDD also imports domain family models from a variety of external sources, it is a partially redundant collection. To simplify protein annotation, redundant models and models describing homologous families are clustered into superfamilies. By default, domain footprints are annotated with the corresponding superfamily designation; on top of this, a specific annotation may indicate high-confidence assignment of family membership. Pre-computed domain annotation is available for proteins in the Entrez/Protein dataset, and a novel interface, Batch CD-Search, allows annotation to be computed and downloaded for large sets of protein queries. CDD can be accessed via http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.