Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK.
Nucleic Acids Res. 2012 Jan;40(Database issue):D290-301. doi: 10.1093/nar/gkr1065. Epub 2011 Nov 29.
Pfam is a widely used database of protein families; as of release 26.0, it contains more than 13,000 manually curated families. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/). Here, we report on changes that have occurred since our 2010 NAR paper (release 24.0). Over the last 2 years, we have generated 1840 new families and increased coverage of the UniProt Knowledgebase (UniProtKB) to nearly 80%. Notably, we have opened up the annotation of our families to the Wikipedia community by linking Pfam families to relevant Wikipedia pages and encouraging the Pfam and Wikipedia communities to improve and expand those pages. We continue to improve the Pfam website and add new visualizations, such as the 'sunburst' representation of the taxonomic distribution of families. In this work, we also address two topics that will be of particular interest to the Pfam community. First, we explain the definition and use of family-specific, manually curated gathering thresholds. Second, we discuss some of the features of domains of unknown function (DUFs), which constitute a rapidly growing class of families within Pfam.