The Dfam community resource of transposable element families, sequence models, and genome annotations.

Author information

Storer Jessica, Hubley Robert, Rosen Jeb, Wheeler Travis J, Smit Arian F

Affiliations

Institute for Systems Biology, Seattle, WA, 98109, USA.

University of Montana, Missoula, MT, 59812, USA.

Publication information

Mob DNA. 2021 Jan 12;12(1):2. doi: 10.1186/s13100-020-00230-y.

Abstract

Dfam is an open-access database of repetitive DNA families, sequence models, and genome annotations. The 3.0-3.3 releases of Dfam (https://dfam.org) represent an evolution from a proof-of-principle collection of transposable element families in model organisms into a community resource for a broad range of species, and for both curated and uncurated datasets. In addition, releases since Dfam 3.0 provide auxiliary consensus sequence models, transposable element protein alignments, and a formalized classification system to support the growing diversity of organisms represented in the resource. The latest release includes 266,740 new de novo generated transposable element families from 336 species contributed by the EBI. This expansion demonstrates the utility of many of Dfam's new features and provides insight into the long-term challenges ahead for improving de novo generated transposable element datasets.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c123/7805219/80f4d8f1fd5f/13100_2020_230_Fig1_HTML.jpg
