The Dfam database of repetitive DNA families.

Authors

Hubley Robert, Finn Robert D, Clements Jody, Eddy Sean R, Jones Thomas A, Bao Weidong, Smit Arian F A, Wheeler Travis J

Affiliations

Institute for Systems Biology, Seattle, WA 98109, USA.

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1RQ, UK.

Publication information

Nucleic Acids Res. 2016 Jan 4;44(D1):D81-9. doi: 10.1093/nar/gkv1272. Epub 2015 Nov 26.

DOI: 10.1093/nar/gkv1272
PMID: 26612867
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4702899/
Abstract

Repetitive DNA, especially that due to transposable elements (TEs), makes up a large fraction of many genomes. Dfam is an open access database of families of repetitive DNA elements, in which each family is represented by a multiple sequence alignment and a profile hidden Markov model (HMM). The initial release of Dfam, featured in the 2013 NAR Database Issue, contained 1143 families of repetitive elements found in humans, and was used to produce more than 100 Mb of additional annotation of TE-derived regions in the human genome, with improved speed. Here, we describe recent advances, most notably expansion to 4150 total families including a comprehensive set of known repeat families from four new organisms (mouse, zebrafish, fly and nematode). We describe improvements to coverage, and to our methods for identifying and reducing false annotation. We also describe updates to the website interface. The Dfam website has moved to http://dfam.org. Seed alignments, profile HMMs, hit lists and other underlying data are available for download.
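
To make the idea of representing a family by a seed alignment and a profile model more concrete, here is a minimal sketch in Python. It is a deliberate simplification and not Dfam's or HMMER's actual method: it keeps only per-column match probabilities (no insert or delete states), assumes a uniform background over A, C, G and T, and uses a made-up toy alignment. The function names `build_profile` and `log_odds_score` are hypothetical.

```python
import math

BACKGROUND = 0.25  # assumption: uniform background over A, C, G, T
ALPHABET = "ACGT"


def build_profile(seed_alignment, pseudocount=1.0):
    """Return one {base: probability} dict per alignment column."""
    length = len(seed_alignment[0])
    profile = []
    for col in range(length):
        # Start from pseudocounts so unseen bases keep a nonzero probability.
        counts = {base: pseudocount for base in ALPHABET}
        for row in seed_alignment:
            base = row[col]
            if base in counts:  # gap characters ('-') are skipped
                counts[base] += 1
        total = sum(counts.values())
        profile.append({base: counts[base] / total for base in ALPHABET})
    return profile


def log_odds_score(profile, sequence):
    """Sum of per-position log2(P(base | column) / background), in bits."""
    return sum(
        math.log2(column[base] / BACKGROUND)
        for column, base in zip(profile, sequence)
    )


# Hypothetical toy seed alignment (not real Dfam data).
seed = ["ACGTAC", "ACGTTC", "ACG-AC", "ACGTAC"]
profile = build_profile(seed)

family_like = log_odds_score(profile, "ACGTAC")
unrelated = log_odds_score(profile, "TTTACG")
print(family_like > unrelated)
```

A sequence resembling the seed alignment receives a positive score and an unrelated one a negative score. A real profile HMM additionally models insertions and deletions and uses calibrated statistics to control false annotation.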


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1943/4702899/77fec29dd398/gkv1272fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1943/4702899/4c6af4511504/gkv1272fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1943/4702899/0439fd4e1613/gkv1272fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1943/4702899/6a599fea6ba3/gkv1272fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1943/4702899/3079a49fa58f/gkv1272fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1943/4702899/f2a364c2f09c/gkv1272fig5.jpg

