BiasAway: command-line and web server to generate nucleotide composition-matched DNA background sequences.

Affiliations

Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0349 Oslo, Norway.

Stanford University School of Medicine, Stanford Cancer Institute, Stanford, CA 94304, USA.

Publication information

Bioinformatics. 2021 Jul 12;37(11):1607-1609. doi: 10.1093/bioinformatics/btaa928.

DOI:10.1093/bioinformatics/btaa928

PMID:33135764

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8275979/

Abstract

MOTIVATION

Accurate motif enrichment analyses depend on the choice of background DNA sequences used, which should ideally match the sequence composition of the foreground sequences. It is important to avoid false positive enrichment due to sequence biases in the genome, such as GC-bias. Therefore, relying on an appropriate set of background sequences is crucial for enrichment analysis.

RESULTS

We developed BiasAway, a command line tool and its dedicated easy-to-use web server to generate synthetic sequences matching any k-mer nucleotide composition or select genomic DNA sequences matching the mononucleotide composition of the foreground sequences through four different models. For genomic sequences, we provide precomputed partitions of genomes from nine species with five different bin sizes to generate appropriate genomic background sequences.
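
The two strategies named above (shuffling to build synthetic sequences, and selecting genomic sequences by composition) can be illustrated with a minimal sketch. This is not BiasAway's own code or API: the function names, the use of ten GC bins, and the toy sequences are all assumptions made for illustration, and only the simplest mononucleotide case is covered.

```python
import random
from collections import Counter


def shuffle_mononucleotides(seq, rng):
    """Synthetic background: permute the bases of one foreground sequence.

    The shuffled sequence has exactly the same 1-mer counts as the input.
    (Hypothetical helper, not part of BiasAway.)
    """
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def gc_bin(seq, n_bins=10):
    """Assign a sequence to a GC-content bin using integer arithmetic.

    Only A/C/G/T are counted; n_bins=10 is an illustrative assumption.
    """
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        return 0
    gc = seq.count("G") + seq.count("C")
    return min(gc * n_bins // total, n_bins - 1)


def sample_gc_matched(foreground, candidates, rng, n_bins=10):
    """Genomic background: pick candidates so the GC-bin counts follow the foreground.

    For each GC bin, draw as many candidate sequences as the foreground has
    in that bin, or fewer when the candidate pool for that bin is too small.
    """
    pool = {}
    for seq in candidates:
        pool.setdefault(gc_bin(seq, n_bins), []).append(seq)

    wanted = Counter(gc_bin(seq, n_bins) for seq in foreground)
    background = []
    for bin_index, count in sorted(wanted.items()):
        available = pool.get(bin_index, [])
        background.extend(rng.sample(available, min(count, len(available))))
    return background


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    foreground = ["GGGGCCCCAT", "AAAATTTTGC"]
    candidates = ["GCGCGCGCAT", "ATATATATGC", "ACGTACGTAC", "GGCCGGCCTA"]
    print(shuffle_mononucleotides(foreground[0], rng))
    print(sample_gc_matched(foreground, candidates, rng))
```

Shuffling keeps the composition exact but destroys any genomic context, whereas bin-based selection keeps real genomic sequence at the cost of only approximate composition matching; the real tool should be consulted for how its four models handle this trade-off.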

AVAILABILITY AND IMPLEMENTATION

BiasAway source code is freely available from Bitbucket (https://bitbucket.org/CBGR/biasaway) and can be easily installed using bioconda or pip. The web server is available at https://biasaway.uio.no and detailed documentation is available at https://biasaway.readthedocs.io.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2509/8275979/f793e2a7f260/btaa928f1.jpg
