TF-Prioritizer: a Java pipeline to prioritize condition-specific transcription factors.

Affiliations

Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising D-85354, Germany.

Institute for Advanced Study, Technical University of Munich, Garching D-85748, Germany.

Publication information

Gigascience. 2022 Dec 28;12. doi: 10.1093/gigascience/giad026. Epub 2023 May 3.

DOI:10.1093/gigascience/giad026

PMID:37132521

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10155229/

Abstract

BACKGROUND

Eukaryotic gene expression is controlled by cis-regulatory elements (CREs), including promoters and enhancers, which are bound by transcription factors (TFs). Differential expression of TFs and their binding affinity at putative CREs determine tissue- and developmental-specific transcriptional activity. Consolidating genomic datasets can offer further insights into the accessibility of CREs, TF activity, and, thus, gene regulation. However, the integration and analysis of multimodal datasets are hampered by considerable technical challenges. While methods for highlighting differential TF activity from combined chromatin state data (e.g., chromatin immunoprecipitation [ChIP], ATAC, or DNase sequencing) and RNA sequencing data exist, they do not offer convenient usability, have limited support for large-scale data processing, and provide only minimal functionality for visually interpreting results.

RESULTS

We developed TF-Prioritizer, an automated pipeline that prioritizes condition-specific TFs from multimodal data and generates an interactive web report. We demonstrated its potential by identifying known TFs along with their target genes, as well as previously unreported TFs active in lactating mouse mammary glands. Additionally, we studied a variety of ENCODE datasets for cell lines K562 and MCF-7, including 12 histone modification ChIP sequencing as well as ATAC and DNase sequencing datasets, where we observe and discuss assay-specific differences.

CONCLUSION

TF-Prioritizer accepts ATAC, DNase, or ChIP sequencing and RNA sequencing data as input and identifies TFs with differential activity, thus offering an understanding of genome-wide gene regulation, potential pathogenesis, and therapeutic targets in biomedical research.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1db8/10155229/80765eb3b8ca/giad026fig1.jpg
