Statistics of protein-DNA binding and the total number of binding sites for a transcription factor in the mammalian genome.

Affiliation

Department of Genome and Gene Expression Data Analysis, Bioinformatics Institute, 30 Biopolis str, Singapore.

Publication information

BMC Genomics. 2010 Feb 10;11 Suppl 1(Suppl 1):S12. doi: 10.1186/1471-2164-11-S1-S12.

DOI:10.1186/1471-2164-11-S1-S12
PMID:20158869
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2822526/
Abstract

BACKGROUND

Transcription factor (TF)-DNA binding loci are explored by analyzing massive datasets generated with Chromatin Immuno-Precipitation (ChIP)-based high-throughput sequencing technologies. These datasets suffer from bias in the information about binding loci availability, sample incompleteness, and diverse sources of technical and biological noise. Therefore, adequate mathematical models of ChIP-based high-throughput assays and statistical tools are required for a robust identification of specific and reliable TF binding sites (TFBS), a precise characterization of the TFBS avidity distribution, and a plausible estimation of the total number of specific TFBS for a given TF in the genome for a given cell type.

RESULTS

We developed an exploratory mixture probabilistic model for specific and non-specific transcription factor-DNA (TF-DNA) binding. Within ChIP-seq data sets, the statistics of specific and non-specific DNA-protein binding are defined by a mixture of sample size-dependent skewed functions, described by the Kolmogorov-Waring (K-W) function (Kuznetsov, 2003) and an exponential function, respectively. Using available ChIP-seq data for eleven TFs essential for self-maintenance and differentiation of mouse embryonic stem cells (SC) (Nanog, Oct4, Sox2, Klf4, STAT3, E2F1, Tcfcp2l1, ZFX, n-Myc, c-Myc and Esrrb) reported in Chen et al (2008), we estimated (i) the specificity and the sensitivity of the ChIP-seq binding assays and (ii) the number of specific binding sites (BSs) in the genome of mouse embryonic stem cells that were not identified in the current experiments. Motif finding analysis applied to the identified c-Myc TFBSs supports our results and allowed us to predict many novel c-Myc target genes.
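
The two-component mixture described above can be sketched numerically. The following is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not give the K-W parameterization, so the heavy-tailed component here uses a Waring-type ratio recursion as a stand-in. All function names, parameter values (`a`, `b`, `theta`, `rate`, `alpha`) and the peak-height threshold are hypothetical and chosen only to show how specificity, sensitivity and the number of unobserved specific sites could be derived from such a mixture.

```python
import math


def kw_like_pmf(a, b, theta, m_max):
    """Heavy-tailed pmf on peak heights m = 1..m_max (ASSUMED stand-in for K-W).

    Built from the ratio p(m+1)/p(m) = theta * (a + m) / (b + m + 1)
    and normalised over the truncated support.
    """
    weights = [1.0]
    for m in range(1, m_max):
        weights.append(weights[-1] * theta * (a + m) / (b + m + 1))
    total = sum(weights)
    return [w / total for w in weights]


def exponential_pmf(rate, m_max):
    """Discrete exponential pmf on m = 1..m_max for non-specific binding."""
    weights = [math.exp(-rate * m) for m in range(1, m_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def posterior_specific(alpha, specific, nonspecific):
    """P(specific | height m) by Bayes' rule; alpha is the specific fraction."""
    return [alpha * s / (alpha * s + (1.0 - alpha) * n)
            for s, n in zip(specific, nonspecific)]


def tail_mass(pmf, threshold):
    """Probability of a height >= threshold (index 0 holds m = 1)."""
    return sum(pmf[threshold - 1:])


def sensitivity(specific, threshold):
    """Fraction of specific sites whose height reaches the threshold."""
    return tail_mass(specific, threshold)


def specificity(nonspecific, threshold):
    """Fraction of non-specific loci rejected by the threshold."""
    return 1.0 - tail_mass(nonspecific, threshold)


def estimate_total_specific(n_called_specific, sens):
    """Scale the called specific sites up by the sensitivity."""
    return n_called_specific / sens


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    spec = kw_like_pmf(a=1.0, b=2.0, theta=0.95, m_max=100)
    noise = exponential_pmf(rate=1.0, m_max=100)
    post = posterior_specific(alpha=0.3, specific=spec, nonspecific=noise)
    threshold = 5
    sens = sensitivity(spec, threshold)
    print("sensitivity:", sens)
    print("specificity:", specificity(noise, threshold))
    print("P(specific | m = threshold):", post[threshold - 1])
    print("estimated total specific sites:", estimate_total_specific(500, sens))
```

Raising the threshold in this sketch trades sensitivity for specificity, because the exponential component decays faster than the heavy-tailed one. Low-height specific sites are therefore the ones lost first.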

CONCLUSION

We provide a novel methodology for estimating the specificity and the sensitivity of TF-DNA binding in massively parallel ChIP sequencing (ChIP-seq) binding assays. Goodness-of-fit analysis of K-W functions suggests that a large fraction of low- and moderate-avidity TFBSs cannot be identified by the ChIP-based methods. Thus, the task of determining the binding sensitivity of a TF cannot yet be technically resolved by current ChIP-seq, compared to former experimental techniques. Considering our improvement in measuring the sensitivity and the specificity of the TFs obtained from the ChIP-seq data, the models of transcriptional regulatory networks in embryonic cells and other cell types derived from the given ChIP-seq data should be carefully revised.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/48be/2822526/bb329ca4a1ea/1471-2164-11-S1-S12-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/48be/2822526/c14c222179bf/1471-2164-11-S1-S12-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/48be/2822526/cf922b61a1c1/1471-2164-11-S1-S12-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/48be/2822526/93b0312e9fca/1471-2164-11-S1-S12-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/48be/2822526/da83ad30efac/1471-2164-11-S1-S12-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/48be/2822526/d18bc43d8758/1471-2164-11-S1-S12-6.jpg

Similar articles

1
Statistics of protein-DNA binding and the total number of binding sites for a transcription factor in the mammalian genome.
BMC Genomics. 2010 Feb 10;11 Suppl 1(Suppl 1):S12. doi: 10.1186/1471-2164-11-S1-S12.
2
Mathematical Modeling of Avidity Distribution and Estimating General Binding Properties of Transcription Factors from Genome-Wide Binding Profiles.
Methods Mol Biol. 2017;1613:193-276. doi: 10.1007/978-1-4939-7027-8_9.
3
Relative avidity, specificity, and sensitivity of transcription factor-DNA binding in genome-scale experiments.
Methods Mol Biol. 2009;563:15-50. doi: 10.1007/978-1-60761-175-2_2.
4
Improving analysis of transcription factor binding sites within ChIP-Seq data based on topological motif enrichment.
BMC Genomics. 2014 Jun 13;15(1):472. doi: 10.1186/1471-2164-15-472.
5
Transcription factor-binding k-mer analysis clarifies the cell type dependency of binding specificities and cis-regulatory SNPs in humans.
BMC Genomics. 2023 Oct 7;24(1):597. doi: 10.1186/s12864-023-09692-9.
6
A biophysical model for analysis of transcription factor interaction and binding site arrangement from genome-wide binding data.
PLoS One. 2009 Dec 1;4(12):e8155. doi: 10.1371/journal.pone.0008155.
7
Application of experimentally verified transcription factor binding sites models for computational analysis of ChIP-Seq data.
BMC Genomics. 2014 Jan 29;15(1):80. doi: 10.1186/1471-2164-15-80.
8
Computer and statistical analysis of transcription factor binding and chromatin modifications by ChIP-seq data in embryonic stem cell.
J Integr Bioinform. 2012 Sep 18;9(2):211. doi: 10.2390/biecoll-jib-2012-211.
9
An improved ChIP-seq peak detection system for simultaneously identifying post-translational modified transcription factors by combinatorial fusion, using SUMOylation as an example.
BMC Genomics. 2014;15 Suppl 1(Suppl 1):S1. doi: 10.1186/1471-2164-15-S1-S1. Epub 2014 Jan 24.
10
Chromatin immunoprecipitation and multiplex sequencing (ChIP-Seq) to identify global transcription factor binding sites in the nematode Caenorhabditis elegans.
Methods Enzymol. 2014;539:89-111. doi: 10.1016/B978-0-12-420120-0.00007-4.

Cited by

1
Coordinated Cross-Talk Between the Myc and Mlx Networks in Liver Regeneration and Neoplasia.
Cell Mol Gastroenterol Hepatol. 2022;13(6):1785-1804. doi: 10.1016/j.jcmgh.2022.02.018. Epub 2022 Mar 5.
2
Multiple signatures of a disease in potential biomarker space: Getting the signatures consensus and identification of novel biomarkers.
BMC Genomics. 2015;16 Suppl 7(Suppl 7):S2. doi: 10.1186/1471-2164-16-S7-S2. Epub 2015 Jun 11.
3
Role of IL-9 and STATs in hematological malignancies (Review).
Oncol Lett. 2014 Mar;7(3):602-610. doi: 10.3892/ol.2013.1761. Epub 2013 Dec 16.
4
NEXT-peak: a normal-exponential two-peak model for peak-calling in ChIP-seq data.
BMC Genomics. 2013 May 25;14:349. doi: 10.1186/1471-2164-14-349.
5
Promoter hypermethylation-induced transcriptional down-regulation of the gene MYCT1 in laryngeal squamous cell carcinoma.
BMC Cancer. 2012 Jun 6;12:219. doi: 10.1186/1471-2407-12-219.
6
Quantitative model of R-loop forming structures reveals a novel level of RNA-DNA interactome complexity.
Nucleic Acids Res. 2012 Jan;40(2):e16. doi: 10.1093/nar/gkr1075. Epub 2011 Nov 25.
7
MYCT1-TV, a novel MYCT1 transcript, is regulated by c-Myc and may participate in laryngeal carcinogenesis.
PLoS One. 2011;6(10):e25648. doi: 10.1371/journal.pone.0025648. Epub 2011 Oct 5.

References

1
microRNAs regulate human embryonic stem cell division.
Cell Cycle. 2009 Nov 15;8(22):3729-41. doi: 10.4161/cc.8.22.10033. Epub 2009 Nov 10.
2
Relative avidity, specificity, and sensitivity of transcription factor-DNA binding in genome-scale experiments.
Methods Mol Biol. 2009;563:15-50. doi: 10.1007/978-1-60761-175-2_2.
3
The oncogenic EWS-FLI1 protein binds in vivo GGAA microsatellite sequences with potential transcriptional activation function.
PLoS One. 2009;4(3):e4932. doi: 10.1371/journal.pone.0004932. Epub 2009 Mar 23.
4
Evidence that human blastomere cleavage is under unique cell cycle control.
J Assist Reprod Genet. 2009 Apr;26(4):187-95. doi: 10.1007/s10815-009-9306-x. Epub 2009 Mar 14.
5
Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data.
Nat Methods. 2008 Sep;5(9):829-34. doi: 10.1038/nmeth.1246.
6
Reflecting on 25 years with MYC.
Nat Rev Cancer. 2008 Dec;8(12):976-90. doi: 10.1038/nrc2231.
7
An integrated software system for analyzing ChIP-chip and ChIP-seq data.
Nat Biotechnol. 2008 Nov;26(11):1293-300. doi: 10.1038/nbt.1505. Epub 2008 Nov 2.
8
Modeling ChIP sequencing in silico with applications.
PLoS Comput Biol. 2008 Aug 22;4(8):e1000158. doi: 10.1371/journal.pcbi.1000158.
9
SeqMap: mapping massive amount of oligonucleotides to the genome.
Bioinformatics. 2008 Oct 15;24(20):2395-6. doi: 10.1093/bioinformatics/btn429. Epub 2008 Aug 12.
10
Extracting sequence features to predict protein-DNA interactions: a comparative study.
Nucleic Acids Res. 2008 Jul;36(12):4137-48. doi: 10.1093/nar/gkn361. Epub 2008 Jun 13.