DNA familial binding profiles made easy: comparison of various motif alignment and clustering strategies.

Authors

Mahony Shaun, Auron Philip E, Benos Panayiotis V

Affiliation

Department of Computational Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America.

Publication

PLoS Comput Biol. 2007 Mar 30;3(3):e61. doi: 10.1371/journal.pcbi.0030061. Epub 2007 Feb 15.

DOI:10.1371/journal.pcbi.0030061
PMID:17397256
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1848003/
Abstract

Transcription factor (TF) proteins recognize a small number of DNA sequences with high specificity and control the expression of neighbouring genes. The evolution of TF binding preference has been the subject of a number of recent studies, in which generalized binding profiles have been introduced and used to improve the prediction of new target sites. Generalized profiles are generated by aligning and merging the individual profiles of related TFs. However, the distance metrics and alignment algorithms used to compare the binding profiles have not yet been fully explored or optimized. As a result, binding profiles depend on TF structural information and sometimes may ignore important distinctions between subfamilies. Prediction of the identity or the structural class of a protein that binds to a given DNA pattern will enhance the analysis of microarray and ChIP-chip data where frequently multiple putative targets of usually unknown TFs are predicted. Various comparison metrics and alignment algorithms are evaluated (a total of 105 combinations). We find that local alignments are generally better than global alignments at detecting eukaryotic DNA motif similarities, especially when combined with the sum of squared distances or Pearson's correlation coefficient comparison metrics. In addition, multiple-alignment strategies for binding profiles and tree-building methods are tested for their efficiency in constructing generalized binding models. A new method for automatic determination of the optimal number of clusters is developed and applied in the construction of a new set of familial binding profiles which improves upon TF classification accuracy. A software tool, STAMP, is developed to host all tested methods and make them publicly available. This work provides a high quality reference set of familial binding profiles and the first comprehensive platform for analysis of DNA profiles. 
Detecting similarities between DNA motifs is a key step in the comparative study of transcriptional regulation, and the work presented here will form the basis for tool and method development for future transcriptional modeling studies.
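
To make the comparison step concrete, the following is a minimal sketch of how two binding profiles might be scored column by column with Pearson's correlation coefficient or the sum of squared distances, and then locally aligned in the Smith-Waterman style. It is not STAMP's implementation: the function names, the rescaling of the squared distance into a similarity, the handling of uniform columns, and the linear gap penalty are all assumptions chosen for illustration. Each motif is assumed to be a list of columns, each column holding the frequencies of A, C, G, and T.

```python
import math


def pearson(col_a, col_b):
    """Pearson correlation between two base-frequency columns (A, C, G, T)."""
    mean_a = sum(col_a) / len(col_a)
    mean_b = sum(col_b) / len(col_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(col_a, col_b))
    var_a = sum((a - mean_a) ** 2 for a in col_a)
    var_b = sum((b - mean_b) ** 2 for b in col_b)
    if var_a == 0 or var_b == 0:
        # Assumption: a uniform column carries no signal, so score it as 0.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def ssd_similarity(col_a, col_b):
    """Sum of squared distances, rescaled to a similarity (assumption).

    For frequency columns the squared distance lies in [0, 2], so
    1 - ssd gives +1 for identical columns and -1 for opposite ones.
    """
    ssd = sum((a - b) ** 2 for a, b in zip(col_a, col_b))
    return 1.0 - ssd


def local_align(motif_a, motif_b, score=pearson, gap=1.0):
    """Smith-Waterman local alignment over motif columns.

    Returns (best_score, end_i, end_j), where end_i and end_j are the
    1-based column positions at which the best local alignment ends.
    The linear gap penalty is a hypothetical default.
    """
    rows, cols = len(motif_a), len(motif_b)
    table = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    best = (0.0, 0, 0)
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            table[i][j] = max(
                0.0,  # restart: this is what makes the alignment local
                table[i - 1][j - 1] + score(motif_a[i - 1], motif_b[j - 1]),
                table[i - 1][j] - gap,
                table[i][j - 1] - gap,
            )
            if table[i][j] > best[0]:
                best = (table[i][j], i, j)
    return best


# Toy example: a two-column motif embedded in a four-column motif.
A, C, G, T = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]
long_motif = [A, C, G, T]
short_motif = [C, G]
best_score, end_i, end_j = local_align(long_motif, short_motif)
```

Because the table entries are floored at zero, poorly matching flanking columns do not penalize a strong core match, which is the property that distinguishes local from global alignment in this setting. Passing `score=ssd_similarity` swaps the column metric without changing the alignment routine.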

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b32d/1848003/1f090c7d853d/pcbi.0030061.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b32d/1848003/dd48c9e6b587/pcbi.0030061.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b32d/1848003/c1ce12c88efd/pcbi.0030061.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b32d/1848003/4e6ced30a640/pcbi.0030061.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b32d/1848003/fb56f713f045/pcbi.0030061.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b32d/1848003/aad7f6abb980/pcbi.0030061.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b32d/1848003/6923032143ce/pcbi.0030061.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b32d/1848003/574e0caa3a3d/pcbi.0030061.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b32d/1848003/2248488ab92b/pcbi.0030061.g009.jpg
