Department of Biochemistry, University at Buffalo-State University of New York, 955 Main St. #5128, Buffalo, NY, 14203, USA.
Present Address: Program in Biomedical Sciences, Columbia University, New York, NY, 10032, USA.
BMC Genomics. 2023 Jan 13;24(1):17. doi: 10.1186/s12864-023-09123-9.
Transcriptional enhancers are essential for gene regulation, but how these regulatory elements are best defined remains a significant unresolved question. Traditional definitions rely on activity-based criteria such as reporter gene assays, while more recently, biochemical assays based on chromatin-level phenomena such as chromatin accessibility, histone modifications, and localized RNA transcription have gained prominence.
We examine here whether these two types of definitions, activity-based and chromatin-based, effectively identify the same sets of sequences. Concerningly, we find that the overlap between the two groups is strikingly limited. Few of the data sets we compared displayed statistically significant overlap, and even for those that did, the degree of overlap was typically small (below 40% of sequences). Moreover, we observed a substantial batch effect: the experiment set, rather than the experimental method, was a primary driver of whether chromatin-defined enhancers showed a strong overlap with reporter gene-defined enhancers.
Our results raise important questions as to the appropriateness of both old and new enhancer definitions, and suggest that new approaches are required to reconcile the poor agreement among existing methods for defining enhancers.
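To make the notion of "degree of overlap" concrete, the following is a minimal Python sketch of how the fraction of chromatin-defined enhancers that overlap reporter gene-defined enhancers might be computed. It is an illustration only and not the procedure used in the study: the interval representation, the function names (`overlaps`, `fraction_overlapping`), the `min_bp` threshold, and the example coordinates are all hypothetical assumptions.

```python
from typing import List, Tuple

# Assumed representation: (chromosome, start, end), half-open, BED-like.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval, min_bp: int = 1) -> bool:
    """Return True if a and b share at least min_bp bases on the same chromosome."""
    if a[0] != b[0]:
        return False
    shared = min(a[2], b[2]) - max(a[1], b[1])
    return shared >= min_bp


def fraction_overlapping(
    query: List[Interval], reference: List[Interval], min_bp: int = 1
) -> float:
    """Fraction of query intervals that overlap at least one reference interval."""
    if not query:
        return 0.0
    hits = sum(
        1 for q in query if any(overlaps(q, r, min_bp) for r in reference)
    )
    return hits / len(query)


# Hypothetical example data, not taken from the paper.
chromatin_defined = [
    ("chr2L", 100, 200),
    ("chr2L", 500, 600),
    ("chr3R", 100, 200),
    ("chrX", 50, 80),
]
reporter_defined = [
    ("chr2L", 150, 250),
    ("chr3R", 300, 400),
    ("chrX", 10, 60),
]

overlap_fraction = fraction_overlapping(chromatin_defined, reporter_defined)
print(f"{overlap_fraction:.0%} of chromatin-defined sequences overlap")
```

This sketch uses a simple quadratic scan, which is adequate for small sets; genome-scale comparisons would normally rely on a sorted-interval tool. It also reports only the overlap fraction. Judging statistical significance would additionally require a null model, for example comparison against randomly placed intervals, which is not shown here.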