Detection and identification of cis-regulatory elements using change-point and classification algorithms.

Affiliations

School of Mathematics and Statistics, The University of Melbourne, Melbourne, 3010, VIC, Australia.

School of Mathematics, Monash University, Melbourne, 3800, VIC, Australia.

Publication information

BMC Genomics. 2022 Jan 25;23(1):78. doi: 10.1186/s12864-021-08190-0.

Abstract

BACKGROUND

Transcriptional regulation is primarily mediated by the binding of factors to non-coding regions in DNA. Identification of these binding regions enhances understanding of tissue formation and potentially facilitates the development of gene therapies. However, successful identification of binding regions is made difficult by the lack of a universal biological code for their characterisation.

RESULTS

We extend an alignment-based method, changept, and identify clusters of biological significance, through ontology and de novo motif analysis. Further, we apply a Bayesian method to estimate and combine binary classifiers on the clusters we identify to produce a better performing composite.

CONCLUSIONS

The analysis we describe provides a computational method for identification of conserved binding sites in the human genome and facilitates an alternative interrogation of combinations of existing data sets with alignment data.

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2708/8790847/a89949baaaee/12864_2021_8190_Fig1_HTML.jpg