An Information Entropy-Based Approach for Computationally Identifying Histone Lysine Butyrylation.

Author information

Huang Guohua, Zheng Yang, Wu Yao-Qun, Han Guo-Sheng, Yu Zu-Guo

Affiliations

Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang, China.

Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, China.

Publication information

Front Genet. 2020 Feb 14;10:1325. doi: 10.3389/fgene.2019.01325. eCollection 2019.

Abstract

Butyrylation plays a crucial role in cellular processes. Owing to technical limitations, identifying histone butyrylation sites on a large scale remains a challenging task. To fill this gap, we propose an approach based on information entropy and machine learning for computationally identifying histone butyrylation sites. The proposed method achieves an area under the receiver operating characteristic (ROC) curve of 0.92 on the training set under 3-fold cross-validation and 0.80 on the testing set in an independent test. Feature analysis implies that amino acid residues upstream and downstream of butyrylation sites exhibit a specific sequence motif to a certain extent. Functional analysis suggests that histone butyrylation is most likely associated with four pathways (systemic lupus erythematosus, alcoholism, viral carcinogenesis, and transcriptional misregulation in cancer); is involved in binding to other molecules and in processes of biosynthesis, assembly, arrangement, or disassembly; and is located in complexes consisting of DNA, RNA, protein, and so on. The proposed method is useful for predicting histone butyrylation sites. The feature and functional analyses improve understanding of histone butyrylation and add to knowledge of the functions of butyrylated histones.
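The abstract does not specify how information entropy enters the feature construction. As a purely illustrative sketch, one common way to quantify position-specific sequence preference is the Shannon entropy of the residues observed at each position of aligned sequence windows. The function name `column_entropy` and the toy windows below are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def column_entropy(windows, position):
    """Shannon entropy (in bits) of the residues seen at one window position.

    Assumption: all windows are equal-length peptide fragments aligned on
    the candidate lysine. This is an illustration, not the paper's feature.
    """
    counts = Counter(w[position] for w in windows)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical toy windows centred on a lysine (K); not data from the paper.
windows = ["AKKGA", "GAKGA", "AGKGT", "TAKGA"]
entropies = [column_entropy(windows, i) for i in range(len(windows[0]))]
```

A low entropy at a position indicates strong residue conservation there (the central lysine column is fully conserved in this toy input), which is the kind of signal a motif analysis around modification sites looks for.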
