Genome composition-based deep learning predicts oncogenic potential of HPVs.

Affiliations

Department of Pharmacy, Linfen Central Hospital, Linfen, China.

The 4th Medical Center, People's Liberation Army (PLA) General Hospital, Beijing, China.

Publication information

Front Cell Infect Microbiol. 2024 Jul 22;14:1430424. doi: 10.3389/fcimb.2024.1430424. eCollection 2024.

Abstract

Human papillomaviruses (HPVs) account for more than 30% of cancer cases, and the oncogenic role of the viral E6 and E7 genes is well established. However, the identification of high-risk HPV genotypes has largely relied on slow biological exploration and clinical observation, leaving many HPVs with unclassified types and unknown oncogenicity. In the present study, we retrieved and cleaned high-quality HPV sequence records and analyzed their genomic compositional traits, namely dinucleotide (DNT) composition and DNT representation (DCR), to survey how these traits are distributed across HPV types. We then built a deep learning model to predict the oncogenic potential of all HPVs from the E6 and E7 genes. The three main groups of Alpha, Beta, and Gamma HPVs were clearly separated from one another in the DCR trait of either the E6 or the E7 coding sequence (CDS), while types within the same group clustered together. Moreover, the DCR data of either E6 or E7 were learnable with a convolutional neural network (CNN) model, and either CNN classifier accurately predicted the oncogenicity label of high- and low-oncogenic HPVs. In summary, the compositional traits of the oncogenicity-related genes E6 and E7 differed markedly between high- and low-oncogenic HPVs, and the DCR-based deep learning classifier accurately predicted the oncogenic phenotype of HPVs. The predictor trained in this study will facilitate the identification of HPV oncogenicity, particularly for HPVs without a clear genotype or phenotype.
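
The abstract does not state how the DCR trait is computed. As a rough illustration of a dinucleotide-based compositional feature, the sketch below uses the common observed/expected odds ratio for each of the 16 dinucleotides in a coding sequence. This formula is an assumption and may differ from the authors' DCR definition; the function name `dinucleotide_odds_ratios` is hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
DINUCLEOTIDES = ["".join(pair) for pair in product(BASES, repeat=2)]


def dinucleotide_odds_ratios(seq):
    """Return {dinucleotide: observed / expected frequency} for a DNA sequence.

    Assumption: expected frequency is the product of the two single-base
    frequencies. This is a generic odds ratio, not necessarily the paper's DCR.
    """
    seq = seq.upper()

    # Single-base frequencies, ignoring ambiguous symbols such as N.
    base_counts = Counter(b for b in seq if b in BASES)
    n_bases = sum(base_counts.values())

    # Overlapping dinucleotide counts, skipping pairs with ambiguous symbols.
    pair_counts = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_pairs = sum(pair_counts.values())

    ratios = {}
    for dnt in DINUCLEOTIDES:
        if n_bases == 0 or n_pairs == 0:
            ratios[dnt] = 0.0
            continue
        expected = (base_counts[dnt[0]] / n_bases) * (base_counts[dnt[1]] / n_bases)
        observed = pair_counts[dnt] / n_pairs
        # Guard against division by zero when a base is absent.
        ratios[dnt] = observed / expected if expected > 0 else 0.0
    return ratios


if __name__ == "__main__":
    # Toy sequence for illustration only, not a real E6 or E7 CDS.
    features = dinucleotide_odds_ratios("ATGCGCGATTACGCGTAA")
    for dnt in DINUCLEOTIDES:
        print(dnt, round(features[dnt], 3))
```

A fixed-order vector of these 16 values per gene is the kind of input a small classifier could be trained on; the CNN architecture used in the study is not described in the abstract and is not reproduced here.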

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ff5/11298479/f14c46cca3c1/fcimb-14-1430424-g001.jpg
