Xu Haodong, Jia Johnathan, Jeong Hyun-Hwan, Zhao Zhongming
Center for Precision Health, School of Biomedical Informatics, UTHealth Science Center at Houston, Houston, TX 77030, USA.
MD Anderson UTHealth Graduate School of Biomedical Sciences, Houston, TX 77030, USA.
Patterns (N Y). 2023 Feb 10;4(2):100674. doi: 10.1016/j.patter.2022.100674.
Human T-cell leukemia virus type 1 (HTLV-1), a retrovirus, is the causative agent of adult T-cell leukemia/lymphoma and many other human diseases. Accurate and high-throughput detection of HTLV-1 virus integration sites (VISs) across host genomes plays a crucial role in the prevention and treatment of HTLV-1-associated diseases. Here, we developed DeepHTLV, the first deep learning framework for VIS prediction from genome sequence, motif discovery, and cis-regulatory factor identification. We demonstrated the high accuracy of DeepHTLV with more efficient and interpretable feature representations. Decoding the informative features captured by DeepHTLV resulted in eight representative clusters with consensus motifs for potential HTLV-1 integration. Furthermore, DeepHTLV revealed interesting cis-regulatory elements in the regulation of VISs that are significantly associated with the detected motifs. Literature evidence demonstrated that nearly half (34) of the predicted transcription factors enriched with VISs were involved in HTLV-1-associated diseases. DeepHTLV is freely available at https://github.com/bsml320/DeepHTLV.