Using Quasispecies Patterns of Hepatitis B Virus to Predict Hepatocellular Carcinoma With Deep Sequencing and Machine Learning.

Author information

Department of Laboratory Medicine, Shanghai Eastern Hepatobiliary Surgery Hospital, Shanghai, China.

ISTBI and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China.

Publication information

J Infect Dis. 2021 Jun 4;223(11):1887-1896. doi: 10.1093/infdis/jiaa647.

Abstract

BACKGROUND

Hepatitis B virus (HBV) infection is one of the leading causes of hepatocellular carcinoma (HCC) worldwide. However, it remains uncertain how the HBV reverse-transcriptase (rt) gene contributes to HCC progression.

METHODS

We enrolled 307 patients with chronic hepatitis B (CHB) and 237 patients with HBV-related HCC from 13 medical centers. Sequence features comprised multidimensional attributes of rt nucleic acid sequences and rt/s amino acid sequences. Machine-learning models were used to build HCC prediction algorithms. Model performance was evaluated in the training and independent validation cohorts using receiver operating characteristic (ROC) curves and calibration plots.

RESULTS

A random forest (RF) model based on combined metrics (10 features) showed the best predictive performance in both cross-validation and independent validation (area under the ROC curve [AUC], 0.96; accuracy, 0.90), irrespective of HBV genotype and sequencing depth. Moreover, individual HCC risk scores obtained from the RF model (AUC, 0.966; 95% confidence interval [CI], 0.922-0.989) outperformed α-fetoprotein (AUC, 0.713; 95% CI, 0.632-0.784) in distinguishing patients with HCC from those with CHB.
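The two headline metrics reported above, AUC and accuracy, can be computed directly from per-patient risk scores. The following is a minimal sketch, not the authors' pipeline: the function names, the 0.5 decision threshold, and the eight risk scores and labels are all made up for illustration, and the abstract does not say how the study derived its accuracy cutoff. AUC is computed here as the fraction of (HCC, CHB) patient pairs in which the HCC patient receives the higher score, which is equivalent to the area under the ROC curve.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(scores: Sequence[float], labels: Sequence[int],
             threshold: float = 0.5) -> float:
    """Fraction of samples classified correctly at a fixed score threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    correct = sum(int(p == y) for p, y in zip(predicted, labels))
    return correct / len(labels)


# Hypothetical data: 1 = HCC, 0 = CHB. These values are invented for the
# example and are not taken from the study.
risk_scores = [0.91, 0.80, 0.65, 0.40, 0.55, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

print(f"AUC = {roc_auc(risk_scores, labels):.4f}")
print(f"accuracy = {accuracy(risk_scores, labels):.2f}")
```

The pairwise loop is quadratic in cohort size, which is fine for a few hundred patients; a rank-based implementation would be preferable for larger cohorts. Unlike accuracy, AUC does not depend on a chosen threshold, which is one reason both are usually reported together.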

CONCLUSIONS

Our study provides the first evidence that HBV rt sequences contain HBV quasispecies features that are informative for predicting HCC. Integrating deep sequencing with feature extraction and machine-learning models could benefit longitudinal surveillance of CHB and HCC risk assessment.
