Network analysis of HBV‑ and HCV‑induced hepatocellular carcinoma based on Random Forest and Monte Carlo cross‑validation.

Authors

Zhao Shan-Na, Liu Ling-Ling, Lv Zhi-Ping, Wang Xiao-Hua, Wang Cheng-Hong

Affiliations

Department of Clinical Laboratory, Yantaishan Hospital, Yantai, Shandong 264008, P.R. China.

Department of Clinical Laboratory, Shandong Provincial Hospital Affiliated to Shandong University, Jinan, Shandong 250021, P.R. China.

Publication information

Mol Med Rep. 2017 Sep;16(3):2411-2416. doi: 10.3892/mmr.2017.6861. Epub 2017 Jun 27.

Abstract

Hepatocellular carcinoma (HCC) is one of the leading causes of cancer‑associated mortality worldwide. Hepatitis B virus (HBV) and hepatitis C virus (HCV) are two common risk factors for HCC. The majority of patients with HCC present at an advanced stage and are refractory to therapy, so a method for efficient diagnosis at an early stage is needed. In the present study, gene expression profile data generated from microarrays were preprocessed according to the annotation files, and the genes were mapped to Ingenuity Pathway Analysis pathways. Dysregulated pathways and dysregulated pathway pairs were identified and assembled into individual networks, and a main network was constructed from the individual networks with several edges. Random Forest (RF) classification was used to calculate the area under the curve (AUC) value of this network. Subsequently, 50 runs of Monte Carlo cross‑validation were used to screen for the optimal main network. A total of 4,929 genes were identified in the pathways and gene expression profile. Individual networks were constructed by combining dysregulated pathways with Z<0.05 and dysregulated pathway pairs with Z<0.2, and the main network with the highest AUC value was taken as optimal. In the HCV group, the optimal network had an AUC value of 0.98 and included 41 pairs of pathways; in the HBV group, it had an AUC value of 0.94 and included eight pairs of pathways. Four pathway pairs were identified in both groups. These networks are expected to assist in diagnosing patients effectively at an early stage.
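The evaluation step described in the abstract, scoring a candidate network by AUC over 50 runs of Monte Carlo cross‑validation, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the data are synthetic, the split fraction and seeds are arbitrary, and a simple nearest‑centroid scorer stands in for the Random Forest classifier so that the example needs only the standard library. All function names here are hypothetical.

```python
import random
from statistics import mean

def auc(labels, scores):
    """Rank-based AUC: chance that a random case outscores a random control (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def centroid_scorer(train_X, train_y):
    """Stand-in classifier (NOT a Random Forest): higher score = closer to the case centroid."""
    def centroid(label):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        return [mean(col) for col in zip(*rows)]
    c0, c1 = centroid(0), centroid(1)
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return lambda x: dist2(x, c0) - dist2(x, c1)

def monte_carlo_cv_auc(X, y, n_runs=50, test_fraction=0.3, seed=0):
    """Average test AUC over n_runs independent random train/test splits."""
    rng = random.Random(seed)
    idx_pos = [i for i, label in enumerate(y) if label == 1]
    idx_neg = [i for i, label in enumerate(y) if label == 0]
    aucs = []
    for _ in range(n_runs):
        # Draw a fresh stratified random split so every test set holds both classes.
        test = set()
        for group in (idx_pos, idx_neg):
            k = max(1, round(test_fraction * len(group)))
            test.update(rng.sample(group, k))
        train = [i for i in range(len(y)) if i not in test]
        score = centroid_scorer([X[i] for i in train], [y[i] for i in train])
        test_idx = sorted(test)
        aucs.append(auc([y[i] for i in test_idx], [score(X[i]) for i in test_idx]))
    return mean(aucs)

# Synthetic stand-in for network features: 20 controls (label 0), then 20 cases (label 1).
data_rng = random.Random(1)
X = [[data_rng.gauss(2.0 * label, 1.0) for _ in range(5)] for label in (0, 1) for _ in range(20)]
y = [label for label in (0, 1) for _ in range(20)]
mean_auc = monte_carlo_cv_auc(X, y)
```

Under this scheme, each candidate main network would supply its own feature matrix `X`, and the candidate with the highest averaged AUC would be kept. Unlike k‑fold cross‑validation, the random splits are drawn independently, so a sample may appear in several test sets or in none.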
