Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China; Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
Department of Systems Biology, Beckman Research Institute, City of Hope, Monrovia, CA, USA.
Biochim Biophys Acta Mol Basis Dis. 2020 Oct 1;1866(10):165847. doi: 10.1016/j.bbadis.2020.165847. Epub 2020 May 27.
Liquid biopsy refers to sampling, screening, and detecting potential biomarkers in liquid samples for clinical use. Lung cancer is one of the most common cancer types and is difficult to diagnose early and to monitor by radiological and histopathological evaluation, which are the most widely used and accurate methods. Circulating miRNAs, measured by liquid biopsy, are potential clinical indices for detecting tumors and monitoring tumor progression. However, identifying and validating the clinical value of each candidate circulating miRNA is expensive and time-consuming. In this study, we present a novel computational approach for identifying significant circulating miRNAs that may be applied to early screening, diagnosis, and continuous monitoring of lung cancer progression. The approach combines several machine learning algorithms and was applied to the expression profiles of circulating miRNAs from lung cancer patients and control samples. In brief, a powerful feature selection method, minimum redundancy maximum relevance (mRMR), was used to evaluate the importance of all features and produce a ranked feature list. Incremental feature selection (IFS) with random forest was then used to extract key circulating miRNAs. At the same time, an efficient classifier with a Matthews correlation coefficient (MCC) of 0.740 was built. The top five circulating miRNAs, miR-92a, miR-140-5p, miR-331-3p, miR-223, and miR-374a, were analyzed and found to participate in the pathogenesis of lung cancer, indicating their significant prognostic power in lung cancer.
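The pipeline named in the abstract (mRMR ranking, then incremental feature selection scored by MCC) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes expression values have already been discretized, uses the difference form of mRMR (relevance minus mean redundancy), and leaves the classifier as a hypothetical `evaluate` callback where the study used a random forest. All function names, feature names, and toy values are illustrative assumptions.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple


def mutual_information(xs: Sequence[int], ys: Sequence[int]) -> float:
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def mrmr_rank(features: Dict[str, Sequence[int]], labels: Sequence[int]) -> List[str]:
    """Greedy mRMR: repeatedly pick the feature with the highest
    relevance to the labels minus mean redundancy with chosen features."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    ranked: List[str] = []
    remaining = list(features)
    while remaining:
        def score(name: str) -> float:
            if not ranked:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[chosen]) for chosen in ranked
            ) / len(ranked)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked


def mcc(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Matthews correlation coefficient for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def incremental_feature_selection(
    ranked: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
) -> Tuple[List[str], float, List[float]]:
    """Score the top-1, top-2, ... top-k prefixes of the ranked list and
    keep the shortest prefix with the best score. `evaluate` is a
    hypothetical callback, e.g. the cross-validated MCC of a classifier."""
    scores = [evaluate(ranked[:k]) for k in range(1, len(ranked) + 1)]
    best_k = max(range(len(scores)), key=lambda i: scores[i]) + 1
    return list(ranked[:best_k]), scores[best_k - 1], scores


if __name__ == "__main__":
    # Toy, made-up data: feat_b duplicates feat_a, so mRMR ranks it last.
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    features = {
        "feat_a": [0, 0, 0, 1, 1, 1, 1, 1],
        "feat_b": [0, 0, 0, 1, 1, 1, 1, 1],
        "feat_c": [0, 0, 1, 0, 1, 1, 0, 1],
    }
    ranked = mrmr_rank(features, labels)
    # Made-up scores per subset size, standing in for a real classifier.
    toy_scores = {1: 0.40, 2: 0.65, 3: 0.60}
    print(ranked)
    print(incremental_feature_selection(ranked, lambda s: toy_scores[len(s)]))
```

Passing the classifier in as a callback keeps the selection loop independent of the learner. In practice `evaluate` would train a model on the chosen features and return its cross-validated MCC.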