Do Thi T N, Block Ines, Burton Mark, Sørensen Kristina P, Larsen Martin J, Jylling Anne Marie Bak, Ejlertsen Bent, Lænkholm Anne-Vibeke, Tan Qihua, Kruse Torben A, Thomassen Mads
Department of Clinical Genetics, Odense University Hospital, Odense, Denmark.
Unit of Human Genetics, Department of Clinical Research, University of Southern Denmark, Odense, Denmark.
Breast Cancer Res. 2025 Jul 15;27(1):133. doi: 10.1186/s13058-025-02061-2.
Prognostic tools for identifying patients with indolent breast cancers (BCs) are far from optimal, leading to extensive overtreatment. Several studies have demonstrated that mRNAs, lncRNAs, and miRNAs have prognostic potential in BC. Because mRNAs, lncRNAs, and miRNAs capture distinct transcriptomic information, we hypothesized that combining them would improve classification performance.
Our pair-matched study included fresh frozen primary tumor samples from 160 lymph node-negative, systemically untreated BC patients, of whom 80 developed recurrence and 80 remained recurrence-free (mean follow-up of 20.9 years). We integrated the three classes of RNA and then performed classification using seven machine learning methods, followed by a voting scheme.
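The voting step can be illustrated with a minimal sketch. The abstract does not specify the voting rule, so simple majority voting is assumed here, and the function name `majority_vote`, the label strings, and the tie-breaking choice are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions, tie_label="recurrence"):
    """Combine per-classifier label lists into one label per sample.

    predictions: list of lists, one inner list per classifier, all the
    same length (one predicted label per sample).
    tie_label: label returned when the top two labels tie (an assumption;
    with an odd number of binary classifiers, such as seven, no tie occurs).
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same samples")

    voted = []
    for i in range(n_samples):
        # Count how many classifiers chose each label for sample i.
        ranked = Counter(p[i] for p in predictions).most_common()
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            voted.append(tie_label)
        else:
            voted.append(ranked[0][0])
    return voted


# Toy example with three hypothetical classifiers and three samples.
example_predictions = [
    ["recurrence", "no_recurrence", "recurrence"],
    ["recurrence", "no_recurrence", "no_recurrence"],
    ["no_recurrence", "no_recurrence", "recurrence"],
]
example_votes = majority_vote(example_predictions)
```

Breaking ties toward "recurrence" is a deliberately conservative choice in this sketch, since missing a recurrence is the costlier error in this setting.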
Under the criterion of ≥ 90% sensitivity, individual classifiers achieved specificities of 74-91% for the integrated dataset, compared with 56-66%, 58-71%, and 69-86% for mRNAs, lncRNAs, and miRNAs, respectively. After voting, the specificity for the multi-transcriptomic dataset was 85%, versus 38%, 48%, and 82% for mRNAs, lncRNAs, and miRNAs, respectively. In the clinical setting, very high sensitivity may be required. In the most stringent setting, with a sensitivity of 99%, the integrated dataset also outperformed the others, with a specificity of 41% compared to 0%, 9%, and 28% for mRNAs, lncRNAs, and miRNAs, respectively.
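Reporting specificity under a minimum-sensitivity constraint can be sketched as follows. This is an illustration only, not the study's procedure: it assumes each sample has a continuous score where higher means more likely to recur, and the function name `specificity_at_sensitivity` and the toy data are hypothetical.

```python
def specificity_at_sensitivity(scores, labels, min_sensitivity=0.90):
    """Return (threshold, sensitivity, specificity) at the highest score
    threshold whose sensitivity meets min_sensitivity.

    scores: per-sample scores, higher = more likely recurrence (assumption).
    labels: 1 for recurrence, 0 for recurrence-free.
    A sample is called positive when its score >= threshold.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")

    # Specificity can only fall as the threshold is lowered, so the first
    # (highest) threshold that satisfies the sensitivity constraint also
    # gives the best achievable specificity under that constraint.
    for threshold in sorted(set(scores), reverse=True):
        true_pos = sum(s >= threshold for s in positives)
        sensitivity = true_pos / len(positives)
        if sensitivity >= min_sensitivity:
            true_neg = sum(s < threshold for s in negatives)
            return threshold, sensitivity, true_neg / len(negatives)


# Toy data: five recurrences and five recurrence-free patients.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
toy_result = specificity_at_sensitivity(toy_scores, toy_labels, 0.90)
```

Raising `min_sensitivity` pushes the threshold down and generally costs specificity, which mirrors the drop the abstract reports when moving from the 90% to the 99% sensitivity setting.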
Our results strongly suggest that classification using an integrated dataset improves prognostic power compared to individual classes of RNA, and thus encourage researchers to integrate datasets rather than analyze them separately.