Department of Bio and Brain Engineering, KAIST, Daejeon, South Korea.
BMC Bioinformatics. 2010 Apr 16;11 Suppl 2(Suppl 2):S4. doi: 10.1186/1471-2105-11-S2-S4.
Urine-based diagnostic techniques are non-invasive, inexpensive, and easy to perform in clinical settings. The metabolites in urine, as end products of cellular processes, are closely linked to phenotypes. The urine metabolome is therefore very useful for marker discovery and clinical applications. However, only univariate methods have been used in classification studies of the urine metabolome. Because multiple genes or proteins are involved in the development of complex diseases such as breast cancer, multiple compounds, including metabolites, are likely to be associated with these diseases, and multivariate methods are needed to identify such multiple metabolite markers. Moreover, because combinatorial effects among markers can seriously affect disease development, and because individuals differ in genetic makeup and cancers are heterogeneous in their progression, a single marker is not enough to identify cancer.
We proposed classification models based on multivariate classification techniques and developed an analysis procedure for classification studies using metabolome data. Using this strategy, we identified five potential urinary biomarkers for breast cancer with high accuracy, four of which could not be identified by univariate methods alone. We also proposed potential diagnosis rules to support clinical decision making. In addition, we showed that combinatorial effects among multiple biomarkers can enhance discriminative power for breast cancer.
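The point about combinatorial effects can be illustrated with a small sketch. It does not use the study's data, models, or markers: the samples are synthetic, and the names `marker_a`, `marker_b`, and the `gap` cutoff are hypothetical choices made only for illustration. The example builds a toy cohort in which cases have exactly one elevated marker, then compares the best single-threshold rule on each marker alone against a simple rule that uses both markers together.

```python
from itertools import product

# Synthetic, hypothetical samples: (marker_a, marker_b, label), label 1 = case.
# Cases have exactly one elevated marker; controls have both low or both high.
SAMPLES = [
    (1.0, 1.2, 0), (1.1, 0.9, 0), (4.0, 4.1, 0), (3.9, 4.2, 0),
    (1.0, 4.0, 1), (0.9, 4.2, 1), (4.1, 1.1, 1), (3.8, 0.8, 1),
]


def accuracy(predict, samples):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for a, b, y in samples if predict(a, b) == y)
    return correct / len(samples)


def best_univariate_accuracy(samples, index):
    """Best accuracy of any single-threshold rule on one marker.

    index 0 selects marker_a, index 1 selects marker_b. Both directions
    (case above the threshold, case below the threshold) are tried.
    """
    values = sorted({s[index] for s in samples})
    # Candidate thresholds: midpoints between neighbouring values, plus extremes.
    midpoints = [(lo + hi) / 2 for lo, hi in zip(values, values[1:])]
    thresholds = [values[0] - 1.0] + midpoints + [values[-1] + 1.0]
    best = 0.0
    for t, direction in product(thresholds, (1, -1)):
        def predict(a, b, t=t, direction=direction):
            x = (a, b)[index]
            return int(direction * (x - t) > 0)
        best = max(best, accuracy(predict, samples))
    return best


def bivariate_rule(a, b, gap=1.5):
    """Toy two-marker rule: call a case when the markers differ by more than gap."""
    return int(abs(a - b) > gap)


if __name__ == "__main__":
    print("best rule on marker_a alone:", best_univariate_accuracy(SAMPLES, 0))
    print("best rule on marker_b alone:", best_univariate_accuracy(SAMPLES, 1))
    print("two-marker rule:", accuracy(bivariate_rule, SAMPLES))
```

On this toy cohort no single-marker threshold separates cases from controls well, while the two-marker rule classifies every sample correctly. Real metabolome data would need a fitted multivariate classifier and validation on held-out samples rather than a hand-picked rule.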
In this study, we showed that multivariate classification is needed to diagnose breast cancer precisely. After further validation with independent cohorts and experimental confirmation, these marker candidates will likely lead to clinically applicable assays for earlier diagnosis of breast cancer.