Yang Jack Y, Yang Mary Qu, Luo Zuojie, Ma Yan, Li Jianling, Deng Youping, Huang Xudong
Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA.
BMC Genomics. 2008;9 Suppl 1(Suppl 1):S23. doi: 10.1186/1471-2164-9-S1-S23.
The prognosis for many cancers could be improved dramatically if they could be detected while still at the microscopic disease stage. A comprehensive statistical analysis indicates that a number of antigens, such as hTERT, PCNA and Ki-67, can be considered cancer markers, while another set of antigens, such as P27KIP1 and FHIT, are possible markers for normal tissue. Because more than one marker must be considered to classify tissue as cancer or no cancer and, if cancer, as malignant, borderline, or benign, an intelligent decision system is needed to fulfill this unmet medical need.
We have developed an intelligent decision system that uses machine learning techniques and markers to characterize tissue as cancerous, non-cancerous or borderline. The system incorporates learning techniques such as variants of support vector machines, neural networks, decision trees, self-organizing feature maps (SOFM) and recursive maximum contrast trees (RMCT). The variants and algorithms we have developed are designed to detect microscopic pathological changes based on features derived from gene expression levels and metabolic profiles. We also used immunohistochemistry techniques to measure the expression profiles of a number of antigens, including cyclin E, P27KIP1, FHIT, Ki-67, PCNA, Bax, Bcl-2, P53, Fas, FasL and hTERT, in several types of neuroendocrine tumors: pheochromocytomas, paragangliomas, and the adrenocortical carcinomas (ACC), adenomas (ACA), and hyperplasia (ACH) involved with Cushing's syndrome. We provide statistical evidence that higher expression levels of hTERT, PCNA, Ki-67 and similar markers are associated with a higher risk that a tumor is malignant or borderline as opposed to benign. We also investigated whether higher expression levels of P27KIP1, FHIT and similar markers are associated with a decreased risk of adrenomedullary tumors. While no significant difference was found in the expression of cell-arrest antigens such as P27KIP1 among malignant, borderline, and benign tumors, there was a significant difference between the expression levels of such antigens in normal adrenal medulla samples and in adrenomedullary tumors.
Our framework focused not only on different classification schemes and feature selection algorithms, but also on ensemble methods such as boosting and bagging, in an effort to improve upon the accuracy of the individual classifiers. When a variety of machine learning and statistical learning techniques are combined appropriately into one integrated intelligent medical decision system, the predictive power can be enhanced significantly. This research has many potential applications: it might provide an alternative diagnostic tool, a better understanding of the mechanisms involved in malignant transformation, and information that is useful for treatment planning and cancer prevention.
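As an illustration of the bagging idea mentioned above, the following is a minimal sketch only, not the authors' system. It trains single-threshold decision stumps on bootstrap resamples and combines them by majority vote. The two marker scores, their values, the labels, and all function names (`fit_stump`, `fit_bagging`, and so on) are hypothetical and synthetic; none of them come from the study data.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Return (feature, threshold, polarity) with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [1 if polarity * (row[j] - t) >= 0 else 0 for row in X]
                errors = sum(p != label for p, label in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, polarity)
    return best[1:]


def predict_stump(stump, row):
    j, t, polarity = stump
    return 1 if polarity * (row[j] - t) >= 0 else 0


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, row):
    """Majority vote over the stumps (odd count, so a binary vote cannot tie)."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]


# Synthetic, invented scores: column 0 mimics a proliferation-type marker,
# column 1 a cell-arrest-type marker. Label 1 = "tumor-like", 0 = "normal-like".
X = [[0.60 + 0.02 * i, 0.02 * i] for i in range(20)]        # label 1
X += [[0.02 * i, 0.60 + 0.02 * i] for i in range(20)]       # label 0
y = [1] * 20 + [0] * 20

model = fit_bagging(X, y)
print(predict_bagging(model, [1.0, 0.1]), predict_bagging(model, [0.0, 0.9]))
```

Bagging is shown here because it is the simpler of the two ensemble methods named in the abstract; boosting would instead reweight the training samples after each round rather than resample them uniformly.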