Yan Donghui, Wang Pei, Knudsen Beatrice S, Linden Michael, Randolph Timothy W
Biostatistics and Biomathematics Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98109.
Ann Appl Stat. 2012 Sep;6(3):1280-1305. doi: 10.1214/12-aoas543.
Recent advances in tissue microarray technology have allowed immunohistochemistry to become a powerful medium-to-high-throughput analysis tool, particularly for the validation of diagnostic and prognostic biomarkers. However, as study size grows, the manual evaluation of these assays becomes a prohibitive limitation; it vastly reduces throughput and greatly increases variability and expense. We propose an algorithm, Tissue Array Co-Occurrence Matrix Analysis (TACOMA), for quantifying cellular phenotypes based on textural regularity summarized by local inter-pixel relationships. The algorithm can be easily trained for any staining pattern, has no sensitive tuning parameters, and can report the salient pixels in an image that contribute to its score. Pathologists' input via informative training patches is an important aspect of the algorithm, allowing it to be trained for any specific marker or cell type. With co-training, the error rate of TACOMA can be reduced substantially for a very small training sample (e.g., of size 30). We give theoretical insights into the success of co-training via thinning of the feature set in a high-dimensional setting when there is "sufficient" redundancy among the features. TACOMA is flexible and transparent, and it provides a scoring process that can be evaluated with clarity and confidence. In a study based on an estrogen receptor (ER) marker, we show that TACOMA is comparable to, or outperforms, pathologists in terms of accuracy and repeatability.
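As an illustration of what "local inter-pixel relationships" can mean in practice, the following is a minimal sketch of a standard gray-level co-occurrence matrix, which counts how often gray level i appears next to gray level j at a fixed pixel offset. It assumes an image already quantized to a small number of integer levels. The abstract does not specify the quantization, the offsets, or the classifier that TACOMA uses, so the function names `cooccurrence_matrix` and `normalize`, the default offset, and the toy image are all hypothetical choices and not the authors' implementation.

```python
from typing import List, Tuple

Image = List[List[int]]


def cooccurrence_matrix(
    image: Image,
    levels: int,
    offset: Tuple[int, int] = (0, 1),
    symmetric: bool = True,
) -> List[List[int]]:
    """Count pairs (i, j): level i at a pixel, level j at the pixel `offset` away.

    `offset` is (row step, column step); (0, 1) means "right-hand neighbor".
    This is the textbook construction, assumed here for illustration only.
    """
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    glcm = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            # Skip pairs whose neighbor falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                glcm[i][j] += 1
                if symmetric:
                    # Also count the pair in the reverse direction.
                    glcm[j][i] += 1
    return glcm


def normalize(glcm: List[List[int]]) -> List[List[float]]:
    """Turn pair counts into relative frequencies that sum to 1."""
    total = sum(sum(row) for row in glcm)
    if total == 0:
        return [[0.0 for _ in row] for row in glcm]
    return [[v / total for v in row] for row in glcm]


# Toy 3x3 image with three gray levels (0, 1, 2); purely illustrative.
toy = [
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 1],
]
glcm = cooccurrence_matrix(toy, levels=3, offset=(0, 1), symmetric=False)
glcm_sym = cooccurrence_matrix(toy, levels=3, offset=(0, 1), symmetric=True)
freq = normalize(glcm_sym)
```

The entries of such a matrix (or its normalized form) can serve as a texture feature vector for a classifier trained on pathologist-selected patches; which classifier and which features TACOMA actually uses is not stated in the abstract.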