Department of Translational Hematology and Oncology Research, Lerner Research Institute, Cleveland, OH 44195, United States of America.
School of Medicine, Case Western Reserve University, Cleveland, OH 44195, United States of America.
Phys Med Biol. 2023 Aug 22;68(17):174001. doi: 10.1088/1361-6560/ace305.
Image texture features, such as those derived by Haralick, are powerful metrics for image classification and are used across fields including cancer research. Our aim is to demonstrate how analogous texture features can be derived for graphs and networks. We also aim to illustrate how these new metrics summarize graphs, may aid comparative graph studies, may help classify biological graphs, and might assist in detecting dysregulation in cancer.
We generate the first analogues of image texture for graphs and networks. Co-occurrence matrices for graphs are generated by summing over all pairs of neighboring nodes in the graph. We generate metrics for fitness landscapes, gene co-expression and regulatory networks, and protein interaction networks. To assess metric sensitivity, we varied discretization parameters and noise. To examine these metrics in the cancer context, we compare metrics for both simulated and publicly available experimental gene expression data and build random forest classifiers for cancer cell lineage.
Our novel graph 'texture' features are shown to be informative of graph structure and node label distributions. The metrics are sensitive to discretization parameters and noise in node labels. We demonstrate that graph texture features vary across different biological graph topologies and node labelings. We show how our texture metrics can be used to classify cell line expression by lineage, demonstrating classifiers with 82% and 89% accuracy.
New metrics provide opportunities for better comparative analyses and new models for classification. Our texture features are novel second-order graph features for networks or graphs with ordered node labels. In the complex cancer informatics setting, evolutionary analyses and drug response prediction are two examples where new network science approaches like this may prove fruitful.
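The co-occurrence construction described in the abstract (summing over all pairs of neighboring nodes) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `graph_cooccurrence`, `normalize`, `contrast`, and `energy` are hypothetical, node labels are assumed to be already discretized into integer levels `0..n_levels-1`, the graph is assumed undirected, and each edge is counted in both orderings so the matrix is symmetric (a common convention for image co-occurrence matrices that the paper may or may not follow). The two features shown are the standard Haralick contrast and energy (angular second moment) applied to the normalized matrix.

```python
def graph_cooccurrence(edges, labels, n_levels):
    """Count label pairs over all neighboring node pairs of an undirected graph.

    edges    : iterable of (u, v) node pairs
    labels   : mapping from node to a discretized integer label (assumed input)
    n_levels : number of discrete label levels
    """
    counts = [[0] * n_levels for _ in range(n_levels)]
    for u, v in edges:
        i, j = labels[u], labels[v]
        counts[i][j] += 1
        counts[j][i] += 1  # assumption: count both orderings so the matrix is symmetric
    return counts


def normalize(counts):
    """Turn raw counts into a joint probability matrix."""
    total = sum(sum(row) for row in counts)
    return [[c / total for c in row] for row in counts]


def contrast(p):
    """Haralick contrast: large when neighboring labels differ strongly."""
    n = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))


def energy(p):
    """Haralick energy (angular second moment): large when few label pairs dominate."""
    return sum(x * x for row in p for x in row)


# Toy example: a path graph 0-1-2-3 with three label levels.
edges = [(0, 1), (1, 2), (2, 3)]
labels = {0: 0, 1: 0, 2: 1, 3: 2}

C = graph_cooccurrence(edges, labels, n_levels=3)
P = normalize(C)
print(C)
print(contrast(P), energy(P))
```

Because the matrix depends on how continuous node labels (for example, expression values) are binned into levels, a sketch like this also makes it easy to see why the reported metrics are sensitive to discretization parameters.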