Department of Statistics, Texas A&M University, College Station, Texas, USA.
Stat Med. 2023 Jun 15;42(13):2257-2273. doi: 10.1002/sim.9722. Epub 2023 Mar 31.
Accurate and efficient detection of ovarian cancer at early stages is critical to ensuring that patients receive proper treatment. Among the first-line modalities investigated in studies of early diagnosis are features distilled from protein mass spectra. This approach, however, considers only a specific subset of spectral responses and ignores the interplay among protein expression levels, which can also contain diagnostic information. We propose a new modality that automatically searches protein mass spectra for discriminatory features by considering the self-similar nature of the spectra. Self-similarity is assessed by taking a wavelet decomposition of protein mass spectra and estimating the rate of level-wise decay in the energies of the resulting wavelet coefficients. Level-wise energies are estimated in a robust manner using distance variance, and rates are estimated locally via a rolling window approach. This yields a collection of rates that can be used to characterize the interplay among proteins, which can be indicative of cancer presence. Discriminatory descriptors are then selected from these evolutionary rates and used as classifying features. The proposed wavelet-based features are used in conjunction with features proposed in the existing literature for early-stage diagnosis of ovarian cancer using two datasets published by the American National Cancer Institute. Including the wavelet-based features from the new modality improves diagnostic performance for early-stage ovarian cancer detection. This demonstrates the ability of the proposed modality to capture new diagnostic information for ovarian cancer.
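The pipeline described in the abstract (wavelet decomposition, robust level-wise energies, locally estimated rates) can be illustrated with a minimal sketch. This is not the authors' implementation; the abstract does not give those details. The sketch assumes a Haar wavelet, a signal whose length is a power of two, the squared sample distance variance (computed from the double-centered pairwise distance matrix) as the level-wise energy, and a rolling window over consecutive decomposition levels. The function names, the window width of 3, and the minimum of 32 coefficients per level are all hypothetical choices made for illustration.

```python
import numpy as np


def haar_details(signal):
    """Orthonormal Haar detail coefficients, finest level first.

    Assumes the signal length is a power of two (an illustrative restriction).
    """
    x = np.asarray(signal, dtype=float)
    n = x.size
    if n < 2 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    details = []
    approx = x
    while approx.size >= 2:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # detail coefficients
        approx = (even + odd) / np.sqrt(2.0)         # coarser approximation
    return details


def distance_variance_sq(x):
    """Squared sample distance variance of a 1-D sample.

    Mean of the squared entries of the double-centered distance matrix.
    Memory use is O(n^2), so this is only suitable for moderate n.
    """
    x = np.asarray(x, dtype=float)
    dist = np.abs(x[:, None] - x[None, :])
    row_mean = dist.mean(axis=1, keepdims=True)
    col_mean = dist.mean(axis=0, keepdims=True)
    centered = dist - row_mean - col_mean + dist.mean()
    return float(np.mean(centered ** 2))


def rolling_decay_rates(signal, window=3, min_coeffs=32):
    """Slopes of log2 level-wise energy against level, over rolling windows.

    Levels are indexed 1, 2, ... from finest to coarsest, so a self-similar
    signal gives positive slopes here; indexing levels in the opposite
    direction turns the same quantity into a rate of decay.
    """
    details = [d for d in haar_details(signal) if d.size >= min_coeffs]
    if len(details) < window:
        raise ValueError("not enough usable levels for the chosen window")
    log_energy = np.array([np.log2(distance_variance_sq(d)) for d in details])
    levels = np.arange(1, len(details) + 1)
    rates = []
    for start in range(len(levels) - window + 1):
        sl = slice(start, start + window)
        slope = np.polyfit(levels[sl], log_energy[sl], 1)[0]
        rates.append(slope)
    return np.array(rates)


# Usage on a synthetic random walk (a stand-in for a spectrum, not real data).
rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(2048))
print(rolling_decay_rates(walk))
```

For a Gaussian random walk, the slopes should come out roughly near 2, which is the value expected for a self-similar process with Hurst exponent 0.5 under this level indexing. In a classification setting, each spectrum would contribute one such vector of rates, from which discriminatory entries could then be selected as features.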