Moscow Institute of Physics and Technology, Moscow, Russian Federation.
National Medical Research Center for Endocrinology, Moscow, Russian Federation.
F1000Res. 2021 Dec 8;10:1260. doi: 10.12688/f1000research.74846.2. eCollection 2021.
A Molecular Features Set (MFS) is the output of any of a wide variety of bioinformatics pipelines. Because most experimental data modalities lack a "gold standard", it is difficult to give a valid estimate of the quality of a particular MFS. This goal can nevertheless be partially achieved by analyzing inter-sample Distance Matrices (DM) and their power to distinguish between phenotypes. The quality of a DM can be assessed by summarizing how well it separates within-phenotype distances from between-phenotype distances. This estimate of DM quality can in turn be interpreted as a measure of the quality of the MFS. Here we propose Hobotnica, an approach that estimates the quality of MFSs by their ability to stratify data and assigns them significance scores, which makes it possible to collate various signatures and compare their quality for contrasting groups.
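The idea of scoring a distance matrix by how well it separates within-phenotype from between-phenotype distances can be illustrated with a minimal sketch. The score below is a generic rank-based separation measure chosen for illustration; it is not claimed to be the statistic used by Hobotnica. The function names `separation_score` and `permutation_pvalue`, the label-permutation test, and the toy data are all assumptions introduced here.

```python
import numpy as np


def separation_score(dm, labels):
    """Illustrative score (an assumption, not Hobotnica's own statistic).

    Returns the fraction of (between-phenotype, within-phenotype) distance
    pairs in which the between-phenotype distance is the larger one.
    Ties count as one half. 1.0 means perfect separation, about 0.5 means none.
    """
    dm = np.asarray(dm, dtype=float)
    labels = np.asarray(labels)
    rows, cols = np.triu_indices(len(labels), k=1)  # each sample pair once
    distances = dm[rows, cols]
    same_group = labels[rows] == labels[cols]
    within = distances[same_group]
    between = distances[~same_group]
    diff = between[:, None] - within[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())


def permutation_pvalue(dm, labels, n_perm=999, seed=0):
    """Significance of the score under random relabeling of the samples."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = separation_score(dm, labels)
    hits = sum(
        separation_score(dm, rng.permutation(labels)) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): one feature, two clearly separated phenotypes.
values = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
labels = np.array(["A", "A", "A", "B", "B", "B"])
dm = np.abs(values[:, None] - values[None, :])  # pairwise distance matrix

score, p_value = permutation_pvalue(dm, labels)
print(f"score={score:.3f}, p={p_value:.3f}")
```

With only six samples the permutation p-value cannot become very small, because few distinct relabelings exist; comparing signatures in a meaningful way would need larger groups.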