Egozcue Juan José, Graffelman Jan, Ortego M Isabel, Pawlowsky-Glahn Vera
Department of Civil and Environmental Engineering, Universitat Politecnica de Catalunya, Barcelona, 08034 Spain.
Department of Statistics and Operations Research, Universitat Politecnica de Catalunya, Barcelona, 08034 Spain.
NAR Genom Bioinform. 2020 Nov 20;2(4):lqaa094. doi: 10.1093/nargab/lqaa094. eCollection 2020 Dec.
Measurements in sequencing studies are mostly based on counts. There is a lack of theoretical developments for the analysis and modelling of this type of data. Some thoughts in this direction are presented, which might serve as a seed. The main issues addressed are the compositional character of multinomial probabilities and the corresponding representation in orthogonal (isometric) coordinates, and modelling distributions for sequencing data taking into account possible effects of amplification techniques.
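As a minimal illustration of the orthogonal (isometric) coordinates mentioned above, the sketch below maps a vector of multinomial probabilities to isometric log-ratio (ilr) coordinates. It assumes strictly positive probabilities and uses one particular Helmert-type orthonormal basis. The function names `clr`, `ilr_basis` and `ilr` are illustrative choices, not taken from the paper, and the basis is only one of many valid ones.

```python
import numpy as np


def clr(p):
    """Centred log-ratio: log of each part minus the mean log (parts must be > 0)."""
    logp = np.log(np.asarray(p, dtype=float))
    return logp - logp.mean()


def ilr_basis(D):
    """Return a (D-1) x D matrix whose rows are orthonormal and sum to zero.

    This is a Helmert-type basis, chosen here for illustration only.
    """
    V = np.zeros((D - 1, D))
    for i in range(1, D):
        V[i - 1, :i] = 1.0 / i          # first i parts get equal positive weight
        V[i - 1, i] = -1.0              # contrasted against part i + 1
        V[i - 1] *= np.sqrt(i / (i + 1.0))  # scale the row to unit length
    return V


def ilr(p):
    """Project the clr vector onto the orthonormal basis, giving D-1 coordinates."""
    p = np.asarray(p, dtype=float)
    return ilr_basis(p.size) @ clr(p)


if __name__ == "__main__":
    probs = np.array([0.5, 0.3, 0.15, 0.05])  # hypothetical multinomial probabilities
    print(ilr(probs))
```

Because the coordinates depend only on ratios between parts, rescaling the input (for example, passing expected counts instead of probabilities) leaves them unchanged, and the Euclidean norm of the ilr vector equals that of the clr vector.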