Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai 400076, Maharashtra, India.
Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai 400076, Maharashtra, India; Interdisciplinary Program in Climate Studies, Indian Institute of Technology Bombay, Mumbai 400076, Maharashtra, India.
Sci Total Environ. 2022 Sep 15;839:156294. doi: 10.1016/j.scitotenv.2022.156294. Epub 2022 May 27.
Source Apportionment (SA) techniques are widely used to identify key sources of air pollution, thereby providing critical inputs for policy measures. Positive Matrix Factorisation (PMF) (Paatero and Tapper, 1994) is a widely used SA technique. PMF takes the speciated concentration data (X) collected over several days and factorises it into source contribution (G) and source profile (F) matrices, subject to a non-negativity constraint. To do so, it solves an optimisation problem in which the elements of X are weighted by the inverse of the standard deviations of the corresponding errors introduced during sampling and chemical analysis. PMF thus implicitly assumes that the errors in different elements of X are uncorrelated. This assumption may not hold, since the sampling and chemical analysis steps deployed in any data-collection campaign will inevitably lead to correlated errors. Other Non-Negative Matrix Factorisation (NMF) methods in the literature could potentially be used for SA, but these also make restrictive assumptions about the error covariance structure. In this work, we propose a new method called Generalised Non-Negative Matrix Factorisation (GNMF) to fill this gap. In particular, the proposed method can incorporate any error covariance matrix without making restrictive assumptions about its structure. To this end, we integrate the full error covariance matrix into the objective function that is minimised to obtain the F and G matrices, and we derive the corresponding update rules for obtaining these matrices iteratively. To ensure non-negativity, we extend the multiplicative and projected gradient-based ideas available in the NMF literature to the proposed GNMF approach. The proposed method subsumes various NMF methods available in the literature as special cases. The utility of the proposed approach is demonstrated by comparing its performance with other methods on an SA problem using a dataset derived from field measurements.
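The idea of placing the full error covariance matrix in the objective can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm or their update rules, which the abstract does not state. It assumes that X is n-by-m, G is n-by-p, F is p-by-m, that the residual X - GF is vectorised row by row, and that `Sigma_inv` is the symmetric inverse of the (n·m)-by-(n·m) error covariance matrix. The function names, the fixed step size, and the plain projected-gradient step are all hypothetical choices made for illustration.

```python
import numpy as np


def gnmf_objective(X, G, F, Sigma_inv):
    """Weighted residual r^T Sigma_inv r, with r = vec(X - G F) (row-major)."""
    r = (X - G @ F).ravel()
    return float(r @ Sigma_inv @ r)


def gnmf_gradients(X, G, F, Sigma_inv):
    """Gradients of the objective with respect to G and F.

    Assumes Sigma_inv is symmetric, so d(r^T S r) = 2 (S r)^T dr.
    """
    n, m = X.shape
    r = (X - G @ F).ravel()
    W = (Sigma_inv @ r).reshape(n, m)  # weighted residual, back in matrix form
    grad_G = -2.0 * W @ F.T
    grad_F = -2.0 * G.T @ W
    return grad_G, grad_F


def projected_gradient_step(X, G, F, Sigma_inv, step):
    """One gradient step on G and F, then projection onto the non-negative orthant."""
    grad_G, grad_F = gnmf_gradients(X, G, F, Sigma_inv)
    G_new = np.maximum(G - step * grad_G, 0.0)
    F_new = np.maximum(F - step * grad_F, 0.0)
    return G_new, F_new
```

Under these assumptions, a diagonal `Sigma_inv` holding the inverse error variances makes `gnmf_objective` equal to the sum of squared residuals each divided by its standard deviation, that is, the uncorrelated-error weighting the abstract attributes to PMF. A fixed step size does not guarantee a decrease in the objective; a practical implementation would need a line search or another step-size rule.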