BMC Bioinformatics. 2014;15 Suppl 12(Suppl 12):S8. doi: 10.1186/1471-2105-15-S12-S8. Epub 2014 Nov 6.
The 3D chromatogram generated by High Performance Liquid Chromatography-Diode Array Detector (HPLC-DAD) has been studied widely in fields such as herbal medicine, grape wine, agriculture and petroleum. Currently, most methods for separating a 3D chromatogram require the number of compounds to be known in advance, which may be impossible, especially when the compounds are complex or white noise is present. A new method that extracts compounds directly from the 3D chromatogram is needed.
In this paper, a new separation model named parallel Independent Component Analysis constrained by Reference Curve (pICARC) is proposed, which transforms the separation problem into a multi-parameter optimization problem. The number of compounds does not need to be known for the optimization. To find all the solutions, an algorithm named multi-areas Genetic Algorithm (mGA) is proposed, in which multiple areas of candidate solutions are constructed according to the fitness of the chromosomes and the distances among them.
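The idea of building several areas of candidate solutions from fitness and distance can be illustrated with a small sketch. The grouping rule below is an assumption chosen for illustration and is not taken from the paper: chromosomes are visited from fittest to least fit, and each one either joins the nearest existing area whose founding chromosome lies within a hypothetical `radius`, or founds a new area. The function name `build_areas`, the Euclidean distance, and the toy two-peak fitness landscape are likewise hypothetical.

```python
import numpy as np

def build_areas(population, fitness, radius):
    """Group chromosomes into areas around locally fittest members.

    Hypothetical rule (not the paper's mGA): visit chromosomes in order of
    decreasing fitness; each joins the nearest area whose founding
    chromosome is within `radius`, otherwise it founds a new area.
    """
    order = np.argsort(-fitness)   # indices sorted from fittest to least fit
    centers = []                   # index of the founding chromosome per area
    areas = []                     # member indices per area
    for i in order:
        if centers:
            # Euclidean distance from chromosome i to every area founder
            d = np.linalg.norm(population[centers] - population[i], axis=1)
            j = int(np.argmin(d))
            if d[j] <= radius:
                areas[j].append(int(i))
                continue
        centers.append(int(i))
        areas.append([int(i)])
    return centers, areas

# Toy example: 21 one-parameter chromosomes on a fitness landscape with two peaks
x = np.linspace(0.0, 1.0, 21)
population = x[:, None]
fitness = np.exp(-((x - 0.25) / 0.1) ** 2) + 0.8 * np.exp(-((x - 0.75) / 0.1) ** 2)

centers, areas = build_areas(population, fitness, radius=0.3)
for c, members in zip(centers, areas):
    print(f"area founded at x={x[c]:.2f} with {len(members)} members")
```

Keeping one area per promising region is what lets a search of this kind return several solutions at once instead of converging on a single optimum; in this toy case each area corresponds to one fitness peak.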
Simulations and experiments on a real-life HPLC-DAD data set were used to demonstrate the method and its effectiveness. The simulations show that the method can successfully separate a 3D chromatogram into chromatographic peaks and spectra, even when they overlap severely. The experiments show that the method is also effective on a real HPLC-DAD data set.
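To make the simulated setting concrete, the following minimal sketch generates a synthetic 3D chromatogram with two severely overlapping peaks. It assumes the bilinear model commonly used for HPLC-DAD data, in which the data matrix is the product of elution profiles and spectra plus noise; that model, the Gaussian shapes, and every numeric value here are assumptions for illustration, not the simulation settings of the paper.

```python
import numpy as np

def gaussian(x, center, width):
    """Unit-height Gaussian profile evaluated on the grid x."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

def simulate_chromatogram(noise_sd=0.01, seed=0):
    """Build a noisy two-compound data matrix X = C @ S.T + noise."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 10.0, 200)      # retention time axis (arbitrary units)
    wl = np.linspace(200.0, 400.0, 80)   # wavelength axis (nm)
    # Elution profiles: peak centers only 0.6 apart for a width of 0.5
    C = np.column_stack([gaussian(t, 4.8, 0.5), gaussian(t, 5.4, 0.5)])
    # Spectra: broad, partially overlapping absorption bands
    S = np.column_stack([gaussian(wl, 260.0, 25.0), gaussian(wl, 300.0, 30.0)])
    # White noise added to the bilinear signal
    X = C @ S.T + noise_sd * rng.standard_normal((t.size, wl.size))
    return t, wl, C, S, X

t, wl, C, S, X = simulate_chromatogram()
singular_values = np.linalg.svd(X, compute_uv=False)
print(X.shape)
print(np.round(singular_values[:4], 3))
```

Under this assumed model, the separation task amounts to recovering the columns of `C` (chromatographic peaks) and `S` (spectra) from `X` alone. The printed singular values give a rough view of how far the two-compound signal stands above the noise floor.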
Our method can separate a 3D chromatogram successfully without knowing the number of compounds in advance, and it is fast and effective.