Ghosh Sohom, Yadav Shefali, Wang Xin, Chakrabarty Bibhash, Kadıoğlu Serdar
AI Center of Excellence, Fidelity Investments, Boston, MA, United States.
Department of Computer Science, Brown University, Providence, RI, United States.
Front Artif Intell. 2022 Jul 12;5:868085. doi: 10.3389/frai.2022.868085. eCollection 2022.
Sequential pattern mining remains a challenging task due to the large number of redundant candidate patterns and the exponential search space. In addition, further analysis is still required to map extracted patterns to different outcomes. In this paper, we introduce a pattern mining framework that operates on semi-structured datasets and exploits the dichotomy between outcomes. Our approach uses constraint reasoning to find sequential patterns that occur frequently and exhibit desired properties. This enables the creation of novel pattern embeddings that are useful for knowledge extraction and predictive modeling. Based on dichotomic pattern mining, we present two real-world applications: customer intent prediction and intrusion detection. Overall, our approach acts as an integrator between semi-structured sequential data and machine learning models, improves the performance of the downstream task, and retains interpretability.
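The idea of exploiting the dichotomy between outcomes can be illustrated with a minimal Python sketch. This is not the authors' implementation: it uses brute-force subsequence enumeration with a single frequency constraint in place of constraint reasoning, and the function names (`frequent_patterns`, `dichotomic_patterns`, `embed`) and the toy clickstream data are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def frequent_patterns(sequences, min_support, max_len=2):
    """Brute-force miner (illustrative only): collect every subsequence of
    length 1..max_len that appears in at least `min_support` sequences."""
    counts = {}
    for seq in sequences:
        seen = set()  # count each pattern at most once per sequence
        for k in range(1, max_len + 1):
            for idx in combinations(range(len(seq)), k):
                seen.add(tuple(seq[i] for i in idx))
        for pattern in seen:
            counts[pattern] = counts.get(pattern, 0) + 1
    return {p for p, c in counts.items() if c >= min_support}


def dichotomic_patterns(positive, negative, min_support, max_len=2):
    """Mine each outcome group separately, then keep the patterns that are
    frequent in one group but not in the other."""
    freq_pos = frequent_patterns(positive, min_support, max_len)
    freq_neg = frequent_patterns(negative, min_support, max_len)
    return freq_pos - freq_neg, freq_neg - freq_pos


def embed(sequence, patterns):
    """Binary pattern embedding: one feature per pattern, 1 if it occurs."""
    return [1 if is_subsequence(p, sequence) else 0 for p in patterns]


# Hypothetical toy clickstreams, split by outcome (purchase vs. no purchase).
positive = [
    ["view", "cart", "buy"],
    ["view", "cart", "pay", "buy"],
    ["search", "cart", "buy"],
]
negative = [
    ["view", "search", "exit"],
    ["view", "cart", "exit"],
    ["search", "view", "exit"],
]

pos_only, neg_only = dichotomic_patterns(positive, negative, min_support=2)

# A fixed feature order makes the embedding reproducible.
features = sorted(pos_only) + sorted(neg_only)
vector = embed(["view", "cart", "buy"], features)
```

Each entry of `vector` corresponds to one named pattern in `features`, so a downstream classifier trained on such vectors stays interpretable. Patterns that are frequent under both outcomes, such as `("view",)` in this toy data, are discarded because they do not discriminate between the groups.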