Bellandi Valerio, Ceravolo Paolo, Maghool Samira, Siccardi Stefano
Department of Computer Science, Università degli Studi di Milano, Milan, Italy.
CINI-Consorzio Interuniversitario Nazionale per l'Informatica, Rome, Italy.
Big Data. 2022 Oct;10(5):408-424. doi: 10.1089/big.2021.0326. Epub 2022 Jun 6.
Multimodal analytics in Big Data architectures implies compounded configurations of the data processing tasks. Each data modality requires its own analytics, which in turn triggers its own data processing tasks. Scalability is achievable only by carefully calibrating the resources shared among these tasks, trading off the multiple requirements they impose. We propose a methodology that addresses multimodal analytics within a single data processing approach, yielding a simplified architecture that can fully exploit the parallel processing of Big Data infrastructures. Multiple data sources are first integrated into a unified knowledge graph (KG). Each data modality is then addressed by specifying a view on the KG and producing a rewriting of the graph that contains only the data to be processed, which speeds up graph traversal and rule extraction. Using graph embedding methods, the different views can be transformed into low-dimensional representations that share the same data format. A single machine learning procedure can therefore address all modalities, which simplifies the architecture of our system. Our experiments show that the approach reduces the cost of execution and improves the accuracy of the analytics.
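As an illustration only, the pipeline outlined in the abstract (a unified KG, one view per modality, a graph rewriting per view, embeddings in a common format, and one downstream procedure) might be sketched as follows. The triples, the view names, the hashing-based embedding, and the nearest-neighbour step are all assumptions made for this sketch; they stand in for the graph embedding and machine learning methods used in the paper, which the abstract does not specify.

```python
import hashlib
from collections import defaultdict

# Toy unified knowledge graph as (subject, predicate, object) triples.
# All entities and predicates are hypothetical.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "bought", "laptop"),
    ("carol", "bought", "phone"),
    ("laptop", "in_category", "electronics"),
    ("phone", "in_category", "electronics"),
]

# Assumption: a view is the set of predicates relevant to one modality.
VIEWS = {
    "social": {"knows"},
    "commerce": {"bought", "in_category"},
}


def rewrite(triples, predicates):
    """Rewrite the graph so it keeps only the triples selected by the view."""
    return [t for t in triples if t[1] in predicates]


def neighbours(triples):
    """Build an undirected adjacency map from the triples."""
    adj = defaultdict(set)
    for s, _, o in triples:
        adj[s].add(o)
        adj[o].add(s)
    return adj


def _bucket(token, dim):
    """Deterministically map a token to one of `dim` vector positions."""
    return int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim


def embed(triples, dim=8):
    """Stand-in embedding: hash each node and its 1-hop neighbours into a
    fixed-length vector, then L2-normalise. Not a real embedding method."""
    vectors = {}
    for node, nbrs in neighbours(triples).items():
        vec = [0.0] * dim
        for token in nbrs | {node}:
            vec[_bucket(token, dim)] += 1.0
        norm = sum(x * x for x in vec) ** 0.5
        vectors[node] = [x / norm for x in vec]
    return vectors


def most_similar(vectors, node):
    """Single downstream procedure: cosine nearest neighbour of `node`."""
    query = vectors[node]
    scores = {
        other: sum(a * b for a, b in zip(query, vec))
        for other, vec in vectors.items()
        if other != node
    }
    return max(scores, key=scores.get)


# Every view yields vectors of the same length, so one procedure serves all.
embeddings = {name: embed(rewrite(TRIPLES, preds)) for name, preds in VIEWS.items()}
```

Because every view ends up as a mapping from node to a vector of the same length, `most_similar(embeddings["social"], "alice")` and `most_similar(embeddings["commerce"], "alice")` run through identical code, which is the architectural simplification the abstract describes.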