Vrochidis Stefanos, Moumtzidou Anastasia, Gialampoukidis Ilias, Liparas Dimitris, Casamayor Gerard, Wanner Leo, Heise Nicolaus, Wagner Tilman, Bilous Andriy, Jamin Emmanuel, Simeonov Boyan, Alexiev Vladimir, Busch Reinhard, Arapakis Ioannis, Kompatsiaris Ioannis
Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece.
High Performance Computing Centre, University of Stuttgart, Stuttgart, Germany.
Front Robot AI. 2018 Oct 29;5:123. doi: 10.3389/frobt.2018.00123. eCollection 2018.
Analysts and journalists must deal with very large, heterogeneous, and multilingual volumes of data that need to be analyzed, understood, and aggregated. An automated and simplified editorial and authoring process could significantly reduce time, labor, and costs. There is therefore a need for unified access to multilingual and multicultural news story material beyond the national level, ensuring context-aware, spatiotemporal, and semantic interpretation, and correlating and summarizing the interpreted material into a coherent gist. In this paper, we present a platform that integrates multimodal analytics techniques to support journalists in handling large streams of real-time and diverse information. Specifically, the platform automatically crawls and indexes multilingual and multimedia information from heterogeneous resources. Textual information is automatically summarized and can be translated on demand into the journalist's language. High-level information is extracted from both textual and multimedia content for fast inspection using concept clouds. The textual and multimedia content is semantically integrated and indexed using a common representation, and is made accessible through a web-based search engine. The platform was evaluated by several groups of journalists, whose feedback indicated user satisfaction.