Jilin University of Architecture and Technology, Changchun, Jilin 130114, China.
Comput Intell Neurosci. 2022 May 14;2022:7559523. doi: 10.1155/2022/7559523. eCollection 2022.
With the rapid development of information technology, the volume of data in digital archives has grown explosively. How to mine and analyze archive data sensibly, and thereby improve the intelligent management of newly included archives, has become an urgent problem. Existing archive classification is performed manually and is oriented toward management needs; it is inefficient and ignores the inherent content of the archives. Moreover, discovering and using archive information requires further exploration of the correlations among the contents of archive data. To meet the needs of intelligent archive management, this paper further analyzes manually classified archives from the perspective of their text content and proposes an intelligent classification method for archive data based on multigranular semantics. First, a semantic-label multigranular attention model is constructed: the outputs of the stacked expanded convolutional encoding module and the label graph attention module are jointly fed into the multigranular attention mechanism network; the weighted labels output by that network serve as the input to a fully connected layer; and the output of the fully connected layer, which maps to the predicted labels, is passed through a sigmoid layer to obtain the predicted probability of each label. Second, the model is trained: a multilabel data set is used to train the constructed model, and the parameters are adjusted until the model converges, yielding the trained semantic-label multigranular attention model. Finally, given the multilabel data set to be classified as input, the trained model outputs the classification results.
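The forward pass described in the abstract (encoder output and label representation combined by attention, then a fully connected layer and a sigmoid) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. All function and variable names are hypothetical, random vectors stand in for the outputs of the stacked expanded convolutional encoding module and the label graph attention module, and a single scaled dot-product attention replaces the multigranular attention network, whose details the abstract does not give. The binary cross-entropy loss is likewise an assumption, chosen because it is a common pairing with per-label sigmoid outputs.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def label_attention(token_feats, label_feats):
    """For each label, attend over the tokens and build a label-specific text vector.

    Assumption: one scaled dot-product attention stands in for the
    multigranular attention network described in the abstract.
    """
    dim = len(label_feats[0])
    attended = []
    for label in label_feats:
        weights = softmax([dot(label, tok) / math.sqrt(dim) for tok in token_feats])
        attended.append([
            sum(w * tok[k] for w, tok in zip(weights, token_feats))
            for k in range(dim)
        ])
    return attended


def predict_probabilities(attended, fc_weights, fc_bias):
    """Fully connected layer (one logit per label) followed by a sigmoid."""
    return [sigmoid(dot(vec, w) + b)
            for vec, w, b in zip(attended, fc_weights, fc_bias)]


def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean per-label binary cross-entropy (assumed training loss)."""
    total = sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for p, t in zip(probs, targets))
    return -total / len(probs)


# Toy sizes and random stand-in features (hypothetical values).
rng = random.Random(0)
NUM_TOKENS, NUM_LABELS, DIM = 6, 4, 8
token_feats = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_TOKENS)]
label_feats = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_LABELS)]
fc_weights = [[rng.gauss(0, 0.1) for _ in range(DIM)] for _ in range(NUM_LABELS)]
fc_bias = [0.0] * NUM_LABELS
targets = [1, 0, 0, 1]  # multilabel ground truth for one document

attended = label_attention(token_feats, label_feats)
probs = predict_probabilities(attended, fc_weights, fc_bias)
predicted = [int(p >= 0.5) for p in probs]  # 0.5 threshold is an assumption
loss = binary_cross_entropy(probs, targets)
print(probs, predicted, loss)
```

Because each label receives its own sigmoid output, a document can be assigned several labels at once, which is what distinguishes this multilabel setting from single-label softmax classification. Training would repeat the forward pass over the multilabel data set and update the parameters until the loss stops decreasing.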