University of Jinan, Jinan 250022, China.
Comput Intell Neurosci. 2022 Aug 29;2022:3459605. doi: 10.1155/2022/3459605. eCollection 2022.
To improve key information extraction from digital archives, a key information extraction algorithm for different types of digital archives is designed. Digital archive information is first preprocessed, with part of speech and markers taken as key information. A self-organizing feature mapping network then extracts the key information features of the archives, and semantic similarity is calculated from the extracted features. Using the mutual information set, the word with the highest mutual information value is taken as the set center; all keywords are traversed, and the central word is taken as the key information of the digital archive, which completes the extraction. Experiments show that the recall rate of the algorithm ranges from 96% to 99%, the extraction accuracy for key information is between 96% and 98%, and the average extraction time is 0.63 s. The algorithm performs well in practical application.
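The final step, choosing a central word by mutual information, can be illustrated with a minimal sketch. The abstract does not specify how mutual information is computed, so everything here is an assumption: the sketch uses pointwise mutual information (PMI) over document-level co-occurrence, sums each keyword's PMI with its co-occurring keywords, and takes the highest-scoring keyword as the center. The function names `pmi_table` and `central_keyword` and the sample keyword lists are hypothetical and do not come from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(docs):
    """Pointwise mutual information for each keyword pair that co-occurs in a document.

    Assumption: probabilities are estimated from document frequencies.
    """
    n = len(docs)
    word_df = Counter()   # number of documents containing each keyword
    pair_df = Counter()   # number of documents containing each keyword pair
    for doc in docs:
        words = sorted(set(doc))          # sorted so each pair has one canonical order
        word_df.update(words)
        pair_df.update(combinations(words, 2))
    return {
        (a, b): math.log((c / n) / ((word_df[a] / n) * (word_df[b] / n)))
        for (a, b), c in pair_df.items()
    }


def central_keyword(docs):
    """Return the keyword with the highest summed PMI over its co-occurring keywords."""
    totals = Counter()
    for (a, b), value in pmi_table(docs).items():
        totals[a] += value
        totals[b] += value
    # Iterate in sorted order so ties are broken alphabetically.
    return max(sorted(totals), key=lambda w: totals[w])


# Hypothetical keyword lists, one per preprocessed archive record.
docs = [
    ["archive", "retention", "schedule"],
    ["archive", "retention", "policy"],
    ["archive", "metadata"],
    ["invoice", "payment"],
]
center = central_keyword(docs)
```

PMI rewards pairs that co-occur more often than their individual frequencies would predict, so a very frequent word such as "archive" does not automatically become the center. A production version would likely need smoothing for rare words, which this sketch omits.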