
Mining association language patterns using a distributional semantic model for negative life event classification.

Affiliation

Department of Information Management, Yuan Ze University, Chung-Li, Taiwan, ROC.

Publication information

J Biomed Inform. 2011 Aug;44(4):509-18. doi: 10.1016/j.jbi.2011.01.006. Epub 2011 Feb 1.

Abstract

PURPOSE

Negative life events, such as the death of a family member, an argument with a spouse or the loss of a job, play an important role in triggering depressive episodes. Therefore, it is worthwhile to develop psychiatric services that can automatically identify such events. This study describes the use of association language patterns, i.e., meaningful combinations of words (e.g., <loss, job>), as features to classify sentences with negative life events into predefined categories (e.g., Family, Love, Work).

METHODS

This study proposes a framework that combines a supervised data mining algorithm and an unsupervised distributional semantic model to discover association language patterns. The data mining algorithm, called association rule mining, was used to generate a set of seed patterns by incrementally associating frequently co-occurring words from a small corpus of sentences labeled with negative life events. The distributional semantic model was then used to discover more patterns similar to the seed patterns from a large, unlabeled web corpus.
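
The two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the mining thresholds or the distributional semantic model, so the function names (`mine_seed_patterns`, `context_vectors`, `expand_pattern`), the `min_support` and `threshold` parameters, the use of sentence-level co-occurrence vectors with cosine similarity, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
import math


def mine_seed_patterns(labeled, min_support=2):
    """Apriori-style step: keep frequent words, then frequent pairs of them, per label."""
    by_label = {}
    for words, label in labeled:
        by_label.setdefault(label, []).append(set(words))
    patterns = {}
    for label, sents in by_label.items():
        word_counts = Counter(w for s in sents for w in s)
        frequent = {w for w, c in word_counts.items() if c >= min_support}
        # Only pairs built from frequent words can themselves be frequent.
        pair_counts = Counter(
            pair for s in sents for pair in combinations(sorted(s & frequent), 2)
        )
        patterns[label] = {p for p, c in pair_counts.items() if c >= min_support}
    return patterns


def context_vectors(unlabeled):
    """Sentence-level co-occurrence counts, a crude stand-in for a distributional model."""
    vecs = {}
    for words in unlabeled:
        uniq = set(words)
        for w in uniq:
            ctx = vecs.setdefault(w, Counter())
            for c in uniq - {w}:
                ctx[c] += 1
    return vecs


def cosine(a, b):
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def expand_pattern(pattern, vecs, threshold=0.5):
    """Replace one word of a seed pattern with each distributionally similar word."""
    expanded = set()
    for i, word in enumerate(pattern):
        if word not in vecs:
            continue
        for cand, cand_vec in vecs.items():
            if cand in pattern:
                continue
            if cosine(vecs[word], cand_vec) >= threshold:
                new = list(pattern)
                new[i] = cand
                expanded.add(tuple(sorted(new)))
    return expanded


# Toy data, invented for illustration only.
labeled = [
    (["lost", "job", "yesterday"], "Work"),
    (["lost", "job", "again"], "Work"),
    (["argument", "wife"], "Family"),
    (["argument", "wife", "today"], "Family"),
]
unlabeled = [
    ["lost", "my", "job"],
    ["lost", "my", "work"],
    ["found", "new", "job"],
    ["found", "new", "work"],
]

seeds = mine_seed_patterns(labeled)
vecs = context_vectors(unlabeled)
new_patterns = expand_pattern(("job", "lost"), vecs)
print(seeds)
print(new_patterns)
```

In this toy run, "job" and "work" occur in identical contexts in the unlabeled sentences, so the seed pattern for the Work category gains a variant with "work" in place of "job" without any additional labeled data. A real system would need a much larger corpus and tuned thresholds.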

RESULTS

The experimental results showed that association language patterns were significant features for negative life event classification. Additionally, the unsupervised distributional semantic model was not only able to improve the level of performance but also to reduce the reliance of the classification process on the availability of a large, labeled corpus.
