Department of Communication, Michigan State University, East Lansing, MI 48824, USA.
Bob Schieffer College of Communication, Texas Christian University, Fort Worth, TX 76129, USA.
Int J Environ Res Public Health. 2022 Jun 1;19(11):6759. doi: 10.3390/ijerph19116759.
Dictionary-based sentiment analysis (DSA) is popular and efficient in public health research, yet little empirical evidence exists about its validity or about the factors that threaten that validity. This study used a random sample from a secondary dataset of Ebola-related tweets to evaluate the validity of DSA against manual coding and to examine how textual features influence that validity. The results revealed substantial inconsistency between DSA and manual coding. The presence of certain textual features, such as negation, partially accounts for this inconsistency. The findings imply that scholars should be careful and critical when interpreting disease-related public health research that relies on DSA, and that certain textual features should be addressed more carefully in DSA.
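The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the lexicon, the negator list, the example tweets, and the manual labels are all hypothetical, and Cohen's kappa is assumed here as the agreement measure, which the abstract does not specify.

```python
from collections import Counter

# Hypothetical toy lexicon and negators; NOT the dictionary used in the study.
LEXICON = {"good": 1, "safe": 1, "hope": 1, "bad": -1, "afraid": -1, "deadly": -1}
NEGATORS = {"not", "no", "never"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    return [tok.strip(".,!?;:").lower() for tok in text.split()]


def dsa_label(text, handle_negation=False):
    """Sum word polarities; optionally flip a word preceded by a negator."""
    tokens = tokenize(text)
    score = 0
    for i, tok in enumerate(tokens):
        polarity = LEXICON.get(tok, 0)
        if handle_negation and polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two equally long label lists."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = set(labels_a) | set(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented example tweets with invented manual codes, for illustration only.
tweets = [
    "I am not afraid of Ebola",
    "Ebola is deadly",
    "The vaccine gives hope",
    "This is not good",
]
manual = ["positive", "negative", "positive", "negative"]

naive = [dsa_label(t) for t in tweets]
negation_aware = [dsa_label(t, handle_negation=True) for t in tweets]

kappa_naive = cohen_kappa(naive, manual)
kappa_negation_aware = cohen_kappa(negation_aware, manual)
```

On this toy data the naive scorer mislabels both negated tweets, so its agreement with the manual codes is no better than chance, while the negation-aware variant matches all four. Real tweets contain many more sources of disagreement, so this only illustrates the mechanism the abstract points to.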