School of Library and Information Science and Department of Pathology and Laboratory Medicine, University of Kentucky, 339 Lucille Little Building, Lexington, KY 40506-0224, USA.
J Med Libr Assoc. 2009 Oct;97(4):260-6. doi: 10.3163/1536-5050.97.4.009.
The efficacy of user-defined subject tagging and software-generated subject tagging for describing and organizing cancer blog contents was explored.
The Technorati search engine was used to search the blogosphere for cancer blog postings generated during a two-month period. Postings were mined for relevant subject concepts, and blogger-defined tags and Text Analysis Portal for Research (TAPoR) software-defined tags were generated for each message. Descriptive data were collected, and the blogger-defined tags were compared with software-generated tags. Three standard vocabularies (Opinion Templates, Basic Resource, and Medical Subject Headings [MeSH] Resource) were used to assign subject terms to the blogs, with results compared for efficacy in information retrieval.
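The tag comparison described above can be illustrated with a minimal sketch. The abstract does not say how the two tag sets were compared, so the Jaccard overlap measure, the `blogger_tags` and `software_tags` field names, and the sample postings below are all assumptions made for illustration, not the authors' method or data.

```python
from statistics import mean


def normalize(tags):
    """Lowercase and trim tags, dropping empties and duplicates."""
    return {t.strip().lower() for t in tags if t.strip()}


def jaccard(a, b):
    """Share of distinct tags that two tag lists have in common (0.0 to 1.0)."""
    a, b = normalize(a), normalize(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def compare_postings(postings):
    """Summarize tag counts and overlap across a list of postings."""
    blogger_counts = [len(normalize(p["blogger_tags"])) for p in postings]
    software_counts = [len(normalize(p["software_tags"])) for p in postings]
    overlaps = [jaccard(p["blogger_tags"], p["software_tags"]) for p in postings]
    return {
        "mean_blogger_tags": mean(blogger_counts),
        "mean_software_tags": mean(software_counts),
        "mean_overlap": mean(overlaps),
    }


# Hypothetical postings, invented for this example only.
postings = [
    {
        "blogger_tags": ["Breast Cancer", "chemo"],
        "software_tags": ["breast cancer", "treatment", "chemo", "hospital"],
    },
    {
        "blogger_tags": ["leukemia"],
        "software_tags": ["leukemia", "blood", "marrow"],
    },
]

summary = compare_postings(postings)
print(summary)
```

A low mean overlap alongside a large gap in mean tag counts would be one way to quantify how differently the two tagging approaches describe the same posting.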
Descriptive data showed that most of the studied cancer blog postings (80%) contained fewer than 500 words each. The mean number of blogger-defined tags per posting (M = 4.49) was significantly smaller than the mean number of TAPoR keywords per posting (M = 23.55). Both blogger-defined subject tags and software-generated subject tags were often overly broad or overly narrow in focus, producing ineffective search results for those seeking to extract information from cancer blogs.
Additional exploration into methods for systematically organizing cancer blog postings is necessary if blogs are to become stable and efficacious information resources for cancer patients, friends, families, or providers.