Vercruysse Steven, Kuiper Martin
Systems Biology group, Department of Biology, Norwegian University of Science and Technology, Høgskoleringen 5, 7491 Trondheim, Norway.
BMC Res Notes. 2012 Oct 30;5:601. doi: 10.1186/1756-0500-5-601.
Ideally, each life science article should have a 'structured digital abstract': a structured summary of the paper's findings that is both human-verified and machine-readable. However, articles can contain a wide variety of information types and contextual details, all of which must be reconciled with appropriate names, terms and identifiers. This poses a challenge to any curator. Current approaches mostly rely on tagging or on limited entry forms for semantic encoding.
We implemented a 'controlled language' as a more expressive representation method and studied how usable this format was for wet-lab biologists who volunteered as curators. We assessed several usability issues that arise when 'untrained' curators use ontologies and other controlled vocabularies to encode structured information. Taking a user-oriented viewpoint, we make recommendations that may prove useful for creating a better curation environment: one that can engage a large community of volunteer curators.
Entering information in a biocuration environment could become more expressive and user-friendly if curators were able to use synonymous and polysemous terms literally, with each term remaining linked to an identifier.
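As an illustration of this idea, here is a minimal sketch of a term index in which the curator's literal wording is stored alongside a concept identifier. It is not the system described in the article. The class names, the helper `build_example_index`, and the `EX:` identifiers are all hypothetical placeholders, and case-insensitive matching is a simplifying assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    identifier: str   # placeholder ID, not taken from any real vocabulary
    description: str


class TermIndex:
    """Maps surface terms to the concepts they may denote."""

    def __init__(self):
        self._by_term = defaultdict(list)

    def add(self, term, concept):
        # Case-insensitive matching is a simplification for this sketch.
        self._by_term[term.casefold()].append(concept)

    def lookup(self, term):
        # Synonyms return the same concept; polysemous terms return several.
        return list(self._by_term.get(term.casefold(), []))


def annotate(term, index, choice=0):
    """Keep the curator's literal term and attach the chosen identifier."""
    candidates = index.lookup(term)
    if not candidates:
        raise KeyError(f"no concept registered for term {term!r}")
    return (term, candidates[choice].identifier)


def build_example_index():
    # Hypothetical concepts, used only to show synonymy and polysemy.
    tumour_protein = Concept("EX:0001", "tumour suppressor protein")
    enzyme = Concept("EX:0002", "catalase enzyme")
    animal = Concept("EX:0003", "domestic cat")

    index = TermIndex()
    index.add("p53", tumour_protein)   # two synonyms, one identifier
    index.add("TP53", tumour_protein)
    index.add("CAT", enzyme)           # one term, two identifiers
    index.add("cat", animal)
    return index


if __name__ == "__main__":
    index = build_example_index()
    print(annotate("p53", index))
    print(annotate("TP53", index))
    print([c.description for c in index.lookup("cat")])
    print(annotate("CAT", index, choice=1))
```

In this sketch a polysemous term simply returns several candidates, and the curator's selection is passed in as `choice`. A real curation interface would presumably present those candidates for disambiguation while preserving the wording the curator typed.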