Rakedzon Tzipora, Segev Elad, Chapnik Noam, Yosef Roy, Baram-Tsabari Ayelet
Faculty of Education in Science and Technology, Technion - Israel Institute of Technology, Haifa, Israel.
Department of Humanities and Arts, Technion - Israel Institute of Technology, Haifa, Israel.
PLoS One. 2017 Aug 9;12(8):e0181742. doi: 10.1371/journal.pone.0181742. eCollection 2017.
Scientists are required to communicate science and research not only to other experts in their field, but also to scientists and experts from other fields, as well as to the public and policymakers. One fundamental suggestion when communicating with non-experts is to avoid professional jargon. However, because scientists are trained to speak in highly specialized language, they find it difficult to avoid jargon, and there is no standard to guide them in adjusting their messages. In this research project, we present the development of an up-to-date, scientist-friendly program for identifying jargon in popular written texts, and the validation of the data it produces. The program is based on a corpus of over 90 million words published on the BBC site during the years 2012-2015. Validation of the results produced by the jargon identifier, the De-jargonizer, involved three mini-studies: (1) comparison and correlation with existing frequency word lists in the literature; (2) a comparison with previous research on spoken language jargon use in TED transcripts of non-science lectures, TED transcripts of science lectures and transcripts of academic science lectures; and (3) a test of 5,000 pairs of published research abstracts and lay reader summaries describing the same article from the journals PLOS Computational Biology and PLOS Genetics. Validation procedures showed that the data classification of the De-jargonizer significantly correlates with existing frequency word lists, replicates the jargon differences found in previous studies on scientific versus general lectures, and identifies significant differences in jargon use between abstracts and lay summaries. As expected, more jargon was found in the academic abstracts than in the lay summaries; however, the percentage of jargon in the lay summaries exceeded the amount recommended for the public to understand the text.
Thus, the De-jargonizer can help scientists identify problematic jargon when communicating science to non-experts, and can be used by science communication instructors to evaluate the effectiveness and jargon use of participants in science communication workshops and programs.
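The frequency-based approach described in the abstract can be illustrated with a minimal sketch. This is not the De-jargonizer's actual implementation: the threshold values, the function names, and the toy corpus below are all assumptions chosen for illustration, whereas the real tool draws its counts from the BBC corpus of over 90 million words. The sketch only shows the general idea of labelling a word by how often it appears in a reference corpus and then reporting the share of rare words in a text.

```python
import re
from collections import Counter

# Hypothetical thresholds (assumptions for this toy example only;
# the abstract does not state the cut-offs the real tool uses).
HIGH_FREQ_MIN = 5   # at least this many occurrences: "common"
MID_FREQ_MIN = 2    # at least this many occurrences: "mid"; fewer: "jargon"


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def build_frequency_table(corpus_text):
    """Count how often each word occurs in the reference corpus."""
    return Counter(tokenize(corpus_text))


def classify_word(word, freq):
    """Label a word by its frequency in the reference corpus."""
    count = freq.get(word, 0)
    if count >= HIGH_FREQ_MIN:
        return "common"
    if count >= MID_FREQ_MIN:
        return "mid"
    return "jargon"


def jargon_percentage(text, freq):
    """Return the percentage of tokens in the text labelled as jargon."""
    words = tokenize(text)
    if not words:
        return 0.0
    rare = sum(1 for w in words if classify_word(w, freq) == "jargon")
    return 100.0 * rare / len(words)


# Toy stand-in for a large general-language corpus.
toy_corpus = "the cell " * 5 + "gene study " * 2 + "allele"
freq = build_frequency_table(toy_corpus)

for word in ["the", "gene", "allele", "epistasis"]:
    print(word, classify_word(word, freq))

print(jargon_percentage("the gene allele epistasis", freq))
```

Under these assumptions, the same `jargon_percentage` function could be applied to an abstract and to its lay summary to compare their jargon levels, which mirrors the third validation study in spirit only.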