Margolis J
Science. 1967 Mar 10;155(3767):1213-9. doi: 10.1126/science.155.3767.1213.
Evaluation by means of citation patterns can be successful only insofar as published papers and their bibliographies reflect scientific activity and nothing else. Such an innocent description is becoming less and less tenable. The present scientific explosion gave rise to more than a proportional publication explosion, which not only reflects the scientific explosion but has its own dynamics and vicious circles. Publication of results is probably the main means of accomplishing the almost impossible task of accounting for time and money spent on research. Inevitably, this puts a premium on quantity at the expense of quality, and, as with any other type of inflation, the problem worsens: the more papers are written, the less they count for and the greater is the pressure to publish more. What makes matters worse is the fact that the sheer volume of the "literature" makes it increasingly difficult to separate what is worthwhile from the rest. Critical reviews have become somewhat of a rarity, and editorial judgment is usually relegated to referees, who are contemporaries and, perhaps, competitors of the authors, a situation which has its own undesirable implications (11, 18). It requires little imagination to discover other vicious circles, all arising from distortion of the primary reasons for publishing the results of scientific inquiry.

There are, it is true, signs of adjustment to this crisis, partly due to some easing of the pressure to publish at all costs, and partly due to the readers' changing attitudes toward the flood of publications. An increasing amount of research is now being carried out in the form of collective projects in large institutions where publication is no longer the standard method of accounting for individual work.
At the same time there is apparent an increasing tendency for scientific journals to polarize into the relatively few leading ones which carry important information and the many subsidiary journals which serve as vehicles for interim local accounting and, in a way, substitute for detailed intradepartmental reports. This division is a result not of some arbitrary decree but of normal competition between journals, as a result of which, however, the strong usually get stronger and the weak get weaker. Were it not for these changes and also for a striking improvement in abstracting, indexing, and alerting services, most research workers would have found long ago that, even in their own specialized fields, new information is accumulating faster than it can be sorted out. These developments can provide only a temporary reprieve, so long as there remains a strong incentive to publish the greatest possible number of papers.

A new scale of values based on citations is by no means infallible or, in many cases, even fair, but at least it provides an alternative to the existing one, which is at the root of the crisis. It might, of course, be asked whether wide acceptance of such new standards would not lead to deliberate abuses. A little reflection shows that the system is less open to manipulation than might appear. First, the referees are expected to see to it that the submitted papers cite work which is pertinent to the subject. An increased awareness of the usefulness of citation indexing as a tool for retrieval and evaluation will make this aspect of refereeing more important, and what now passes for minor carelessness or discourtesy could easily come to be regarded as serious malpractice. Second, as noted above, careful selection of references is in the author's own interest, because it helps him to reach his readers. There is, therefore, some room for hope that healthy feedback in the system will tend to keep it viable.
At the basis of this hope lies the supposition that, in the long run, only good work can ensure recognition. As Martyn (2) has pointed out, as an information-retrieval method, citation indexing is rather "noisy." The word noisy may apply even more to the problem of evaluation. Whereas in information retrieval much of the unwanted information can be filtered out by suitable search strategy (2, 6), this is not so easy to do for the purpose of evaluation, because a simple descendence relationship between papers is still an ideal far removed from actuality (7). The situation would be much better if we could at will exclude all citations which do not indicate real indebtedness. A scheme of citation relationship indicators, first mentioned by Garfield (12) and elaborated by Lipetz (17), would be a help, but, even if it were technically feasible, to provide such indicators would greatly add to the production costs of the Index.

Another possible way to minimize the effects of "noise" is to increase the size of the samples on which the reckoning is based. Now that research has become a rather popular occupation, it seems that a kind of public vote may have to be accepted as a factor in evaluation. Since this is the case, there is something to be said for extending the "franchise" to minimize accidental effects. An index which attempted to process all scientific publications would be several times the size of the present Index, and, what is more, it would not necessarily be an improvement as a tool for information retrieval, because most of the significant work is already concentrated in the present Index. Whether this attempt will ever be considered worthwhile remains primarily a matter of policy and economics. In the meantime there is an urgent need for more experience with the existing services.
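The effect of excluding citations that do not indicate real indebtedness can be made concrete with a small sketch. The toy example below is not from the article: the papers, the "builds-on" and "perfunctory" relationship tags, and the `citation_counts` helper are all hypothetical, loosely in the spirit of the indicator schemes attributed above to Garfield and Lipetz.

```python
# Hypothetical sketch: a tiny citation list in which each citation carries a
# relationship tag, so that "noise" citations can be excluded at will.
# Each entry: (citing paper, cited paper, relationship tag).
citations = [
    ("A", "X", "builds-on"),
    ("B", "X", "builds-on"),
    ("C", "Y", "builds-on"),
    ("D", "Y", "perfunctory"),  # courtesy citation, no real indebtedness
    ("E", "Y", "perfunctory"),
]

def citation_counts(cites, substantive_only=False):
    """Count citations received by each paper; optionally keep only those
    tagged as indicating real indebtedness."""
    counts = {}
    for citing, cited, tag in cites:
        if substantive_only and tag != "builds-on":
            continue
        counts[cited] = counts.get(cited, 0) + 1
    return counts

print(citation_counts(citations))                         # {'X': 2, 'Y': 3}
print(citation_counts(citations, substantive_only=True))  # {'X': 2, 'Y': 1}
```

Note that the filtering reverses the ranking of the two cited papers: Y leads on raw counts but X leads once perfunctory citations are removed, which is precisely why such indicators would matter more for evaluation than for mere retrieval.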
It is not the purpose of this article to advocate evaluation of scientific work by some kind of public opinion poll; its purpose is to recognize a possible trend in this direction. Any judgment by public acclaim is subject to obvious fallacies, but we must not be carried away by the analogy to the Stock Exchange or to electoral practices. The fact that, in this case, the "public" consists of authors whose contributions are generally linked creates quite a new pattern of organization. In this discussion some of the aspects of this pattern have been explored through analogy to idealized genetic or mechanical network models, but the very uniqueness of the system, with its many self-organizing ramifications, makes it a new field which deserves close study, since these developments may have profound effects on the future of scientific communication.