Burnside J, Craig P N, Guthrie G T
J Chem Inf Comput Sci. 1984 Feb;24(1):39-41. doi: 10.1021/ci00041a008.
An algorithm is described that has been designed to sort medium-size lists of chemical names, including common, generic, trivial, and systematic names and code numbers, into a logical sequence. It successfully sorted more than 99.5% of 3767 names in its first application. Minor revisions then resulted in more than 99.9% success with the same set of names. The algorithm generates an 80-character primary sort key (alphabetic characters only) and a 16-character secondary level sort key (alphanumeric characters). These sort keys are generated de novo from the name as needed and, thus, do not require increased permanent-storage costs. Sorting on the primary sort key (and secondary sort keys when identical primary keys exist) results in logical sequences of chemical names.
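The two-level keying described in the abstract can be sketched as follows. This is a minimal illustration, not the published algorithm: the abstract gives only the key widths (80 and 16 characters) and their character classes, so the extraction rules here (keep letters for the primary key, keep letters and digits for the secondary key, upper-case, truncate, pad with spaces) are assumptions, and the function names `primary_key`, `secondary_key`, and `sort_names` are hypothetical.

```python
PRIMARY_LEN = 80    # primary key width stated in the abstract
SECONDARY_LEN = 16  # secondary key width stated in the abstract


def primary_key(name: str) -> str:
    """Assumed rule: ASCII letters only, upper-cased, fixed at 80 characters."""
    letters = "".join(ch for ch in name if ch.isascii() and ch.isalpha())
    return letters.upper()[:PRIMARY_LEN].ljust(PRIMARY_LEN)


def secondary_key(name: str) -> str:
    """Assumed rule: ASCII letters and digits, upper-cased, fixed at 16 characters."""
    alnum = "".join(ch for ch in name if ch.isascii() and ch.isalnum())
    return alnum.upper()[:SECONDARY_LEN].ljust(SECONDARY_LEN)


def sort_names(names):
    """Sort on the primary key; the secondary key only breaks primary-key ties.

    Keys are computed on the fly inside the sort and are never stored,
    mirroring the abstract's point about avoiding permanent-storage costs.
    """
    return sorted(names, key=lambda n: (primary_key(n), secondary_key(n)))


if __name__ == "__main__":
    sample = ["2-butanol", "acetone", "1-butanol", "Benzene", "NSC 1234"]
    for name in sort_names(sample):
        print(name)
```

In this sketch, "1-butanol" and "2-butanol" share the primary key BUTANOL, so their order is decided by the secondary keys 1BUTANOL and 2BUTANOL. The published algorithm presumably applies more chemistry-aware rules to reach its reported success rate; none of those rules are reproduced here.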