The algorithm presented here can be used for the canonical numbering of all kinds of discrete structures. The application to chemistry is clear: molecules have been described with the aid of graph theory, which deals with discrete structures, since the middle of the 19th century.
But given the increasing amount of information available about chemical structures, we need powerful and, most of all, flexible algorithms, so that the concept does not have to change just because another piece of information describing another special feature of the structure has to be added. You could even apply this algorithm to molecules that are no longer described by the graph-theoretical model of molecules, with all its well-known limitations, but by other (discrete) representations suitable for your purposes.
The canonical numbering of molecules is essential for accessing structural information in chemical databases. Its advantage is that comparing two canonically numbered structures takes only time linear in the number of discretisation points (for example, atoms) that represent the structure. You need not try all possible assignments of the points of one structure to those of the other.
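The linear-time comparison can be sketched as follows. This is only an illustration under stated assumptions: the representation (a list of element symbols indexed by canonical number, plus a bond list already stored in a fixed order), the function name `same_structure`, and the example numberings are hypothetical and are not the output of the algorithm presented here.

```python
from typing import List, Tuple

# Hypothetical representation of a canonically numbered molecule:
#   atoms[i] is the element symbol of the atom with canonical number i
#   bonds is a list of (i, j, order) with i < j, stored in a fixed order
Bond = Tuple[int, int, int]
Molecule = Tuple[List[str], List[Bond]]


def same_structure(a: Molecule, b: Molecule) -> bool:
    """Compare two canonically numbered molecules in one linear pass."""
    atoms_a, bonds_a = a
    atoms_b, bonds_b = b
    if len(atoms_a) != len(atoms_b) or len(bonds_a) != len(bonds_b):
        return False
    # Position i in one molecule can only correspond to position i in the
    # other, so no permutation of atoms has to be tried.
    for x, y in zip(atoms_a, atoms_b):
        if x != y:
            return False
    for p, q in zip(bonds_a, bonds_b):
        if p != q:
            return False
    return True


# Illustrative numberings (assumed, hydrogens omitted).
ethanol = (["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
ethanol_again = (["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
dimethyl_ether = (["C", "O", "C"], [(0, 1, 1), (1, 2, 1)])

print(same_structure(ethanol, ethanol_again))
print(same_structure(ethanol, dimethyl_ether))
```

Without a canonical numbering, the same test would in the worst case have to consider every assignment of the atoms of one molecule to the atoms of the other.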
One step beyond canonical numbering is the possibility of assigning a linear code, comparable to an individual name, to the molecule. That is, once you have the canonical numbering of the molecule, it is possible to compute a unique value that describes this structure. Searching in a database is then much easier, because you just have to look for this unique value, for example by using a hash access method, which takes roughly constant time and requires no atom-by-atom comparison. Efforts in this field were made by Ihlenfeldt and Gasteiger [1].
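A minimal sketch of the idea of a linear code used as a hash key is given below. The encoding in `linear_code`, the dictionary `database`, and the example numberings are assumptions made for illustration; they are not the hash code scheme of the work cited above, nor the code produced by the algorithm presented here.

```python
from typing import Dict, List, Optional, Tuple

Bond = Tuple[int, int, int]  # (i, j, order) with canonical numbers i < j


def linear_code(atoms: List[str], bonds: List[Bond]) -> str:
    """Build a string from a canonically numbered molecule (hypothetical format)."""
    atom_part = ".".join(atoms)
    bond_part = ".".join(f"{i}-{j}:{order}" for i, j, order in bonds)
    return atom_part + "|" + bond_part


# A Python dict is a hash table, so a lookup by linear code needs
# no atom-by-atom comparison against every stored structure.
database: Dict[str, str] = {}
database[linear_code(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])] = "ethanol"
database[linear_code(["C", "O", "C"], [(0, 1, 1), (1, 2, 1)])] = "dimethyl ether"


def find(atoms: List[str], bonds: List[Bond]) -> Optional[str]:
    """Return the stored name for a structure, or None if it is unknown."""
    return database.get(linear_code(atoms, bonds))


print(find(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)]))
print(find(["C", "C", "N"], [(0, 1, 1), (1, 2, 1)]))
```

The lookup is only meaningful if both the stored structure and the query were numbered canonically beforehand; otherwise the same molecule could yield different codes.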
1: W. D. Ihlenfeldt, J. Gasteiger: "Der Einsatz von Hashcodes zur Erkennung der strukturellen Aehnlichkeit von Molekuelen", Software Entwicklung in der Chemie, Springer Verlag Berlin/Heidelberg 1988
© Chr. Benecke, Oct. 1995