Electronic International Standard Serial Number (EISSN)
1467-8640
abstract
Noncryptographic hash functions have an immense number of important practical applications owing to their powerful search properties. However, those properties critically depend on good designs: Inappropriately chosen hash functions are a very common source of performance losses. On the other hand, hash functions are difficult to design: They are extremely nonlinear and counterintuitive, and relationships between the variables are often intricate and obscure. In this work, we demonstrate the utility of genetic programming (GP) and avalanche effect to automatically generate noncryptographic hashes that can compete with state-of-the-art hash functions. We describe the design and implementation of our system, called GP-hash, and its fitness function, based on avalanche properties. Also, we experimentally identify good terminal and function sets and parameters for this task, providing interesting information for future research in this topic. Using GP-hash, we were able to generate two different families of noncryptographic hashes. These hashes are able to compete with a selection of the most important functions of the hashing literature, most of them widely used in the industry and created by world-class hashing experts with years of experience.