Cryptographic Hash Functions in Password Storage
In this blog post I will define a cryptographic hash function and discuss how it is widely used in password storage and account security.
A cryptographic hash function is a powerful tool that uses high-level mathematics to process data, such as a password, keeping the input secret while the output is used for authentication. Interestingly, most common cryptographic hash functions are publicly documented. The SHA-1 algorithm, for example, which was once widely used in web security, has a published specification. Even with this detailed information spelling out exactly how the data is processed, the input remains hidden! It’s like being handed a treasure map and still never being able to find the treasure.
Let’s begin with a breakdown of the term:
Cryptographic
Encryption is a common way to hide data while it is en route to its destination. Because standard encryption is a two-way process, the output can always be reversed with a key, which allows the intended reader to decrypt the message and see the original input. The obvious security issue is that anyone who obtains the key or works out the technique can read the message too, as the Germans found during WWII when Alan Turing and his fellow codebreakers worked out their encryption techniques in order to read the original messages!
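To make the "two-way" idea concrete, here is a minimal sketch using a toy XOR cipher. It is only an illustration and not a real encryption scheme; the function name xor_bytes and the sample message are made up for this example.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the matching byte of key (toy cipher, not for real use)."""
    return bytes(d ^ k for d, k in zip(data, key))

message = b"attack at dawn"
key = secrets.token_bytes(len(message))  # random key, as long as the message

ciphertext = xor_bytes(message, key)    # "encrypt" with the key
recovered = xor_bytes(ciphertext, key)  # "decrypt" by applying the same key again

print(recovered == message)
```

Anyone holding the key can run the second step, which is exactly the property a one-way hash is designed not to have.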
Hash
A hash converts variable-size inputs into constant-size outputs in a deterministic way: the output is always the same for identical inputs. Because the output size is fixed, storing and comparing hashes is fast and predictable. A 256-bit hash output is always 256 bits long, however large the input was. A hash might be used to make sure a document hasn’t changed during a transfer, acting like a fingerprint for a specific data set.
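Both properties, determinism and fixed output size, are easy to check with Python's standard hashlib module. This is a small sketch that assumes SHA-256 as the 256-bit hash; the helper name sha256_hex and the sample inputs are invented for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

short_input = b"hi"
long_input = b"a much longer input " * 1000

# Deterministic: hashing the same input twice gives the same output.
same_each_time = sha256_hex(short_input) == sha256_hex(short_input)

# Fixed size: 256 bits = 32 bytes = 64 hex characters, whatever the input length.
short_len = len(sha256_hex(short_input))
long_len = len(sha256_hex(long_input))

print(same_each_time, short_len, long_len)
```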
Function
A function is merely a bit of code that can be run.
All together now!
Cryptographic hash functions take a variable-sized input, process it using a complex one-way algorithm, and deliver a fixed-size output. They are integral to most password security systems on the web.
Requirements of a Cryptographic Hash Function:
1. Preimage resistance: Given only the output and the algorithm, you should not be able to find an input that produces that output. This is achieved with a one-way function. If this is confusing, consider the modulo function. Given an input (let’s say 10) and a function (% 2), our output is 0 (10 divided by 2 leaves a remainder of 0). If someone gave you the output 0 and the function % 2, you would never be able to determine that the input was 10. It could be any even number. The analogy only goes so far, since with % 2 it is easy to find some input that gives 0; a cryptographic hash function makes finding any matching input computationally infeasible. It is also worth mentioning that many steps of a hashing algorithm discard bits of data, so by the end you don’t even have access to the original building blocks.
2. Second preimage resistance: Given an input, you should not be able to find a second input that delivers the same output. A related property is the avalanche effect, which says that even a minor change in input produces an entirely different output. Changing the capitalization of a single letter, for example, yields a completely different hash.
3. Collision resistance: This is the same concept as second preimage resistance, but refers to an attacker who is free to choose both inputs. You should not be able to find any two inputs that deliver the same output.
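The avalanche effect mentioned in the list above can be observed directly. The sketch below assumes SHA-256 and two made-up sample inputs that differ only in the capitalization of one letter; the helper names are hypothetical. For a well-behaved hash, roughly half of the 256 output bits should differ.

```python
import hashlib

def sha256_int(text: str) -> int:
    """Hash text with SHA-256 and return the digest as an integer."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")

def differing_bits(a: str, b: str) -> int:
    """Count how many of the 256 output bits differ between the two hashes."""
    return bin(sha256_int(a) ^ sha256_int(b)).count("1")

original = "password"
altered = "Password"  # only the capitalization of one letter changes

print(hashlib.sha256(original.encode("utf-8")).hexdigest())
print(hashlib.sha256(altered.encode("utf-8")).hexdigest())
print(differing_bits(original, altered), "of 256 bits differ")
```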
Salt
Now that we have a way to hash our password, the final security piece is the salt. A salt is a randomly generated string, unique to each password, that is added to the input value before it is sent through the algorithm and is stored alongside the resulting hash. Salting a hash input protects the input data from an attacker with a rainbow table. A rainbow table attack is a common approach that uses a precomputed table made up of the hashes of popular passwords and common words and numbers. Because every salt would need its own table, salting makes rainbow tables prohibitively time- and energy-consuming for attackers.
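Here is a minimal sketch of salting, assuming SHA-256 and a salt that is simply prepended to the password. The function names (new_salt, salted_hash, verify) and the sample password are invented for this example. Real systems typically use a deliberately slow, purpose-built password hashing function rather than a single SHA-256 pass, so treat this as an illustration of the idea only.

```python
import hashlib
import hmac
import secrets

def new_salt() -> bytes:
    """Generate a fresh 16-byte random salt for one password."""
    return secrets.token_bytes(16)

def salted_hash(password: str, salt: bytes) -> bytes:
    """Prepend the salt to the password and hash the result with SHA-256."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()

def verify(password: str, salt: bytes, stored_hash: bytes) -> bool:
    """Re-hash a login attempt with the stored salt and compare in constant time."""
    return hmac.compare_digest(salted_hash(password, salt), stored_hash)

# Two users pick the same password but receive different salts.
salt_a, salt_b = new_salt(), new_salt()
hash_a = salted_hash("hunter2", salt_a)
hash_b = salted_hash("hunter2", salt_b)

print(hash_a != hash_b)                   # same password, different stored hashes
print(verify("hunter2", salt_a, hash_a))  # correct password is accepted
print(verify("Hunter2", salt_a, hash_a))  # wrong password is rejected
```

Because the two stored hashes differ even though the passwords match, a single precomputed table cannot cover both accounts.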
Evolving Cryptosecurity
While it is highly unlikely (“computationally infeasible” in the language of crypto professionals), breaking a cryptographic hash function is never impossible. Attackers are constantly at work trying to do so and reveal sensitive data, while security companies and others are at work building defenses against these attacks. The National Institute of Standards and Technology held a public competition for the development of the SHA-3 algorithm. A winner was selected in 2012, and SHA-3 became a published standard in 2015. SHA-3 is just one example of the many cryptographic hash functions in use. Crypto security is a constantly evolving field. Security methods must keep pace with the improving abilities of attackers and ever-increasing computational capacity.