Proper disclosure: I am by no means an expert in cryptography (math), and the following is just an idea I had after reading the article about how crackers make mincemeat of our passwords.

## The Problem

Password hashes, strong as they may seem today, are becoming weaker every day. When I hear about numbers like 2 billion password hashes per second, I'm rather amazed at how far technology has advanced. The thing is, today it may be 2 billion, but sooner or later it'll be 10, 20, 100, or more, and sooner than we expect. Inspired a bit by Bitcoin nonces, I had this idea, which may or may not be good; I'm not here to judge. As said, I'm not an expert in cryptography.

I feel rather humble after reading that article. I know that cryptography is math, not engineering. However, with my experience in the computing field, it seems you sometimes need a touch of engineering to adapt things to the real world. I bet no one in a university cryptography course really teaches you how to implement password crackers on GPUs, or do they?

## Abstract

The problem with password hashing today is that we rely entirely on the strength of the hash. By that logic, SHA2048 > SHA1024 > SHA512 > SHA256; you get the point. With the GPU technology available today, we can either keep scaling up the hashing algorithm, or we can simply come up with a method that involves "more work". I want to introduce a method which may help secure the storage of passwords.

Inspired by the Bitcoin proof of work, adding nonces every time to find new hashes, I thought about the following (abstract) algorithm to hash a password:

```
// enough_cpu_ticks_used = A_LARGE_ADJUSTED_NUMBER, tuned to current hardware

hash_password(plain_password, hashed_password, cpu_ticks_used)
    if (enough_cpu_ticks_used <= cpu_ticks_used)
        return hashed_password
    else
        return hash_password(
            plain_password,
            HASH(hashed_password + plain_password),
            cpu_ticks_used + cpu_ticks_used_for_hashing)

// Run with:
my_password = hash_password("P4ssw0rd", "", 0)
```

What I'm trying to do here is basically run a password through a hashing function enough times, until the amount of work is satisfactory for us (in terms of the CPU/GPU power available nowadays). So as CPUs/GPUs become more powerful, **enough_cpu_ticks_used** should increase.
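A minimal sketch of the idea in Python, with an explicit iteration count standing in for the CPU-tick budget (the function name and the choice of SHA-256 are mine, not part of any standard):

```python
import hashlib

def hash_password(plain_password: str, iterations: int) -> str:
    """Iteratively re-hash the password, feeding each digest back in,
    mirroring the recursive pseudocode above (an iteration count stands
    in for the CPU-tick budget, which is easier to reason about)."""
    hashed = ""
    for _ in range(iterations):
        hashed = hashlib.sha256((hashed + plain_password).encode()).hexdigest()
    return hashed

# A real deployment would tune the count to current hardware speeds.
digest = hash_password("P4ssw0rd", 100_000)
```

Note that with one iteration this degenerates to a single plain SHA-256 of the password, which is exactly the weak baseline the scheme is trying to move away from.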

## Verifying Passwords

That’s the rather tricky part. As said before, this is just food for thought. When verifying a password, obviously exactly the same computation has to be done. If you have passwords of different *strength* (different **cpu_ticks_used**), then to verify one you have to invest enough time to reach its hash, which obviously increases the load on machines implementing this scheme.
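One way to make verification workable is to store the per-password work factor next to each hash, so the verifier repeats exactly the same amount of work. A hedged sketch (all names hypothetical, SHA-256 chosen arbitrarily):

```python
import hashlib
import hmac

def stretch(password: str, iterations: int) -> str:
    # Same iterated construction as in the abstract algorithm above.
    digest = ""
    for _ in range(iterations):
        digest = hashlib.sha256((digest + password).encode()).hexdigest()
    return digest

def verify(password: str, stored_digest: str, stored_iterations: int) -> bool:
    # Re-derive with the same per-record work factor and compare in
    # constant time to avoid leaking information through timing.
    candidate = stretch(password, stored_iterations)
    return hmac.compare_digest(candidate, stored_digest)

# A "record" as it might sit in the password database: (digest, work factor).
record = (stretch("P4ssw0rd", 50_000), 50_000)
```

Storing the iteration count in the clear is what "Option 2" below amounts to; the security then rests on the work itself, not on hiding the count.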

## Problems

Probably many!!

Things I can’t answer:

- Can running SHA/MD5 over and over weaken the final hash?
- If the client has to authenticate (calculate the hash), how deep of a hash does he need to calculate?
- If a stored password is “too weak” for the current day, you cannot rebuild it, since you’d need the plaintext; what do you do then?
- No idea; help me list more limitations, or perhaps disprove all of this?

At the end of the day it wouldn’t protect us from the simplest and most stupid password breaches, such as dictionary attacks. What it does counter is crackers’ ability to run gazillions of hashes per second against a compromised password database.

I don’t claim to have all the answers; this is just something I’ve had in the back of my head for a while and wanted to share. Feel free to bash me if it’s a stupid idea.

Errr…. what about SHA-3?

https://en.wikipedia.org/wiki/SHA-3

On October 2, 2012, Keccak was selected as the winner of the NIST hash function competition. SHA-3 is not meant to replace SHA-2, as no significant attack on SHA-2 has been demonstrated. Because of the successful attacks on MD5, SHA-0 and theoretical attacks on SHA-1, NIST perceived a need for an alternative, dissimilar cryptographic hash, which became SHA-3.

First: What Tsafrir said.

Answers to your questions:

a. Can running SHA/MD5 over and over weaken the final hash?

+

b. If the client has to authenticate (calculate the hash), how deep of a hash does he need to calculate?

Running hashing algorithm H over and over again usually means running algorithm H with more rounds. The algorithm you call “SHA” is basically a relatively simple round function executed X times. What you are essentially asking is: if I run algorithm H X*Y times instead of just X times, will it be harder to break?

I don’t know of such results, but it can be argued both ways. Keep in mind that MD5 and SHA are two very different animals: MD5 can be broken by searching through the whole space. A single round is very simple (https://en.wikipedia.org/wiki/MD5#Algorithm); what you call MD5 is pretty much 64 rounds of this operation, and its result is 128 bits of info. In addition, there are cryptanalytic results that can bring down the amount of computation required by orders of magnitude. As per your argument, why not run it with a variable number of rounds?

Off the top of my head, there might be an issue here. When the number of rounds changes across a hash family (SHA-256/512), the constants in the computation change too. There could be an intrinsic relation between them.

Also, the answer depends on how you use the variation in the number of rounds.

Option 1: You don’t. In this model, verifying the hash requires an unknown number of rounds. Bad point: this takes the verifier more time (which can be a significant setback), and would not be consistent; some hashes would take longer to verify than others. Strong point: attackers need more resources. But how many more? You would need to define some maximum number of rounds, and then the question is: how many orders of magnitude does that max add to the search space? If it’s less than 2^32, you probably haven’t made things that much harder for the attacker. Consider that an off-the-shelf GPU can crack most MD5s in a few minutes. If a really good GPU costs $200 off the shelf, you can easily afford a few million of them, and let’s say you get 50% off for being such a nice customer: a few hundred million dollars isn’t that much. Given the 1440 minutes in a day, you’ll have cracked the variable-round MD5 in a matter of days.

Also, it can be argued that you’re kind of violating Kerckhoffs’s principle: you’re trying to gain security through obscurity by not revealing how many rounds you’ve used.

Option 2: You include the number of rounds in the hash. This only makes the hash take a few more hours to crack, since the search space is just a bit larger. Also, if you assume Moore’s law, this time is halved every 18 months.

SHA-2 (SHA-256) is a different story. It’s 256 bits, 64 rounds, and as you can see (https://en.wikipedia.org/wiki/SHA-2#Hash_function) it is more complex. MD5 you can break in a few minutes. SHA-256, as far as is publicly known, cannot be broken even with the collective power of all CPUs ever created. And SHA-512, which has different constants and 80 rounds but is otherwise the same as SHA-256, would require more energy than is currently estimated to exist in the known universe.

c. If a stored password is “too weak” for the current day, you cannot rebuild it, since you’d need the plaintext; what do you do then?

What do you mean by “rebuild the password”? The plaintext is the input to the hashing algorithm; it is a constant value once given. You can alternatively run a few more rounds over the stored hash and, depending on option 1 or 2, update that info.

d. No idea; help me list more limitations, or perhaps disprove all of this?

Continue to use SHA-2; it’s fast, and implemented in hardware on some modern CPUs. The best known attack covers 41 of the 64 rounds. If you’re paranoid, use SHA-512 (or SHA-3).

My understanding is that you are correct: multiple applications of the same hash function can make the computation arbitrarily long. Your pseudo-code looks similar to how PBKDF2 is defined, which is an industry standard used in many real-world applications.

http://en.wikipedia.org/wiki/PBKDF2

For example, their U_2 uses U_1 as salt, much like in your algorithm. Note that regular use just sets a constant number of iterations, which solves most of the problems you came up with.
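For reference, Python’s standard library exposes PBKDF2 directly via `hashlib.pbkdf2_hmac`; a minimal sketch (the iteration count here is illustrative, not a recommendation):

```python
import hashlib
import os

# PBKDF2 as exposed by the standard library: the iteration count is the
# explicit work factor, and a random per-password salt defeats
# precomputed (rainbow) tables.
salt = os.urandom(16)
key = hashlib.pbkdf2_hmac("sha256", b"P4ssw0rd", salt, 600_000)

# Store (salt, iteration count, key) together; verification reruns the
# same derivation and compares the results.
print(key.hex())
```

The stored iteration count is exactly the “Option 2” approach discussed above: the number is public, and the security comes from the work itself.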

The Ars Technica article is scary, but note that the passwords there were hashed with a single application of MD5 and no salt, which is essentially the worst possible scheme.

Hey guys, thanks for the replies. As said, I’m not a crypto expert, very far from it. I’m probably also not up to date with what’s out there.

Amir, the problem (and proposed solution) as I tried to present it was not to gain security through obscurity, but by significantly increasing the amount of work required to brute-force a password, relative to the computational power available in the world. In fact, it probably wouldn’t matter if you listed next to each password how many rounds it took, as long as you dynamically maintain that number (and let it grow as technology advances).

The same arguments you’re raising about SHA-2 were made about MD5 back in the 1990s; it even(!!!) replaced the mighty DES on most Unix systems. But where do we stand today?

I didn’t come here with a complete solution, though. There are many problems still to be discovered. The more discussion, the better.

Even SHA-224 is a long way from being brute-forced. A brute-force attack on a hash function with an address-space size of 2^2n (output length 2n) would roughly take 2^n hashing attempts. You mentioned 100 billion (American, right?) a second. That would be 100*10^9 ≈ 100*2^30 = 2^37.

There are roughly 30 million ≈ 2^25 seconds in a year. So we have 2^(37+25) = 2^62 hashing attempts a year.

Thus it would take some 2^(80-62) = 2^18 ≈ 260,000 years at that rate to break SHA-1 (getting close). But it would take some 2^(112-62) = 2^50 ≈ 10^15 years to break SHA-224.

Note: those calculations assume that there’s no attack more efficient than brute force.
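For what it’s worth, the back-of-envelope arithmetic above can be checked directly (same rounded figures, so the results are order-of-magnitude only):

```python
import math

# Reproducing the estimate: 100 billion attempts/second, ~30M seconds/year.
attempts_per_second = 100 * 10**9                 # ~2^37
seconds_per_year = 30 * 10**6                     # ~2^25
attempts_per_year = attempts_per_second * seconds_per_year   # ~2^62
print(round(math.log2(attempts_per_year), 1))     # ≈ 61.4

# Birthday-bound cost: 2^80 attempts for SHA-1 (160-bit output),
# 2^112 for SHA-224 (224-bit output).
years_sha1 = 2**80 / attempts_per_year            # a few hundred thousand years
years_sha224 = 2**112 / attempts_per_year         # on the order of 10^15 years
```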

And if that’s not good enough: add a few more bits. Every bit added doubles the number of hashing attempts required in your scheme, and that holds even assuming that adding rounds increases the security factor the way you expect it to.

How do you measure the ticks used? If the machine is really busy, does that mean the password becomes weaker?

bcrypt is pretty widely implemented and takes a cost parameter under which the work grows exponentially (each increment doubles it).

http://en.wikipedia.org/wiki/Bcrypt

The trick is to actually know what to set the cost to, when to increase it, and what to do with stale entries that have a lower cost.
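Picking that cost is usually a calibration exercise: time one hash and bump the work factor until it takes as long as you can tolerate per login. A sketch using stdlib PBKDF2 as a stand-in (bcrypt itself is a third-party module, but the tuning logic is the same):

```python
import hashlib
import os
import time

def calibrate_iterations(target_seconds: float = 0.1) -> int:
    """Double the iteration count until one derivation takes roughly
    target_seconds on this machine -- the same exercise as choosing
    bcrypt's cost, where each +1 to the cost doubles the work."""
    iterations = 10_000
    salt = os.urandom(16)
    while True:
        start = time.perf_counter()
        hashlib.pbkdf2_hmac("sha256", b"benchmark", salt, iterations)
        elapsed = time.perf_counter() - start
        if elapsed >= target_seconds:
            return iterations
        iterations *= 2  # like bumping bcrypt's cost parameter by one

# Run once at deployment (or periodically) and store the chosen value;
# stale records with a lower cost can be re-stretched on next login.
```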

Check out bcrypt.