Hash of unique value = unique hash?

In general, the answer is no. This is trivial to show: SHA-1 has 2^160 different outputs - 160 bits, but there are many more inputs that that (e.g. , there are 2^320 different 40-byte strings, and they can't all map to a unique output) Given a sufficient subset of values, the answer is maybe.It depends on the exact algorithm and the size of the subset: if the number of possible inputs is smaller than the number of possible outputs, then it is possible (but NOT guaranteed). When thinking about this, it may be helpful to keep the birthday paradox in mind: the probability of a collision does not increase linearly with the number of inputs.

In general, the answer is no. This is trivial to show: SHA-1 has 2^160 different outputs - 160 bits, but there are many more inputs that that (e.g. , there are 2^320 different 40-byte strings, and they can't all map to a unique output). Given a sufficient subset of values, the answer is maybe.

It depends on the exact algorithm and the size of the subset: if the number of possible inputs is smaller than the number of possible outputs, then it is possible (but NOT guaranteed). When thinking about this, it may be helpful to keep the birthday paradox in mind: the probability of a collision does not increase linearly with the number of inputs.

Thanks. So the only way to get unique values is to generate and scan through the DB to check if it exists (repeat if yes)? That's pretty much what I was trying to avoid doing here, but I guess it's the only way.

– Nebs May 4 '10 at 19:11 Unfortunately, there's no other way if you want to guarantee a unique value. This is also why you can't easily reverse a hash: I can tell you that the SHA-1 hash for "1" is "356a192b7913b04c54574d18c28d46e6395428ab", but there are many other values that will generate that hash. – Michael Madsen May 4 '10 at 19:14 I see.

The thing is I probably won't need to generate more than say 1000 unique values. Would it be safe to say that all the values will be unique in this case? – Nebs May 4 '10 at 19:16 1 No.

Even with just two values, there is still a 1 in 2^160 chance that they hash to the same value. That may seem infinitely small, but it's still bigger than 0, which means that, although it is likely that your 1000 values all hash to different values, it cannot be guaranteed without testing. If you know the possible values in advance, you can calculate the hash for all of them, put them in a list, and sort it by the hash - that will make it easy to find duplicates.

– Michael Madsen May 4 '10 at 19:23 You're right, thanks. I ended up going with uniqid as per RenderIn's suggestion. – Nebs May 4 '10 at 20:27.

There is a small possibility that two different values give the same hash. Although very small, it's not unlikely.

It depends on the hashing algorithm. But theoretically, unless the hash is exactly the same as the original string there is a potential for the hash to not be unique. A hash of a value is a condensed representation of the original value.By removing pieces of information to create the hash you are losing parts of what make it unique in the domain and therefore increasing the probability that the value will not be unique.

The only way to guarantee that it will be unique is to use the original value itself which defeats the purpose of hashing.

The only way to guarantee that it will be unique is to use the original value itself" - clearly not true! – Martin Smith May 4 '10 at 19:27 @Martin - Please explain what you mean, why is my statement untrue? – Andrew Hare May 4 '10 at 19:33 The hash would clearly not have to be the original string for uniqueness to be guaranteed.It could be the original string ROT13 ed for example and still be unique.

– Martin Smith May 4 '10 at 19:54 c.f. Perfect Hash Functions en.wikipedia.Org/wiki/Perfect_hash_function – Martin Smith May 4 '10 at 20:08.

There are 2^320 different 40-byte strings, and they can't all map to a unique output). Given a sufficient subset of values, the answer is maybe. It depends on the exact algorithm and the size of the subset: if the number of possible inputs is smaller than the number of possible outputs, then it is possible (but NOT guaranteed).

When thinking about this, it may be helpful to keep the birthday paradox in mind: the probability of a collision does not increase linearly with the number of inputs.

I cant really gove you an answer,but what I can give you is a way to a solution, that is you have to find the anglde that you relate to or peaks your interest. A good paper is one that people get drawn into because it reaches them ln some way.As for me WW11 to me, I think of the holocaust and the effect it had on the survivors, their families and those who stood by and did nothing until it was too late.

Related Questions