Disserting about colliding GUIDs and the Big Bang Theory...
Can you generate two identical GUIDs? Would the world end if two GUIDs collide? How long does it take to generate a duplicate GUID and would we be still here when the result is found?
This is going to be a bit technical, or nerdy I should admit.
<enter mode="nerd">
We had a “confrontation” in my team whether two same GUIDs may ever exist. A GUID is a 128-bit long number, represented as a 32-character string, used for identifying resources in a unique way, potentially across the entire globe!
The initial specification of the GUID format used the network card MAC address of the computer where the application is running for generating a truly unique GUID (MAC addresses are unique). This peculiarity was causing some privacy concerns, as a GUID could be tracked to the computer that generated it, thus allowing to locate the person who created it. There is a positive side of this story, though, as this privacy hole allowed to track the creator of the Melissa virus.
The version currently used for generating GUIDs replaces the MAC address with a pseudo-random number. Cryptanalysis of the GUID generator algorithm shows that, since the sequence of GUIDs is pseudo-random, given full knowledge of the internal state, it is possible to predict previous and subsequent values.
Because GUIDs generated from random numbers contain 6 fixed bits and 122 random bits, the total number of unique such GUIDs is 2 to the 122nd power, or 5.3 followed by 36 zeros. Assuming a modern CPU is able to generate a new GUID every nanosecond (a nanosecond is one billionth of a second), that is a billion GUIDs every second, it would still take 3.8E17 (that’s 3.8 followed by 17 zeros) times the age of the Universe to have the 100% assurance to generate a duplicate GUID. This is according to the principle of the “Birthday Paradox”, for which you need at least n + 1 elements in order to obtain a duplicate, with n being the max value of distinct elements that can be obtained. Suppose you execute this program on a billion (yup!) computers, it would take 168,600,000,000 years to complete the search for a duplicate GUID, statistically (obviously, if you’re luckier, you can find it earlier!).
Assuming Moore’s law holds, it would be a lot quicker not to run this program, wait several hundred years and run it on a computer that is billions of times faster. In fact, any program that takes longer to run than it takes CPU speeds to double (about 18 months) will complete sooner if you wait until the CPU speeds have increased and buy a new CPU before running it.
Personally, I think the Big Bang was caused when two GUIDs collided.
</enter>
A privacy hole allowed to track the creator of the Melissa virus.
If you really have the time and resources for trying the search for a duplicate GUID yourself, here it is a sample C# code that may help.
class Program
{
static void Main(string[] args)
{
GuidSearch search = new GuidSearch();
Stopwatch watch = Stopwatch.StartNew();
Guid duplicate = Guid.Empty;
try
{
duplicate = search.FindDuplicate();
}
catch (OutOfMemoryException)
{
Console.WriteLine("Sorry, the Universe has ended...");
}
finally
{
watch.Stop();
Console.WriteLine($"Duplicate GUID {duplicate} found after {watch.Elapsed.Days} days, {watch.Elapsed.Hours} hours, {watch.Elapsed.Minutes} minutes and {watch.Elapsed.Seconds} seconds!");
}
}
}
The Main program.
The implementation of the FindDuplicate method in the GuidSearch class runs on a single thread. I bet some optimization can be obtained using a parallel loop, but I wouldn't wait for it to complete...
public class GuidSearch
{
public Guid FindDuplicate()
{
var aLotOfGuids = new List<Guid>();
Guid newGuid = Guid.Empty;
bool found = false;
while (!found)
{
newGuid = Guid.NewGuid();
found = aLotOfGuids.Contains(newGuid);
if (!found)
{
aLotOfGuids.Add(newGuid);
}
};
return newGuid;
}
}
The GuidSearch class.
Project Name: GuidSearch