I am trying to understand CRC32 so that I can generate a unique URL for a web page.
If we use CRC32, what is the maximum number of URLs we can handle while still avoiding duplicates?
Approximately what string length would keep the checksum within 2^32?
When I tried generating a UUID for a URL and converting the UUID bytes to Base64, I could get it down to 22 characters. I wonder whether I can shorten it further.
Basically, I want to convert a URL (at most 1024 characters) into a shorter ID.
There is no such number as the “maximum number of urls can be used so that we can avoid duplicates” for CRC32.
The problem is that CRC32 can produce duplicates, and there is no number of values below which it is guaranteed not to. Whether two URLs collide depends on what those values look like.
So you might get a collision on the second URL, if you’re unlucky.
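To get a rough feel for how quickly collisions become likely, here is a small sketch of the standard birthday approximation. It assumes the CRC32 values behave like uniformly random 32-bit numbers, which real URLs may not satisfy, so treat the result as an estimate and not a guarantee. The function name `collision_probability` is made up for this illustration.

```python
import math


def collision_probability(n: int, bits: int = 32) -> float:
    """Estimate the chance that at least two of n values share a hash.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / (2 * 2^bits)),
    assuming hash outputs are uniformly distributed (an assumption).
    """
    space = 2 ** bits
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-n * (n - 1) / (2 * space))


if __name__ == "__main__":
    for count in (2, 1_000, 77_163, 1_000_000):
        print(count, collision_probability(count))
```

Under that assumption, roughly 77,000 URLs already give about a 50% chance of at least one duplicate, and the chance is never exactly zero once there are two or more URLs.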
You should not base your algorithm on a hash being unique. Instead, assign a unique value to each URL yourself.
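One possible way to do that, sketched below, is to hand out sequential integers and encode them in base 62 to keep the ID short. This is only an illustration: the names `encode_base62` and `UrlShortener` are hypothetical, and the in-memory dictionary stands in for whatever database you would actually use.

```python
import string

# 62 symbols: 0-9, a-z, A-Z (the ordering is an arbitrary choice).
ALPHABET = string.digits + string.ascii_letters


def encode_base62(n: int) -> str:
    """Encode a non-negative integer as a base-62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


class UrlShortener:
    """Hypothetical in-memory store mapping each URL to a sequential ID."""

    def __init__(self) -> None:
        self._ids: dict[str, int] = {}
        self._urls: list[str] = []

    def shorten(self, url: str) -> str:
        # A URL seen before keeps its ID; a new one gets the next integer.
        if url not in self._ids:
            self._ids[url] = len(self._urls)
            self._urls.append(url)
        return encode_base62(self._ids[url])

    def resolve(self, short_id: str) -> str:
        # Decode the base-62 string back to the integer index.
        n = 0
        for ch in short_id:
            n = n * 62 + ALPHABET.index(ch)
        return self._urls[n]
```

Because the IDs come from a counter and not from a hash, two different URLs cannot receive the same ID, and any value below 2^32 fits in at most 6 base-62 characters.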