Editorial Team
Asked: May 21, 2026


Okay, the title is really very subjective. But that's just what the problem is to me.

The background is that I want to distribute hits for static web content evenly across a defined number of caching servers. Delivery to clients should also speed up, because several domains are in use and requests do not block each other. I don't need a classic load balancer; instead I generate the right links directly in my HTML code.

I also want to ensure that the same URL always gets served by the same server.

So I defined a little function that returns the host to use by hashing the request URL and taking the result modulo the number of servers in use:

function pseudocode_statify($url) { // $url looks like /folder1/folder2/file.jpg
 return 'http://' . md5($url) % $num_of_servers .'.mydomain.com' . $url;
}

I first had something like hex decoding and a substring in place to prevent overflow, but found out that it works fine the way shown above.

However my problem is that if I run the following test script:

for($i=0;$i<100000;$i++) {
  $md5 = md5(uniqid($i).microtime().rand(1,999999999999));
  $result[$md5%2]++;
}

I expected an even distribution, meaning that $result[0] would be near the value of $result[1].

This was not the case.

Okay, this is nothing special so far. I would have just accepted the fact that md5 is not as evenly distributed as I thought and would have gone for another hashing algorithm like sha1 or something.

But I tried to reproduce the findings and found a pattern that I cannot explain.

The ratio was always about 2:1. In fact, it was always somewhere between 2.16:1 and 2.18:1.

Sample output of some runs of the script above:

output was generated by: echo "ratio: ".$result[0]/$result[1]."\n";

ratio: 2.1757121534504
ratio: 2.1729411578062
ratio: 2.1726559360393
ratio: 2.1676895664225
ratio: 2.1667416128848
ratio: 2.1667115284133
ratio: 2.1677791605385
ratio: 2.1658969579688
ratio: 2.1668508131769
ratio: 2.1689292821741

Now the weird thing was that the ratio of sums where % 2 equals 1 to sums where % 2 equals 0 sometimes seemed to alternate!

for($j = 0; $j<100;$j++) {
    for($i=0;$i<100000;$i++) {
      $md5 = md5(uniqid($i).microtime().rand(1,999999999999));
      $result[$md5%2]++;
    }
var_dump($result);
}

I ran the script from the command line two separate times, aborted it after 3 runs each time, and it produced these two outputs:

joe@joe-laptop:/home/flimmit/httpdocs$ php test.php
PHP Notice:  Undefined variable: result in /home/flimmit/httpdocs/test.php on line 6
PHP Notice:  Undefined offset: 0 in /home/flimmit/httpdocs/test.php on line 6
PHP Notice:  Undefined offset: 1 in /home/flimmit/httpdocs/test.php on line 6
array(2) {
  [0]=>
  int(68223)
  [1]=>
  int(31777)
}
array(2) {
  [0]=>
  int(136384)
  [1]=>
  int(63616)
}
array(2) {
  [0]=>
  int(204498)
  [1]=>
  int(95502)
}
^C
joe@joe-laptop:/home/flimmit/httpdocs$ php test.php
PHP Notice:  Undefined variable: result in /home/flimmit/httpdocs/test.php on line 6
PHP Notice:  Undefined offset: 1 in /home/flimmit/httpdocs/test.php on line 6
PHP Notice:  Undefined offset: 0 in /home/flimmit/httpdocs/test.php on line 6
array(2) {
  [1]=>
  int(31612)
  [0]=>
  int(68388)
}
array(2) {
  [1]=>
  int(63318)
  [0]=>
  int(136682)
}
array(2) {
  [1]=>
  int(94954)
  [0]=>
  int(205046)
}
^C
joe@joe-laptop:/home/flimmit/httpdocs$ 

As you can see, in the first one the first entry of the results is always higher; in the second one it's the other way round. Same script.

The strange thing is that I can ONLY reproduce this behaviour when I run the script several times.

I wrote this small script to reproduce the “swapping” and generate enough measure data:

for($j = 0; $j<100;$j++) {
  for($i=0;$i<rand(1000,10000);$i++) {
    $md5 = md5(uniqid($i).microtime().rand(1,99999999));
    $result[$md5%2]++;
    }
    #var_dump($result);
    echo "ratio: ".$result[0]/$result[1]." ".(($result[0]<$result[1]) ? "A":"B")."\n";
    sleep(rand(2,5));
}

But here it only prints B, never A. That made me think there might be a semantic error in the script, but I didn't find any.

I am really stuck and this bothers me a lot.

So my questions:

  • Can you recommend any literature or web links where I could read about md5 in a little more depth, including distributions etc.?

  • Can you explain or reproduce the behaviour? Do I have an error here? (In fact that's very likely, but I can't find it.)

  • Can you recommend any other algorithm that would be suitable for my use case? It need not be cryptographic or strong, but fast, deterministic and evenly distributed.

1 Answer

  1. Editorial Team
    Added an answer on May 21, 2026 at 2:04 am

    The md5() function returns a string, not an integer.

    This means the string is cast to an integer before the modulo is applied. The cast keeps only the leading decimal digits, and since the string consists of characters in the 0-9a-f range, the first character gives you:

    • 1 chance out of 16 of getting a 0
    • 9 chances out of 16 of getting a digit between 1 and 9
    • 6 chances out of 16 of getting a letter between a and f, which is cast to 0
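The odds listed above can be turned into a rough estimate of the parity bias. The following is only a sketch under a simplified model: `leading_digit_parity()` is a hypothetical helper (not part of the original post) that keeps the leading run of decimal digits and ignores integer overflow and exponent notation, and the `"sample-$i"` inputs are arbitrary. Under that model a hash is odd only when it starts with a digit (10 in 16) and the last digit of the leading run is odd (1 in 2), so about 5 in 16 results are odd and 11 in 16 are even.

```php
<?php
// Simplified model of the string-to-int cast (assumption): only the leading
// run of decimal digits counts; an empty run is treated as 0.
// Overflow and exponent notation are deliberately ignored here.
function leading_digit_parity(string $hex): int
{
    $len = strspn($hex, '0123456789');   // length of the leading digit run
    if ($len === 0) {
        return 0;                        // starts with a-f: value 0, i.e. even
    }
    return ((int) $hex[$len - 1]) % 2;   // parity depends on the last digit only
}

// Tally parities over arbitrary sample inputs.
$counts = [0 => 0, 1 => 0];
for ($i = 0; $i < 100000; $i++) {
    $counts[leading_digit_parity(md5("sample-$i"))]++;
}

printf("even: %d, odd: %d, ratio: %.3f\n",
    $counts[0], $counts[1], $counts[0] / $counts[1]);
```

The printed ratio should land near 11:5, which is in the same region as the roughly 2.17 reported in the question; the remaining gap is plausibly down to the effects this model leaves out.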

    For example, casting a few md5() results to integers:

    $a = md5('plop1');
    var_dump($a, (int)$a);
    
    $a = md5('plop2');
    var_dump($a, (int)$a);
    
    $a = md5('plop5');
    var_dump($a, (int)$a);
    

    gives the following output:

    string 'ac4bf0e466417336599b72a8b2f595da' (length=32)
    int 0
    
    string 'ed91c463402dd797d0718350f5bd0acd' (length=32)
    int 0
    
    string '85782b3afb04072c1bf172a6a7e6bb5e' (length=32)
    int 85782
    

    I’ll let you guess the possible impact this can have on the result of the modulo operator 😉
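One way to avoid the implicit cast is to convert part of the hash explicitly. This is a minimal sketch, not the asker's actual code: `bucket_for()` and `statify()` are hypothetical names, the `N.mydomain.com` host pattern is taken from the question, and the choice of 7 hex digits is an assumption made so the value fits in a 32-bit integer.

```php
<?php
// Hypothetical helper: map a URL path to a bucket in [0, $numServers).
function bucket_for(string $url, int $numServers): int
{
    // Convert the first 7 hex digits (28 bits) explicitly with hexdec()
    // instead of letting PHP cast the whole hex string to an integer.
    return hexdec(substr(md5($url), 0, 7)) % $numServers;
}

// Hypothetical helper: build the full link for a static resource.
function statify(string $url, int $numServers): string
{
    return 'http://' . bucket_for($url, $numServers) . '.mydomain.com' . $url;
}

// Rough distribution check over synthetic paths.
$hits = array_fill(0, 2, 0);
for ($i = 0; $i < 10000; $i++) {
    $hits[bucket_for("/folder/file$i.jpg", 2)]++;
}
print_r($hits);

echo statify('/folder1/folder2/file.jpg', 2), "\n";
```

The same URL always maps to the same host, and the two counters should come out close to each other. If a cryptographic hash is not needed, `crc32($url)` may be a cheaper alternative, with the caveat that it can return negative values on 32-bit builds.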
