Working on a rules agnostic poker simulator for fun. Testing bottlenecks in enumeration, and


Asked: June 3, 2026 (2026-06-03T07:21:08+00:00)


I’m working on a rules-agnostic poker simulator for fun. While testing for bottlenecks in enumeration, I found an interesting one for hands that would always get pulled from the “unique” array. I measured the average computation time of running each of the variations below 1,000,000,000 times, then took the best of 100 repetitions to give the HotSpot JIT time to work its magic. What I found is a difference in computation time (6 ns vs 27 ns) between

public int getRank7(int ... cards) {
  int q = (cards[0] >> 16) | (cards[1] >> 16) | (cards[2] >> 16) | (cards[3] >> 16) | (cards[4] >> 16) | (cards[5] >> 16) | (cards[6] >> 16);
  int product = ((cards[0] & 0xFF) * (cards[1] & 0xFF) * (cards[2] & 0xFF) * (cards[3] & 0xFF) * (cards[4] & 0xFF) * (cards[5] & 0xFF) * (cards[6] & 0xFF));
  if(flushes[q] > 0) return flushes[q];
  if(unique[q] > 0) return unique[q];
  int x = Arrays.binarySearch(products, product);
  return rankings[x];
}

and

public int getRank(int ... cards) {
  int q = 0;
  long product = 1;
  for(int c : cards) {
    q |= (c >> 16);
    product *= (c & 0xFF);
  }
  if(flushes[q] > 0) return flushes[q];
  if(unique[q] > 0) return unique[q];
  int x = Arrays.binarySearch(products, product);
  return rankings[x];
}

The issue is definitely the for loop, not the addition of handling multiplication at the top of the function. I’m a little baffled by this, since I’m running the same number of operations in each scenario. I realized I’d always have 6 or more cards in this function, so I brought things closer together by changing the signature to

public int getRank(int c0, int c1, int c2, int c3, int c4, int c5, int ... cards)

But I’m going to have the same bottleneck as the number of cards goes up. Is there any way around this, and if not, could somebody explain why a for loop performing the same number of operations is so much slower?


1 Answer

Editorial Team · Answer 1 · 2026-06-03T07:21:10+00:00

I think you’ll find that the big difference is branching. The for loop version requires a bounds check and a conditional branch on every iteration. Your CPU will try to predict which branch will be taken and pipeline instructions accordingly, but when it mispredicts (at least once per function call, when the loop terminates), the pipeline stalls, which is very expensive.

One thing to try would be a regular for loop with a fixed upper bound (rather than one based on the length of the array); the JVM’s JIT compiler may unroll such a loop, which would result in the same sequence of operations as your more efficient version.
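As a rough illustration of that suggestion, here is a minimal sketch of the fold from the question written both ways. It only computes `q` and `product`; the lookup tables (`flushes`, `unique`, `products`, `rankings`) are left out because their contents are not shown in the question. The class name `FixedBoundRank`, the constant `HAND_SIZE`, the method names, and the card encodings in `main` are all assumptions made up for this example. Whether the JIT actually unrolls the fixed-bound loop is not guaranteed, so treat this as something to measure rather than a known fix.

```java
public class FixedBoundRank {
    // Hypothetical constant: this sketch assumes every hand has exactly 7 cards.
    static final int HAND_SIZE = 7;

    // Same shape as getRank in the question: the trip count depends on
    // cards.length, which is only known at run time.
    static long[] foldVarLength(int... cards) {
        int q = 0;
        long product = 1;
        for (int c : cards) {
            q |= (c >> 16);
            product *= (c & 0xFF);
        }
        return new long[] { q, product };
    }

    // Same fold, but the loop bound is a compile-time constant, which gives
    // the JIT a better chance (not a guarantee) of unrolling it.
    static long[] foldFixed(int[] cards) {
        int q = 0;
        long product = 1;
        for (int i = 0; i < HAND_SIZE; i++) {
            q |= (cards[i] >> 16);
            product *= (cards[i] & 0xFF);
        }
        return new long[] { q, product };
    }

    public static void main(String[] args) {
        // Made-up encodings: one rank bit above bit 16, a prime in the low byte.
        int[] primes = { 2, 3, 5, 7, 11, 13, 17 };
        int[] hand = new int[HAND_SIZE];
        for (int r = 0; r < HAND_SIZE; r++) {
            hand[r] = (1 << (16 + r)) | primes[r];
        }
        long[] fixed = foldFixed(hand);
        long[] variable = foldVarLength(hand);
        System.out.println("fixed:    q=" + fixed[0] + " product=" + fixed[1]);
        System.out.println("variable: q=" + variable[0] + " product=" + variable[1]);
    }
}
```

Both methods return the same values, so the comparison that matters is their timing. A benchmark harness such as JMH is probably a more reliable way to measure that than a hand-rolled loop, since it handles warm-up and dead-code elimination for you.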
