I am writing random number generation code in CUDA using the CURAND library. What I read about random generation made me believe that if I use the same seed, I will get the same set of random numbers. But that is not the case when I tested it. Please explain what I am doing wrong. I am pasting the code below for reference:
curandGenerator_t rand_gen;
status = curandCreateGenerator(&rand_gen, CURAND_RNG_PSEUDO_DEFAULT);
if (status != CURAND_STATUS_SUCCESS) {
    printf("Error encountered in generating handle\n");
}
status = curandSetPseudoRandomGeneratorSeed(rand_gen, 1234ULL);
if (status != CURAND_STATUS_SUCCESS) {
    printf("Error encountered in setting seed\n");
}
for (j = 0; j < 2; j++) {
    status = curandGenerate(rand_gen, a_d, N);
    if (status != CURAND_STATUS_SUCCESS) {
        printf("Error encountered in generating random numbers\n");
    }
    cudaMemcpy(a_h, a_d, N * sizeof(unsigned int), cudaMemcpyDeviceToHost);
    for (i = 0; i < N; i++) {
        printf("%d : %u\n", i, a_h[i]);
    }
    printf("-----------%d----------------------\n", j);
}
status = curandDestroyGenerator(rand_gen);
if (status != CURAND_STATUS_SUCCESS) {
    printf("Error encountered in destroying handle\n");
}
Output:
0 : 624778773
1 : 3522650202
2 : 2363946744
3 : 1266286439
4 : 3928747533
5 : 3732235839
6 : 1382638835
7 : 3362343509
8 : 48542993
9 : 1225999208
-----------0----------------------
0 : 3356973615
1 : 1004333919
2 : 2916556602
3 : 1213079917
4 : 2705410958
5 : 520650207
6 : 1860816870
7 : 1645310928
8 : 2205755199
9 : 1282999252
-----------1----------------------
There is a notion of the “state” of a pseudo-random generator. For example, the Mersenne Twister has a state of about 1024 words, while the default generator, XORWOW, has a state of just a few words (but it also has a much shorter period).
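Whenever you call curandSetPseudoRandomGeneratorSeed you initialize the state of the generator. Each subsequent call to curandGenerate then updates this state (to go from one random number to the next, the state has to be recomputed), so successive calls return different parts of the same random sequence. That is why your two batches differ: the second call continues where the first one stopped. The same seed still gives the same overall sequence, so running the whole program again should print the same two batches. If you want each call to start over from the beginning, one option is to reset the position with curandSetGeneratorOffset(rand_gen, 0) before calling curandGenerate.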
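To make the idea of state concrete, here is a minimal sketch in plain C of a XORWOW-style generator (Marsaglia's xorshift combined with a counter). It is not CURAND's implementation: the names toy_rng, toy_seed, toy_next and toy_generate are made up for illustration, and the seeding scheme is an assumption, so the numbers will not match CURAND's output. It only shows that seeding sets the state and that every generated number advances it.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy XORWOW-style generator: the entire generator is six 32-bit words. */
typedef struct {
    uint32_t x, y, z, w, v; /* xorshift words */
    uint32_t d;             /* additive counter */
} toy_rng;

/* Hypothetical seeding scheme, for illustration only. CURAND initializes
   its XORWOW state differently, so outputs will not match CURAND. */
void toy_seed(toy_rng *g, uint64_t seed)
{
    g->x = 123456789u ^ (uint32_t)seed;
    g->y = 362436069u ^ (uint32_t)(seed >> 32);
    g->z = 521288629u;
    g->w = 88675123u;
    g->v = 5783321u;
    g->d = 6615241u;
}

/* Return one number and advance the state by one step. */
uint32_t toy_next(toy_rng *g)
{
    uint32_t t = g->x ^ (g->x >> 2);
    g->x = g->y;
    g->y = g->z;
    g->z = g->w;
    g->w = g->v;
    g->v = (g->v ^ (g->v << 4)) ^ (t ^ (t << 1));
    g->d += 362437u;
    return g->d + g->v;
}

/* Rough analogue of one curandGenerate call: fills out[0..n-1] and
   leaves the state advanced by n steps. */
void toy_generate(toy_rng *g, uint32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = toy_next(g);
    }
}
```

With this sketch, calling toy_seed(&g, 1234ULL) and then toy_generate twice gives two different batches, just like the output in the question. Calling toy_seed(&g, 1234ULL) again before the second toy_generate reproduces the first batch, because the state has been put back to its starting value.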
You might also experiment with the device API. There, curand_init() initializes the state for each thread, which can be quite expensive. Subsequent calls to curand(), curand_uniform(), etc. then reuse and advance this state. Typically each thread starts from a different offset or subsequence of the random sequence.