The following is the problem description:
Let c[n] be the n-th Catalan number and let p be a large prime, e.g. 1000000007.
I need to calculate c[n] % p for n in {1, 2, 3, ..., 1000}.
The problem I am having is that on a 32-bit machine the calculation overflows for Catalan numbers of such large n. I am familiar with modular arithmetic, including the identity
(a * b) % p = ((a % p) * (b % p)) % p
This identity lets me avoid overflow in the numerator, but I have no idea how to deal with the denominators.
For a modulus of 1000000007, avoiding overflow with only 32-bit integers is cumbersome. But any decent C implementation provides 64-bit integers (and any decent C++ implementation does too), so that shouldn’t be necessary.
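With a 64-bit type available, the modular product can be sketched as follows. The helper name `mulmod64` is hypothetical, and the sketch assumes `m` is nonzero and fits in 32 bits.

```c
#include <stdint.h>

/* Hypothetical helper: (a * b) mod m using a 64-bit intermediate.
   The product of two 32-bit values always fits in 64 bits, so nothing overflows. */
static uint32_t mulmod64(uint32_t a, uint32_t b, uint32_t m)
{
    return (uint32_t)(((uint64_t)a * b) % m);
}
```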
Then to deal with the denominators, one possibility is, as KerrekSB said in his comment, to calculate the modular inverse of the denominators modulo the prime p = 1000000007. You can calculate the modular inverse with the extended Euclidean algorithm or, equivalently, the continued fraction expansion of k/p. Then instead of dividing by k in the calculation, you multiply by its modular inverse.

Another option is to use Segner’s recurrence relation for the Catalan numbers, which gives a calculation without divisions:

C(0) = 1, C(n+1) = C(0)*C(n) + C(1)*C(n-1) + ... + C(n)*C(0)
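Both options can be sketched in C, assuming a 64-bit integer type is available. All names here (`mul_mod`, `inv_mod`, `catalan_segner`, `catalan_inverse`) are hypothetical, and the division-based variant relies on the standard identity C(n+1) = C(n) * 2(2n+1) / (n+2), which is an addition not stated in the text above.

```c
#include <stdint.h>

#define MOD   1000000007u   /* the prime p from the question */
#define MAX_N 1000          /* largest index needed */

/* (a * b) mod MOD via a 64-bit intermediate. */
static uint32_t mul_mod(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) % MOD);
}

/* Modular inverse of k modulo MOD by the extended Euclidean algorithm.
   Assumes 0 < k < MOD; MOD is prime, so gcd(k, MOD) = 1. */
static uint32_t inv_mod(uint32_t k)
{
    int64_t r0 = MOD, r1 = k;   /* remainder sequence */
    int64_t t0 = 0,   t1 = 1;   /* invariant: t_i * k == r_i (mod MOD) */
    while (r1 != 0) {
        int64_t q  = r0 / r1;
        int64_t r2 = r0 - q * r1;
        int64_t t2 = t0 - q * t1;
        r0 = r1; r1 = r2;
        t0 = t1; t1 = t2;
    }
    /* Here r0 == 1 and t0 * k == 1 (mod MOD); normalise the sign. */
    if (t0 < 0) t0 += MOD;
    return (uint32_t)t0;
}

/* Option 1: multiply by the modular inverse instead of dividing.
   Uses C(n+1) = C(n) * 2(2n+1) / (n+2). */
static void catalan_inverse(uint32_t c[MAX_N + 1])
{
    c[0] = 1;
    for (uint32_t n = 0; n < MAX_N; ++n) {
        uint32_t num = mul_mod(c[n], 2 * (2 * n + 1));
        c[n + 1] = mul_mod(num, inv_mod(n + 2));
    }
}

/* Option 2: Segner's recurrence, no divisions at all.
   C(0) = 1, C(n+1) = sum over i = 0..n of C(i) * C(n-i). */
static void catalan_segner(uint32_t c[MAX_N + 1])
{
    c[0] = 1;
    for (int n = 0; n < MAX_N; ++n) {
        uint64_t sum = 0;
        for (int i = 0; i <= n; ++i) {
            sum = (sum + (uint64_t)c[i] * c[n - i]) % MOD;
        }
        c[n + 1] = (uint32_t)sum;
    }
}
```

Either function fills a table of C(0) through C(1000) modulo p. Segner's recurrence needs on the order of n^2 multiplications, which is negligible for n = 1000.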
Since you only need the Catalan numbers C(k) for k <= 1000, you can precalculate them, or quickly calculate them at program startup and store them in a lookup table.

If, contrary to expectation, no 64-bit integer type is available, you can calculate the modular product by splitting the factors into low and high 16 bits, a = a1 + (a2 << 16) and b = b1 + (b2 << 16), which turns a*b into four partial products of 16-bit values.
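To calculate a*b (mod m) with m <= (1 << 31), reduce each of the four products modulo m (p1 for low*low, p2 and p3 for the two mixed products, p4 for high*high). The simplest way to incorporate the shifts is to double p2 16 times, subtracting m whenever the value reaches m; the same for p3, and with 32 iterations for p4. Then add the four values, again subtracting m whenever the sum reaches m.

That way is not very fast, but for the few multiplications needed here, it’s fast enough. A small speedup should be obtained by reducing the number of shifts: first calculate (p4 << 16) % m; then all of p2, p3 and the current value of p4 need to be multiplied with 2^16 modulo m, so they can be added up modulo m and shifted by 16 bits together before p1 is added.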
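The 32-bit-only multiplication described above can be sketched as follows. This is a reconstruction under stated assumptions, not necessarily the original code: the function and variable names are hypothetical, m is assumed to satisfy 0 < m <= 2^31, and only `uint32_t` arithmetic is used.

```c
#include <stdint.h>

/* (a + b) mod m for a, b < m <= 2^31, so a + b cannot overflow 32 bits. */
static uint32_t add_mod32(uint32_t a, uint32_t b, uint32_t m)
{
    uint32_t s = a + b;
    return s >= m ? s - m : s;
}

/* (x * 2^k) mod m by k doublings, each followed by a reduction. Needs x < m. */
static uint32_t shl_mod32(uint32_t x, int k, uint32_t m)
{
    for (int i = 0; i < k; ++i) {
        x = add_mod32(x, x, m);
    }
    return x;
}

/* Straightforward variant: 16 + 16 + 32 doublings. */
static uint32_t mul_mod32_simple(uint32_t a, uint32_t b, uint32_t m)
{
    uint32_t a1 = a & 0xFFFFu, a2 = a >> 16;   /* low and high 16 bits of a */
    uint32_t b1 = b & 0xFFFFu, b2 = b >> 16;   /* low and high 16 bits of b */
    uint32_t p1 = (a1 * b1) % m;               /* each product is < 2^32 */
    uint32_t p2 = (a1 * b2) % m;
    uint32_t p3 = (a2 * b1) % m;
    uint32_t p4 = (a2 * b2) % m;

    p2 = shl_mod32(p2, 16, m);
    p3 = shl_mod32(p3, 16, m);
    p4 = shl_mod32(p4, 32, m);

    uint32_t s = add_mod32(p1, p2, m);
    s = add_mod32(s, p3, m);
    return add_mod32(s, p4, m);
}

/* Variant with fewer shifts: 16 + 16 doublings in total. */
static uint32_t mul_mod32(uint32_t a, uint32_t b, uint32_t m)
{
    uint32_t a1 = a & 0xFFFFu, a2 = a >> 16;
    uint32_t b1 = b & 0xFFFFu, b2 = b >> 16;
    uint32_t p1 = (a1 * b1) % m;
    uint32_t p2 = (a1 * b2) % m;
    uint32_t p3 = (a2 * b1) % m;
    uint32_t p4 = (a2 * b2) % m;

    p4 = shl_mod32(p4, 16, m);     /* (p4 << 16) % m */
    p4 = add_mod32(p4, p3, m);     /* everything still owed a factor 2^16 */
    p4 = add_mod32(p4, p2, m);
    p4 = shl_mod32(p4, 16, m);     /* apply the common factor 2^16 once */
    return add_mod32(p4, p1, m);
}
```

The bound m <= 2^31 is what keeps every doubling and addition below 2^32, so no intermediate value overflows a 32-bit unsigned integer.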