To solve my problem it is natural to use base 3 numbers. I have several tables indexed by base 3 numbers, and at some point I need to go through all the indexes that differ in k digits from a given N-digit number.
For example, given 120 as a 3-digit base 3 number, the numbers differing in 1 digit would be:
020
220
100
110
121
122
I have some ugly code that does this the obvious way, but it is slow and hard to parallelize. Any idea how to do this efficiently?
(preferred language: c++)
Here is code in Mathematica. Documentation about the individual commands can be found in the Mathematica documentation center.
A dissection of this code:
Assuming you want to include numbers with leading zeros, the function takes the number of digits (n) as an argument. Without it, splitting the number into its individual digits won't produce n digits when the number has leading zeros. The second line converts a number like 2110 to a list {{2},{1},{1},{0}}.
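Since the question prefers C++, here is a rough sketch of the same fixed-width digit splitting there. The helper names toBase3Digits and digitsToString are made up for illustration and are not part of the Mathematica code; the sketch assumes the value fits in n base-3 digits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Split `value` into exactly `n` base-3 digits, most significant first,
// padding with leading zeros. Hypothetical helper, for illustration only.
std::vector<int> toBase3Digits(std::uint64_t value, std::size_t n) {
    std::vector<int> digits(n, 0);
    for (std::size_t i = n; i-- > 0;) {
        digits[i] = static_cast<int>(value % 3);
        value /= 3;
    }
    // Assumption: the caller passes a value that fits in n digits.
    assert(value == 0 && "value does not fit in n base-3 digits");
    return digits;
}

// Render the digits as text so that leading zeros survive.
std::string digitsToString(const std::vector<int>& digits) {
    std::string s;
    s.reserve(digits.size());
    for (int d : digits) s.push_back(static_cast<char>('0' + d));
    return s;
}
```

For instance, decimal 15 is 120 in base 3, and decimal 6 with n = 3 comes out as 020 rather than 20.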
IntegerDigits does the splitting and List /@ maps List over the resulting digits, placing the extra curly brackets that we will need later.
Some of these sublists will be replaced (/. is the replacement operator; which replacements take place is determined by the list of positions in ss) by the set of complementary base 3 digits, so that the command Tuples can make all possible sets from them. For example, Tuples[{{1,2},{3},{4,5}}] ==> {{1, 3, 4}, {1, 3, 5}, {2, 3, 4}, {2, 3, 5}}
The Tuples is at the end of the line. The first part is a pure function that acts on the result of the Tuples function to turn it into a number again with FromDigits and to take care of leading zeros using IntegerString (the result is therefore a string, to allow for leading zeros).
The heart is the generation of the table of these tuples based on finding all possible replacement positions. This is done with the line Subsets[Range[len], {k}], which generates all subsets of the list {1,2,…,n} made by picking k numbers. The ParallelTable cycles over this list, using the generated positions to replace the digits at these positions with lists of possible counterparts. Generating this list of digit-change positions seems a natural approach to parallelizing the problem, as you can dedicate pieces of the list to various cores. ParallelTable is a parallel computing variant of Mathematica's standard Table function, which takes care of this parallelization automatically.
Since every set of positions that ss takes generates a list of resulting numbers, the end result is a list of lists.
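To make the idea concrete for the C++ the question asks about, here is a minimal sketch of the same scheme: walk over all k-subsets of positions and, for each subset, over the 2^k ways of swapping each chosen digit for one of its two alternatives. The function name differInKDigits and the choice to pass the number as a string of '0'..'2' characters are my assumptions, not a translation of the Mathematica code. The result should contain C(N,k)·2^k entries.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// All strings that differ from `number` (base-3 digits as characters
// '0'..'2') in exactly k positions. Hypothetical helper, a sketch only.
std::vector<std::string> differInKDigits(const std::string& number, std::size_t k) {
    const std::size_t n = number.size();
    assert(k <= n);
    std::vector<std::string> result;

    // pos holds the current k-subset of positions, in increasing order.
    std::vector<std::size_t> pos(k);
    for (std::size_t i = 0; i < k; ++i) pos[i] = i;

    while (true) {
        // Each chosen position has 2 alternative digits, so 2^k combinations;
        // bit j of mask selects which alternative goes to pos[j].
        for (std::size_t mask = 0; mask < (std::size_t{1} << k); ++mask) {
            std::string candidate = number;
            for (std::size_t j = 0; j < k; ++j) {
                int d = number[pos[j]] - '0';
                int shift = 1 + static_cast<int>((mask >> j) & 1);  // 1 or 2
                candidate[pos[j]] = static_cast<char>('0' + (d + shift) % 3);
            }
            result.push_back(candidate);
        }

        // Advance to the next k-subset in lexicographic order.
        std::size_t i = k;
        while (i > 0 && pos[i - 1] == n - k + (i - 1)) --i;
        if (i == 0) break;  // last subset done (also covers k == 0)
        ++pos[i - 1];
        for (std::size_t j = i; j < k; ++j) pos[j] = pos[j - 1] + 1;
    }
    return result;
}
```

For the question's example, differInKDigits("120", 1) gives the six strings 220, 020, 100, 110, 121, 122, the same set as listed in the question, though in a different order.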
Flatten flattens this out to one list of numbers.
So, we find half a million sets in 0.671 seconds. If I change ParallelTable to Table it takes 3.463 seconds, which is about 5 times slower. A bit surprising, as I only have 4 cores, and usually parallel overhead eats up a considerable portion of potential speed gains.
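If you want the same split across cores in C++, one possible arrangement is to build the list of position subsets once and let each worker expand its own slice of that list. The sketch below only shows the slicing; it runs no threads itself, and the names kSubsets and expandSlice are invented for illustration. I have not timed it, so whether it gets anywhere near the speed-up above is an open question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Positions = std::vector<std::size_t>;

// Counterpart of Subsets[Range[n], {k}], but 0-based: all k-subsets of {0..n-1}.
std::vector<Positions> kSubsets(std::size_t n, std::size_t k) {
    assert(k <= n);
    std::vector<Positions> all;
    Positions pos(k);
    for (std::size_t i = 0; i < k; ++i) pos[i] = i;
    while (true) {
        all.push_back(pos);
        std::size_t i = k;
        while (i > 0 && pos[i - 1] == n - k + (i - 1)) --i;
        if (i == 0) break;
        ++pos[i - 1];
        for (std::size_t j = i; j < k; ++j) pos[j] = pos[j - 1] + 1;
    }
    return all;
}

// Expand only subsets[begin, end). Slices share no mutable state, so each
// slice could be handed to its own worker (thread, task, OpenMP iteration).
std::vector<std::string> expandSlice(const std::string& number,
                                     const std::vector<Positions>& subsets,
                                     std::size_t begin, std::size_t end) {
    std::vector<std::string> out;
    for (std::size_t s = begin; s < end; ++s) {
        const Positions& pos = subsets[s];
        const std::size_t k = pos.size();
        // 2^k replacement patterns for this set of positions.
        for (std::size_t mask = 0; mask < (std::size_t{1} << k); ++mask) {
            std::string candidate = number;
            for (std::size_t j = 0; j < k; ++j) {
                int d = number[pos[j]] - '0';
                int shift = 1 + static_cast<int>((mask >> j) & 1);  // 1 or 2
                candidate[pos[j]] = static_cast<char>('0' + (d + shift) % 3);
            }
            out.push_back(candidate);
        }
    }
    return out;
}
```

Every subset contributes exactly 2^k results, so equal-sized slices carry equal work, and concatenating the per-slice outputs plays the role of Flatten.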