I have this specific function to extract parts of a list in the form: Give[list, elem] returns the part of list that corresponds to the position of elem in a global $Reference variable (if defined). I use this function heavily throughout my code, so I decided to optimize it. This is where I managed to get so far, but frankly, I have no idea how to advance.
ClearAll[Give, $Reference, set];
Give::noref = "No, non-list or empty $Reference was defined to refer to by Give.";
Give::noelem = "Element (or some of the elements in) `1` is is not part of the reference set `2`.";
Give::nodepth = "Give cannot return all the elements corresponding to `1` as the list only has depth `2`.";
give[list_, elem_List, ref_] := Flatten[Pick[list, ref, #] & /@ elem, 1];
give[list_, elem_, ref_] := First@Pick[list, ref, elem];
Options[Give] = {Reference :> $Reference}; (* RuleDelayed is necessary, for it is possible that $Reference changes between two subsequent Give calls, and without delaying its assignment, ref would use previous value of $Reference instead of actual one. *)
Give[list_List, elem___, opts___?OptionQ] := Module[{ref, pos},
ref = Reference /. {opts} /. Options@Give;
Which[
Or[ref === {}, Head@ref =!= List], Message[Give::noref]; {},
Complement[Union@Flatten@{elem}, ref] =!= {}, Message[Give::noelem, elem, ref]; {},
Length@{elem} > Depth@list - 1, Message[Give::nodepth, {elem}, Depth@list]; {},
True, Fold[give[#1, #2, ref] &, list, {elem}]
]];
In[106]:= $Reference = {"A", "B", "C"};
set = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
Give[set, "B"](* return specified row *)
Out[108]= {4, 5, 6}
In[109]:= Give[set, "B", "A"] (* return entry at specified row & column *)
Out[109]= 4
In[110]:= Give[set, {"B", "A"}] (* return multiple rows *)
Out[110]= {{4, 5, 6}, {1, 2, 3}}
I’ve decided to drop distinct signature function calls, as the list version might call the non-list version, which means that error handling has to be done multiple times (for each element in the list). Sadly, the error handling cannot be discarded. If the improved version is more robust (can e.g. handle more dimensions), that’s not a problem, however the examples above will suffice.
In[139]:= First@Timing[Give[set, RandomChoice[$Reference, 10000]]] (* 1D test *)
Out[139]= 0.031
In[138]:= First@Timing[Table[Give[set, Sequence @@ RandomChoice[$Reference, 2]], {10000}]] (* 2d test *)
Out[138]= 0.499
I’m sure this is not efficient code, so feel free to improve it. Any help is appreciated, even if it trims off only a few nanoseconds.
The main efficiency problem for large lists seems to come from mapping
Pick. This can be avoided if you replace the corresponding definition forgivewith this one:Here is my test code:
You get 10 – fold speed increase here, for this use case. I did not test the 2D one, but would guess it should help there too.
EDIT
You could further improve performance by caching the dispatched table for
$Reference(Dispatch[Thread[ref->Range[Length[$Reference]]]) once at the start in the body ofGive, and then pass it togive(either explicitly or by makinggivean inner function – throughModulevariables – which would refer to it), so that you don’t have to recompute it in case when you callgiveseveral times throughFold. You can also do that conditionally, say of you have large lists of elements inelem, to justify the time needed to create the dispatch table.