I’m trying to go through a huge graph (around 875000 nodes and 5200000 edges) but I’m getting a stackoverflow.
I have a recursive function to loop through it. It will explore only the non-explored nodes so there is no way it goes into an infinite recursion. (or at least I think)
My recursive function works for smaller inputs (5000 nodes).
What should I do? Is there a maximum number of successful recursive call?
I’m really clueless.
EDIT: I have posted the iterative equivalent at the end as well.
Here is the code of the recursion:
int main()
{
int *sizeGraph,i,**reverseGraph;
// some code to initialize the arrays
getGgraph(1,reverseGraph,sizeGraph); // populate the arrays with the input from a file
getMagicalPath(magicalPath,reverseGraph,sizeGraph);
return 0;
}
void getMagicalPath(int *magicalPath,int **graph,int *sizeGraph) {
int i;
int *exploredNode;
/* ------------- creation of the list of the explored nodes ------------------ */
if ((exploredNode =(int*) malloc((ARRAY_SIZE + 1) * sizeof(exploredNode[0]))) == NULL) {
printf("malloc of exploredNode error\n");
return;
}
memset(exploredNode, 0, (ARRAY_SIZE + 1) * sizeof(exploredNode[0]));
// start byt the "last" node
for (i = ARRAY_SIZE; i > 0; i--) {
if (exploredNode[i] == 0)
runThroughGraph1stLoop(i,graph,exploredNode,magicalPath,sizeGraph);
}
free(exploredNode);
}
/*
* run through from the node to each adjacent node which will run to each adjacent node etc...
*/
void runThroughGraph1stLoop(int node,int **graph,int *exploredNode,int *magicalPath,int *sizeGraph) {
//printf("node = %d\n",node);
int i = 0;
exploredNode[node] = 1;
for (i = 0; i < sizeGraph[node]; i++) {
if (exploredNode[graph[node][i]] == 0) {
runThroughGraph1stLoop(graph[node][i],graph,exploredNode,magicalPath,sizeGraph);
}
}
magicalPath[0]++; // as index 0 is not used, we use it to remember the size of the array; quite durty i know
magicalPath[magicalPath[0]] = node;
}
The iterative equivalent of the above:
struct stack_t {
int node;
int curChildIndex;
};
void getMagicalPathIterative(int *magicalPath,int **graph,int *sizeGraph) {
int i,k,m,child,unexploredNodeChild,curStackPos = 0,*exploredNode;
bool foundNode;
stack_t* myStack;
if ((myStack = (stack_t*) malloc((ARRAY_SIZE + 1) * sizeof(myStack[0]))) == NULL) {
printf("malloc of myStack error\n");
return;
}
if ((exploredNode =(int*) malloc((ARRAY_SIZE + 1) * sizeof(exploredNode[0]))) == NULL) {
printf("malloc of exploredNode error\n");
return;
}
memset(exploredNode, 0, (ARRAY_SIZE + 1) * sizeof(exploredNode[0]));
for (i = ARRAY_SIZE; i > 0; i--) {
if (exploredNode[i] == 0) {
curStackPos = 0;
myStack[curStackPos].node = i;
myStack[curStackPos].curChildIndex = (sizeGraph[myStack[curStackPos].node] > 0) ? 0 : -1;
while(curStackPos > -1 && myStack[curStackPos].node > 0) {
exploredNode[myStack[curStackPos].node] = 1;
if (myStack[curStackPos].curChildIndex == -1) {
magicalPath[0]++;
magicalPath[magicalPath[0]] = myStack[curStackPos].node; // as index 0 is not used, we use it to remember the size of the array
myStack[curStackPos].node = 0;
myStack[curStackPos].curChildIndex = 0;
curStackPos--;
}
else {
foundNode = false;
for(k = 0;k < sizeGraph[myStack[curStackPos].node] && !foundNode;k++) {
if (exploredNode[graph[myStack[curStackPos].node][k]] == 0) {
myStack[curStackPos].curChildIndex = k;
foundNode = true;
}
}
if (!foundNode)
myStack[curStackPos].curChildIndex = -1;
if (myStack[curStackPos].curChildIndex > -1) {
foundNode = false;
child = graph[myStack[curStackPos].node][myStack[curStackPos].curChildIndex];
unexploredNodeChild = -1;
if (sizeGraph[child] > 0) { // get number of adjacent nodes of the current child
for(k = 0;k < sizeGraph[child] && !foundNode;k++) {
if (exploredNode[graph[child][k]] == 0) {
unexploredNodeChild = k;
foundNode = true;
}
}
}
// push into the stack the child if not explored
myStack[curStackPos + 1].node = graph[myStack[curStackPos].node][myStack[curStackPos].curChildIndex];
myStack[curStackPos + 1].curChildIndex = unexploredNodeChild;
curStackPos++;
}
}
}
}
}
}
Typically you shouldn’t rely on too deep recursion. Different platforms handle this differently, but generally it is roughly like this:
The
stack memoryvariable is very different from system to system. Some OS may just use a fixed amount of main memory, others may allow a growing stack, some may use page files and swap memory for growing and put no limit at all. As a C programmer with the abstract C standard you cannot rely on anything.So you could optimize the function state first (rid off variables, use smaller integers, etc.). But that might not be the real solution.
Some compilers recognize tail recursion and transform recursion into iteration. But again, this isn’t something to rely on (the C Standard doesn’t guarantee it; a language where you can rely on this would be Common LISP). See also Does C++ limit recursion depth? as a related question.
Compilers may offer options to set recursive limits. But once again, one shouldn’t rely on it if your deepness is effectively unlimited by design.
But the real solution is to manually transform your recursion to iteration. The simplest way would be store all function-internal data in a stack and emulate your recursion by hand:
To (Pcode to get you the point):
While this usually works better than recursion, this might not be the smartest solution and you should start to work out which state really has to be conserved.
In our example we find that we always only need the top of the stack, so we instantly rid the stack again:
And make it even more beautyful:
This one of the classic, non-recursive factorial implementations.
So in summary, the general recipe: Begin with putting your function’s state into a stack, and then go on with refactoring.