I’m writing a C++ application to do a word search across a large database of song lyrics. To start, I’m taking each word and putting it into a Word struct that looks like this:
struct Word{
char* clean;
int size;
int position;
SongId id;
Word* same;
Word* diff;
};
I have a “makeNode” function that does the following:
- takes in each word
- creates a new Word struct and adds the word to it
- creates a Word* called node which points to the new word
- stores the pointer in a hash table.
In my makeNode function, I set node->clean to my “clean” word. I can print the word by cout’ing node->clean. But when I set node->same to NULL, I lose node->clean. I don’t lose node->position or node->size. If I remove the line where I assign NULL to node->same, I do not lose node->clean.
char* clean = cleanse(word);
Word* node = new Word;
node->size = strlen(word);
node->clean = clean;
cout<<"MADE NODE FOR "<<node->clean<<endl;
node->position = position;
cout<<"4 node clean: "<<node->clean<<endl;
node->id = id;
cout<<"5 node clean: "<<node->clean<<endl;
node->same = NULL;
cout<<"6 node clean: "<<node->clean<<endl;
cout<<"node position: "<<node->position<<endl;
cout<<"node size: "<<node->size<<endl;
node->diff = NULL;
yields the following output:
MADE NODE FOR again
4 node clean: again
5 node clean: again
6 node clean:
node position: 1739
node size: 6
0 node clean:
1 node clean:
3 node clean:
Can anyone help me get past this error? If you need more info, let me know. Thanks in advance!
EDIT: here is the cleanse function.
char* SongSearch::cleanse(char* dirty)
{
string clean;
int iter = 0;
while (!isalnum(dirty[iter]))
{
iter++;
}
while(dirty[iter]!='\0')
{
clean += dirty[iter];
iter++;
}
int backiter = clean.length() - 1;
while(!isalnum(clean[backiter]))
{
clean.erase(backiter, 1);
backiter--;
}
char c;
for (int i = 0; i<clean.length(); i++)
{
c = tolower(clean[i]);
clean[i] = c;
}
char* toReturn = (char*)(clean.c_str());
return toReturn;
}
The problem is probably that in cleanse, you return clean.c_str(). That pointer value ceases to be valid when clean ceases to exist, which is when the function exits. It is no longer guaranteed to point to anything, so it’s pure luck that you’re ever seeing the string “again” as expected.

What I suspect happens is that the memory that used to be occupied by the data for the string clean in cleanse has been re-used for the Word structure, but is not immediately overwritten. It just so happens that the byte that used to hold the first 'a' of “again” now holds part of the member named same in your struct. So, when you write a null pointer to node->same, it has the effect of writing a 0 byte to the location pointed to by node->clean. Thereafter, it appears to point to an empty string.
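One possible way to avoid the dangling pointer is to return the string by value and let the struct own it. The following is only a sketch under some assumptions: cleanse is written as a free function rather than a SongSearch member, the SongId member is left out because its type isn't shown in the question, and the makeNode signature is hypothetical. It also adds bounds checks so that a word with no alphanumeric characters does not run past the end of the buffer.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>
#include <string>

// Returns the cleaned word by value, so the caller owns the characters
// and nothing points into a destroyed local string.
std::string cleanse(const char* dirty)
{
    assert(dirty != nullptr);
    std::string clean(dirty);

    // Drop leading non-alphanumeric characters (bounded by the string size).
    std::size_t first = 0;
    while (first < clean.size() &&
           !std::isalnum(static_cast<unsigned char>(clean[first])))
        ++first;
    clean.erase(0, first);

    // Drop trailing non-alphanumeric characters.
    while (!clean.empty() &&
           !std::isalnum(static_cast<unsigned char>(clean.back())))
        clean.pop_back();

    // Lower-case what is left.
    for (char& c : clean)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return clean;
}

// Same layout idea as the question's struct, but clean is an owning
// std::string instead of a char*. SongId is omitted (type not shown).
struct Word {
    std::string clean;
    int size = 0;
    int position = 0;
    Word* same = nullptr;
    Word* diff = nullptr;
};

// Hypothetical makeNode: the real one also stores the node in a hash table.
Word* makeNode(const char* word, int position)
{
    Word* node = new Word;
    node->clean = cleanse(word);  // copies the characters into the node
    node->size = static_cast<int>(std::strlen(word));
    node->position = position;
    return node;
}
```

With this layout, assigning to node->same cannot disturb node->clean, because the node holds its own copy of the characters. If the struct has to keep a char*, an alternative is to allocate a buffer with new[] and copy the characters into it, at the cost of having to delete[] it later.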