My objective is to read an XML text file and split it so that each word and each tag becomes its own element in an array.
For example, if I input this text into my program:
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
I would get this:
<note>
<to>
Tove
</to>
<from>
...
Right now I have code that does this successfully, but only for the words, so instead of the list above I get:
note
to
Tove
...
I want to keep the tags, or I won't be able to do what I want with the result. I have been trying to get it to add the tags as well, but so far I have failed.
Okay, so here is my code:
//While the file is not empty
while(fgets(buffer, sizeof(buffer), stdin) != NULL){
    int first = 0;
    int last = 0;
    //While words are left in line
    while(last < INITIAL_SIZE && buffer[last] != '\0'){
        int bool = 0;
        //Tag detected
        if(buffer[last] == '<'){
            while(buffer[last] != '>'){
                last++;
            }
            bool = 1;
        }else{
            //While more chars are in the word
            while(last < INITIAL_SIZE && isalpha(buffer[last])){
                last++;
            }
        }
        //Word detected
        if(first < last){
            //Words array is full, add more space
            if(numOfWords == sizeOfWords){
                sizeOfWords = sizeOfWords + 10;
                words = (char **) realloc(words, sizeOfWords*sizeof(char *));
            }
            //Allocate memory for array
            words[numOfWords] = (char *) calloc(last-first+1, sizeof(char));
            for(i = 0; i < (last-first); i++){
                words[numOfWords][i] = buffer[first + i];
            }
            //Add terminator to "new word"
            words[numOfWords][i] = '\0';
            numOfWords++;
        }
        //Move "Array Pointers" accordingly
        last++;
        first = last;
    }
}
Does anyone have any idea? With the above code, this is the printout:
<note
<to
Tove
to
<from
Jani
from
<heading
...
Don
t
forget
me
this
weekend
</body
</note
So after this wall of text, does anyone have any idea how I can modify my current code to get this to work? Or does anyone have an alternative?
Even though it is highly doubtful that anyone will ever use this, I got it to work by using Boolean-type logic.
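The working version is not reproduced here. As a rough sketch of one possible fix (not necessarily the Boolean-flag approach mentioned above), the loop can keep the closing '>' as part of the tag and skip a separator character only when nothing was consumed. The posted code drops '>' because the copy stops just before it, and it loses the '<' of a closing tag because last++ runs even right after a word has been copied. TokenList, push_token, tokenize_line and free_tokens are hypothetical names introduced for this example; the posted code keeps words, numOfWords and sizeOfWords as separate variables.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container standing in for words/numOfWords/sizeOfWords. */
typedef struct {
    char **items;
    size_t count;
    size_t capacity;
} TokenList;

/* Copy len characters starting at start into a new list entry.
   Returns 0 on success, -1 if an allocation fails. */
int push_token(TokenList *list, const char *start, size_t len)
{
    if (list->count == list->capacity) {
        /* Grow by 10 slots, as the posted code does. */
        size_t new_cap = list->capacity + 10;
        char **grown = realloc(list->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        list->items = grown;
        list->capacity = new_cap;
    }
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, start, len);
    copy[len] = '\0';
    list->items[list->count++] = copy;
    return 0;
}

/* Split one line into tags ("<...>") and runs of alphabetic characters. */
int tokenize_line(const char *line, TokenList *list)
{
    size_t first = 0;
    while (line[first] != '\0') {
        size_t last = first;
        if (line[last] == '<') {
            /* Tag: scan to the closing '>' and include it in the token.
               Also stop at end of line so an unterminated tag cannot
               run past the buffer. */
            while (line[last] != '\0' && line[last] != '>')
                last++;
            if (line[last] == '>')
                last++;
        } else {
            /* Word: consume alphabetic characters only. */
            while (isalpha((unsigned char)line[last]))
                last++;
        }
        if (last > first) {
            if (push_token(list, line + first, last - first) != 0)
                return -1;
            first = last;   /* resume right after the token */
        } else {
            first++;        /* skip exactly one separator character */
        }
    }
    return 0;
}

/* Release every token and reset the list to empty. */
void free_tokens(TokenList *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->count = 0;
    list->capacity = 0;
}
```

To use it, start with a zero-initialized TokenList and call tokenize_line(buffer, &list) inside the existing fgets loop. Words are still split on any non-alphabetic character, so "Don't" continues to come out as "Don" and "t", just as in the printout above.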