Possible Duplicate:
Parsing text in C
Say I have written to a text file in this format:
key1/value1
key2/value2
akey/withavalue
anotherkey/withanothervalue
I have a linked list like:
struct Node
{
    char *key;
    char *value;
    struct Node *next;
};
to hold the values. How would I read key1 and value1? I was thinking of reading the file line by line into a buffer and using strtok(buffer, "/"). Would that work? What other ways could work, maybe a bit faster or less prone to error? Please include a code sample if you can!
Since your problem is a very good candidate for reducing memory fragmentation, here is an implementation that uses some simple arcane magic to allocate all strings and the structure itself in a single piece of memory.
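The listing is not reproduced in this copy of the answer. What follows is a minimal sketch reconstructed from the explanation further down, not the original code. The names read_list and free_list, the 512-byte line buffer, and the choice to skip blank or malformed lines are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct Node
{
    char *key;
    char *value;
    struct Node *next;
};

/* Hypothetical helper: reads "key/value" lines from fp into a list.
   Assumes every line fits in the 512-byte buffer. */
struct Node *read_list(FILE *fp)
{
    char buffer[512];
    struct Node *list = NULL;
    struct Node **nextp = &list;    /* where the next node gets linked in */

    assert(fp != NULL);
    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        /* One block holds the struct followed by a copy of the line. */
        struct Node *node = malloc(sizeof(struct Node) + strlen(buffer) + 1);
        if (node == NULL)
            break;                  /* out of memory: return what we have */
        char *text = strcpy((char *)(node + 1), buffer);

        node->key = strtok(text, "/\r\n");
        node->value = node->key ? strtok(NULL, "\r\n") : NULL;
        node->next = NULL;
        if (node->key == NULL || node->value == NULL) {
            free(node);             /* assumption: skip blank or malformed lines */
            continue;
        }
        *nextp = node;              /* becomes the head, or the previous node's next */
        nextp = &node->next;
    }
    return list;
}

/* Hypothetical helper: one free() per node releases the struct and both strings. */
void free_list(struct Node *list)
{
    while (list != NULL) {
        struct Node *next = list->next;
        free(list);
        list = next;
    }
}
```

A caller would open the file with fopen, pass the stream to read_list, walk the list through the next pointers, and finish with free_list.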
When destroying the node, you need only a single call to free(), on the node itself.
Explanation:
With 20 comments and one unexplained downvote, I think the code needs some explanation, especially with regard to the tricks employed:
Building a linked list:
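A minimal sketch of the pointer-to-pointer append in isolation. The helper name link_in_order and the idea of linking nodes that already sit in an array are assumptions made to keep the example short; they are not taken from the original listing.

```c
#include <assert.h>
#include <stddef.h>

struct Node
{
    char *key;
    char *value;
    struct Node *next;
};

/* Hypothetical helper: links count nodes from an array in forward order. */
struct Node *link_in_order(struct Node *nodes, size_t count)
{
    struct Node *list = NULL;
    struct Node **nextp = &list;    /* starts out pointing at the head pointer */

    assert(nodes != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        nodes[i].next = NULL;
        *nextp = &nodes[i];         /* first pass sets the head, later passes set a next field */
        nextp = &nodes[i].next;     /* move on to this node's next pointer */
    }
    return list;
}
```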
This is a trick to create a linked list iteratively in forward order without having to special-case the head of the list. It uses a pointer-to-pointer to the next node. First, the nextp pointer points to the list head pointer; in the first iteration, the list head is set through this pointer-to-pointer, and then nextp is moved to the next pointer of that node. Subsequent iterations fill the next pointer of the last node.
Single allocation:
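A minimal sketch of the allocation step on its own. The helper name alloc_node_with_text is hypothetical, and leaving key pointing at the unsplit copy is an assumption made only so the layout can be inspected.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Node
{
    char *key;
    char *value;
    struct Node *next;
};

/* Hypothetical helper: a single malloc holds the struct and a copy of line. */
struct Node *alloc_node_with_text(const char *line)
{
    assert(line != NULL);
    struct Node *node = malloc(sizeof(struct Node) + strlen(line) + 1);
    if (node == NULL)
        return NULL;

    char *text = (char *)(node + 1);    /* first byte past the struct */
    strcpy(text, line);

    node->key = text;                   /* not split yet; parsing is a separate step */
    node->value = NULL;
    node->next = NULL;
    return node;
}
```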
We have to deal with three pointers: the node itself, the key string and the value string. This would usually require three separate allocations (malloc, calloc, strdup…) and consequently three separate releases (free). Instead, in this case, the sizes of the three elements are summed in sizeof(struct Node) + strlen(buffer) + 1 and passed to a single malloc call, which returns a single block of memory. The beginning of this block is assigned to node, the structure itself. The additional memory (strlen(buffer) + 1) comes right after the node, and its address is obtained with pointer arithmetic, as node + 1. It is used to hold a copy of the entire string read from the file (“key/value\n”).
Since malloc is called a single time for each node, a single allocation is made. This means that you don’t need to call free(node->key) and free(node->value). In fact, you must not: those pointers were not returned by malloc, so passing them to free is undefined behavior. A single free(node) takes care of deallocating the structure and both strings in one block.
Line parsing:
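A minimal sketch of the two strtok calls on their own. The helper name split_line and its 1/0 return convention are assumptions for illustration, not part of the original listing.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: splits "key/value\n" in place.
   Returns 1 if both parts were found, 0 otherwise. */
int split_line(char *line, char **key, char **value)
{
    assert(line != NULL && key != NULL && value != NULL);

    /* First call: stop at '/' or an end-of-line character. */
    *key = strtok(line, "/\r\n");

    /* Second call: only end-of-line characters end the value,
       so a '/' inside the value is kept. */
    *value = *key ? strtok(NULL, "\r\n") : NULL;

    return *key != NULL && *value != NULL;
}
```

Note that strtok writes NUL characters into the buffer it is given and keeps internal state between calls, so the line must be writable and the two calls must not be interleaved with other strtok use.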
The first call to strtok returns a pointer to the beginning of the buffer itself. It looks for a ‘/’ (and also for end-of-line markers) and breaks the string there with a NUL character. So “key/value\n” is split into “key” and “value\n” with a NUL character in between, and a pointer to the first is returned and stored in node->key. The second call to strtok works on the remaining “value\n”, strips the end-of-line marker and returns a pointer to “value”, which is stored in node->value.
I hope this clears up all questions about the above solution… it is too much for a closed question. The complete test code is here.