I know generally two ways to design a generic linked list datastructure in C. And I’m wondering which is better. Before asking the question, I’ll introduce both methods shortly:
One method is to build functions around a structure like the following:
struct list_element {
struct list_element *prev;
struct list_element *next;
void *data;
};
Obviously, the data pointer points to the payload. The list element struct is outside the payload. This is e.g. how glib has designed its double linked list facility: http://library.gnome.org/devel/glib/2.26/glib-Doubly-Linked-Lists.html
Another method is the way how it’s done in the Linux kernel: http://isis.poly.edu/kulesh/stuff/src/klist/. There is no void pointer to the payload in the list element struct. Instead the list element struct is included in the payload struct:
struct list_element {
struct list_element *prev;
struct list_element *next;
};
struct person {
char name[20];
unsigned int age;
struct list_element list_entry;
};
A special macro is used to get a pointer to the payload struct given a pointer to the list_entry, its name withing the payload struct and the type of the payload struct (the list_entry() macro).
Finally, here is the question: What is the advantage of the latter of the two methods of constructing a linked list? A few times I’ve heard people say the second is more ‘generic’ than the first, but why? I would even argue that the first method is more generic because the payload structures are list implementation agnostic, which isn’t the case with the second method.
One more downside of the second method is if you want to place the payload in more than one list, you should a struct list_element member for each list in the payload structure.
Edit:
To summarize so far I saw two answers which were important for me:
- With the first method: removing a payload from the list involves looping through the complete list until the list element pointing to the payload is found. You don’t need to do this with the second method. (Answer from Patrick)
- With the first method you have to do two malloc()s for each element: one for the payload and one for the list element struct. With the second method one malloc() is sufficient. (Answer from Roddy)
The first approach might seem less intrusive but in many cases it isn’t (unless you add additional data structures).
Imagine you have a list of thousand persons and you want to remove one of them from the list. If the person doesn’t know where it is in the list, you will have to scan the whole list first to get the exact place of the person.
You can solve this by adding a pointer from the person to its corresponding list structure, but this defeats the non-intrusiveness (does this word exist?) of the solution.
Another alternative is to have a hash map that maps the memory addresses of the persons to the memory addresses of the list nodes. Then finding the node in the list is much faster (but still slower than the intrusive way). However, since this will take a even more memory, I suggest not to do this.
Therefore, the easiest and simplest solution is the second one.