Python has a lot of convenient data structures (lists, tuples, dicts, sets, etc) which can be used to make other ‘conventional’ data structures (Eg, I can use a Python list to create a stack and a collections.dequeue to make a queue, dicts to make trees and graphs, etc).
There are even third-party data structures that can be used for specific tasks (for instance the structures in Pandas, pytables, etc).
So, if I know how to use lists, dicts, sets, etc, should I be able to implement any arbitrary data structure if I know what it is supposed to accomplish?
In other words, what kind of data structures can the Python data structures not be used for?
Thanks
For some simple data structures (eg. a stack), you can just use the builtin list to get your job done. With more complex structures (eg. a bloom filter), you’ll have to implement them yourself using the primitives the language supports.
You should use the builtins if they serve your purpose really since they’re debugged and optimised by a horde of people for a long time. Doing it from scratch by yourself will probably produce an inferior data structure. Whether you’re using Python, C++, C#, Java, whatever, you should always look to the built in data structures first. They will generally be implemented using the same system primitives you would have to use doing it yourself, but with the advantage of having been tried and tested.
Combinations of these data structures (and maybe some of the functions from helper modules such as heapq and bisect) are generally sufficient to implement most richer structures that may be needed in real-life programming; however, that’s not invariably the case.
Only when the provided data structures do not allow you to accomplish what you need, and there isn’t an alternative and reliable library available to you, should you be looking at building something from scratch (or extending what’s provided).
Lets say that you need something more than the rich python library provides, consider the fact that an object’s attributes (and items in collections) are essentially “pointers” to other objects (without pointer arithmetic), i.e., “reseatable references”, in Python just like in Java. In Python, you normally use a
Nonevalue in an attribute or item to represent whatNULLwould mean in C++ ornullwould mean in Java.So, for example, you could implement binary trees via, e.g.:
plus methods or functions for traversal and similar operations (the
__slots__class attribute is optional — mostly a memory optimization, to avoid eachNodeinstance carrying its own__dict__, which would be substantially larger than the three needed attributes/references).Other examples of data structures that may best be represented by dedicated Python classes, rather than by direct composition of other existing Python structures, include
tries(see e.g. here) andgraphs(see e.g. here).