I want create a DataSet class which is basically a list of samples.
But I need to override each insertion operation to the DataSet.
Is there any simple way to do this without writing my own append, extend, iadd etc. ?
UPDATE: I want to add a backpointer to each sample, holding index of the sample in the DataSet. This is needed to the processing algorithm I use. I have a solution, but it seems unelegant – a renumber() function — it ensures that the backpointers are valid.
I don’t know of a way of doing what you’re asking — overriding mutators without overriding them. With a class decorator, however, you can “automate” the overriding versions (assuming each can be achieved by wrapping the corresponding method in the base class), so it’s not too bad…
Suppose for example that what you want to do is add a “modified” flag, true if the data may have been changed since the last call to
.save(a method of yours which persist the data and setsself.modifiedto False).Then…:
This syntax requires Python 2.6 or better, but, in earlier Python versions (ones which only support decorators on
defstatements, not onclassstatements; or even very old ones that don’t support decorators at all), you just need to change the very last part (theclassstatement), to:IOW, the neat decorator syntax is just a small amount of syntax sugar on top of a normal function call which takes the class as the argument and reassigns it.
Edit: now that you have edited your question to clarify your exact requirements — maintain on each item a field
bpsuch that, for alli,theset[i].bp == i— it’s easier to weigh the pro and con of various approaches.You could adapt the approach I sketched, but instead of
self.dirtyassignment before the call to the wrapped method, have aself.renumber()call after it, i.e.:this meets your stated requirements, but in many cases it will do far more work than necessary: for example, when you
appendan item, this needlessly “renumbers” all existing ones (to the same values they already had). But how could any fully automated approach “know” which items, if any, it must recompute the.bpof, withoutO(N)effort? At least it must look at each and every one of them (since you don’t want to separately code, e.g.,appendvsinsert&c), and that’s alreadyO(N).So this will be acceptable only if it’s OK for every single change to the list to be
O(N)(basically only if the list always stays small and/or doesn’t change often).A more fruitful idea might be to not maintain
.bpvalues all the time, but only “just in time” when needed. Makebpa (read-only) property, calling a method which checks if the container is “dirty” (where the “dirty” flag in the container is maintained using the automated code I’ve already given) and only then renumbers the container (and sets its “dirty” attribute toFalse).This will work well when the list typically is subject to a burst of changes and only then do you need to access the items’
bpfor a while, then another bunch of changes, etc. Such bursty alternation between changing and reading is not rare in real-world containers, but only you can know whether it applies in your specific case!To get performance beyond this I think you need to do some manual coding on top of this general approach to take advantage of frequent special cases. For example,
appendmay be called very often, and the amount of work to do in a special-casesappendis really small, so it may well be worth your while to write those two or three lines of code (not setting the dirty bit for that case).One caveat: no approach will work (indeed your requirement becomes self contradictory) if any item is present twice in the list — which of course is perfectly possible unless you take precautions to avoid it (you could easily diagnose it in
renumber— by keeping a set of elements already seen and raising an exception on any duplication — if that’s not too late for you; it’s harder to diagnose “on the fly”, i.e. at the time of a mutation that causes a duplicate, if that’s what you require). Maybe you can relax your requirement so that, if an item is present twice, that’s OK and thebpcan just indicate one of the indices; or makebpinto a set of indices where the element is present (that would also offer a smooth approach to the case of gettingbpfrom an element that’s not in the list). Etc, etc; I recommend you consider (and document!) all of these corner cases in depth — correctness before performance!