I am writing something similar to a task scheduler. I have two sets of tasks: some are fixed (they are given a start and end date and time) and some are not fixed (they are given a start date and time and a duration).
The non-fixed tasks are influenced by the fixed tasks, so that if a non-fixed task is overlapped by a fixed task, the non-fixed task will extend its duration by the amount of overlap.
I start with a list of tuples where the first item is the starting date and the second item is the ID for that fixed task, like this:
[(2012-04-30, 1), (2012-05-01, 5), (2012-05-04, 2)]
I then have another list, which is ordered by the user, of the non-fixed tasks. The idea is that I’ll loop through this list, and inside of that loop I’ll loop through the first list to find the tasks that could overlap with this task, and figure out how much to extend the non-fixed task.
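To make the rule concrete, here is a rough sketch of that inner loop. It takes the rule literally (the end is pushed back by exactly the overlapping amount, once per fixed task) and assumes the fixed tasks are available as (start, end) pairs sorted by start; the name extended_end is made up for this illustration.

```python
from datetime import datetime, timedelta


def extended_end(start, duration, fixed):
    """Return the end of a non-fixed task after extending it for overlaps.

    Assumption: `fixed` is a list of (fixed_start, fixed_end) datetime pairs,
    sorted by fixed_start, and each fixed task is considered only once.
    """
    end = start + duration
    for fixed_start, fixed_end in fixed:
        if fixed_start >= end:
            break  # sorted input: no later fixed task can overlap either
        if fixed_end <= start:
            continue  # this fixed task finished before the task begins
        # Overlap between [start, end) and [fixed_start, fixed_end).
        overlap = min(end, fixed_end) - max(start, fixed_start)
        end += overlap  # literal reading: extend by the overlapping amount
    return end
```

With this reading, a two-hour task starting at 09:00 that meets a fixed task running 10:00 to 12:00 overlaps it by one hour, so its end moves from 11:00 to 12:00. If the fixed task is meant to pause the other task completely, the extension would have to be computed differently.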
Here is where I’m asking for your help. Now that I know the calculated start and end times of this non-fixed task, I need to consider it “fixed” so that it influences the rest of the non-fixed tasks.
I can add this task to the first list of fixed tasks and sort it again, but that means that I’m sorting the list every time I add a task to it.
I can loop through the first list and find the point where this task should be inserted, and then insert it there. But, if its place is early in the list, time is spent shifting all of the other items one place. And if its place is late in the list, I would have to loop through a lot of elements to reach the correct place.
So, I’m not sold on using either of those options. The real question here is: What’s the best way to keep a list sorted while adding things to it? Or is there a much better way of doing all of this?
Here is an example that uses bisect and compares it with re-sorting the partially sorted list after every append. The bisect solution clearly wins:
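A minimal sketch of such a comparison could look like the following. Only the name bisect_solution comes from this answer; sort_solution, the data size, and the timing setup are assumptions chosen for illustration, and the numbers will vary by machine.

```python
import bisect
import random
import timeit


def sort_solution(values):
    """Append each value, then re-sort the whole (almost sorted) list."""
    result = []
    for v in values:
        result.append(v)
        result.sort()
    return result


def bisect_solution(values):
    """Insert each value directly at its sorted position."""
    result = []
    for v in values:
        bisect.insort(result, v)
    return result


if __name__ == "__main__":
    random.seed(0)
    data = [random.randint(0, 1_000_000) for _ in range(2_000)]
    # Both approaches must produce the same sorted list.
    assert sort_solution(data) == bisect_solution(data) == sorted(data)
    for func in (sort_solution, bisect_solution):
        elapsed = timeit.timeit(lambda: func(data), number=3)
        print(f"{func.__name__}: {elapsed:.3f} s")
```

bisect.insort still has to shift the elements after the insertion point, but it finds that point with a binary search instead of re-examining the whole list.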
bisect_solution2() is almost the same as bisect_solution(), only with the code copied out of the module. Someone else should explain why it takes more time 🙂
bisect_solution2() is there so that it can be modified to use a cmp() function for comparing the tuples.
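As a sketch of what such a copied-out version might look like in Python 3 (which has no built-in cmp()), the binary search can be written by hand with a key function in place of a comparison function. The name insort_by_key and its key parameter are assumptions for this illustration, not the code measured above.

```python
from datetime import date


def insort_by_key(items, item, key=lambda x: x):
    """Insert item into the sorted list items, ordering by key(item).

    Hand-written binary search in the style of bisect.insort_right:
    equal keys are inserted after the existing entries.
    """
    target = key(item)
    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        if target < key(items[mid]):
            hi = mid
        else:
            lo = mid + 1
    items.insert(lo, item)


# Usage: keep (start_date, task_id) tuples ordered by start date only.
tasks = []
for task in [(date(2012, 5, 1), 5), (date(2012, 4, 30), 1), (date(2012, 5, 4), 2)]:
    insort_by_key(tasks, task, key=lambda t: t[0])
```

Since Python 3.10 the functions in the bisect module accept a key argument themselves, so the hand-written loop is only needed on older versions or for custom comparison logic.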
It shows the following results on my computer:
Here is a bisect solution adapted for tuples where the date is a string:
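A minimal sketch of the idea, assuming the dates are ISO-formatted strings (YYYY-MM-DD), which sort chronologically when compared as plain text. The helper name add_task and the task ID 7 are made up for the example.

```python
import bisect


def add_task(fixed, start, task_id):
    """Insert (start, task_id) into the sorted list of fixed tasks.

    Tuples compare element by element, so the list stays ordered by the
    date string first and by the ID when two dates are equal.
    """
    bisect.insort(fixed, (start, task_id))


fixed = [("2012-04-30", 1), ("2012-05-01", 5), ("2012-05-04", 2)]
add_task(fixed, "2012-05-02", 7)  # lands between the tasks with IDs 5 and 2
```

This only works for string formats whose text order matches date order; something like "30/04/2012" would need to be converted to a date object or a sortable string first.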
Notice that the list is 10 times smaller than in the previous example, because the sort solution was very slow. It prints: