I’m writing a script which receives two lists of data and looks for the differences, in order to update data into a database.
These lists are non-homogeneous: one is a list of database objects, the other is a list of dictionaries.
For many reasons, this tool offers the opportunity to preview the list of the differences before applying changes.
After analyzing the preview, if the update is accepted by a moderator, changes will be applied.
That means that the script, containing a lot of loops and conditional tests, will generate a list of preview (served to an output page) and a list of changes.
At the first stage of developing, I wrote a single script which would run in two modes: ‘preview’ and ‘update’.
This script performed all loops and conditional checks, and if changes were found at some point, it would add a message to an output string if running in ‘preview’ mode, or execute a command if running in ‘update’ mode.
Then I’ve began to think that maybe it would be better to pass between loops and conditional checks only once, adding preview messages to an output string and commands to a command list, whenever a change is found.
Then, after serving preview page to the moderator, if changes are accepted, run a second script, which would be such simple as:
def apply_changes(command_list, ...):
for c in command_list:
try:
exec(c)
except Exception, err:
logger.error('Something went wrong while executing command %s: %s', c, err)
raise Exception, err
Q: Would be ok to switch to the second version of the script? Which cautions/errors/strange_behaviors would involve?
I’ve been asking almost the same question under the language-agnostic tag,
because I was more interested in the algorithmical point of view of the problem than in its implementation.
But focusing more on the implementational point of view, seems that with Python the second version of this script performs better and is easier to maintain than the first one.
Are there any reasons for which I should prefer the first version?
EDIT: Adding some code, in order to clarify the question even more.
An excerpt from first version code would be something like this function (used in the easier possible case, in which the values compared are both strings), called by other functions inside of a nested for loop if some specific conditions are met:
def update_text_field(object, old_value, new_value, field_name, mode):
....
if old_value != new_value:
if mode is 'preview': output += print_old_new_values(old_value, new_value, field_name)
if mode is 'update':
if hasattr(object, field_name):
setattr(object, field_name, new_value)
object.save()
else:
...
....
In the second version, this excerpt would transform into something like this:
def update_text_field(object, old_value, new_value, field_name, mode):
....
if old_value != new_value:
output_list.append(print_old_new_values(old_value, new_value, field_name))
command_list.append(generate_command(command, old_value, new_value, field_name))
...
One reason why you would prefer the first version is to have the ability to run the script in
updatemode only (skipping the mandatory-in-second-scriptpreviewstep).Here’s a suggestion as to how to make the script easier to maintain: Abstract away the mode when the script is running. This can be done like this:
And then you can define both modes (this is just a simplistic example):
And optionally define convenience wrappers:
And even easily define new modes:
This is possible due to first-class functions in Python. This way, the only things you need to maintain are very cleanly separated and don’t have to care about one another:
previewmodeupdatemodeNo duplicate version of the function, no possibly-expensive-in-the-long-run branching inside the diff function, easy as pie to maintain.
If having the ability to run in
updatemode directly without going throughpreviewis not important, then the best performance option is indeed to build a list of commands inpreviewmode and then run it in theupdatemode. It is about as easy to maintain as the above solution. However, even if you do go for this option, consider how easy it would be to implement using the above pattern: