We have a CSV file that we are importing into a Django application, and then creating the appropriate models and relationships.
On the first page, we have a file upload form where the user selects a file.
We then parse the file, and return a second page showing them what would be created, any validation errors etc.
The user can then decide whether to proceed or not (or possibly to correct any areas on-screen).
What would be the best way of storing the temporary interim models, before it actually hits the database proper?
The CSV file will be fairly big, possibly around 200 KB in size, and will create several hundred model instances.
Should I store this in the database somewhere, and label those models “temporary”? It seems a bit heavy just for a confirm, and I’m not sure if it’s appropriate use of the database. Or is there some way we could store it in Django sessions? Or any other way to do it?
I mentioned this on django-users before, and they suggested either using a separate DB or storing it in MongoDB. I’m not quite sure of the best way to persist a Django model to MongoDB in that way, though.
They also mentioned I might need to use something like ZeroMQ or django-celery to handle the importing process asynchronously in case the web-server timed out during it.
Anyhow, figured I’d also canvass the SO community, since there are a lot of bright Django people lurking here as well =).
Cheers,
Victor
If this CSV import process will be common and perpetual (i.e. used by many users and not limited to a brief period of time), I would prefer to incorporate the logic into the models: I’d modify the models to have a flag indicating whether the instance is active or pending (you may add deleted too). This can be complemented with multiple managers. The default manager may be modified to return only active objects, whereas a second manager may fetch only pending objects. A job may be written to take care of objects that have been pending for too long.
Otherwise (if this is a temporary situation), you may choose any medium you are comfortable with to persist the pending objects.
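For the temporary case, one possible medium is the Django session mentioned in the question. The sketch below only covers the parsing side, using the standard library: it turns the CSV text into plain dicts with per-row errors, which survive Django's default JSON session serializer. The column names `name` and `email`, the function `parse_preview` and the session key in the comments are assumptions for illustration. A preview of a few hundred rows should be fine with database- or cache-backed sessions, but not with cookie-based sessions, which are limited to roughly 4 KB.

```python
import csv
import io
import json


def parse_preview(csv_text, required=("name", "email")):
    """Parse CSV text into JSON-serializable preview rows with errors."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for line_no, record in enumerate(reader, start=2):  # line 1 is the header
        # Normalise missing cells (None) to empty strings.
        data = {field: (record.get(field) or "").strip()
                for field in reader.fieldnames}
        errors = [f"missing {field}" for field in required
                  if not data.get(field)]
        rows.append({"line": line_no, "data": data, "errors": errors})
    return rows


sample = "name,email\nAda,ada@example.com\nBob,\n"
preview = parse_preview(sample)

# Round-trip through JSON to mimic what a session backend would store.
restored = json.loads(json.dumps(preview))

# In the upload view (hypothetical key name):
#     request.session["pending_import"] = preview
# In the confirm view:
#     rows = request.session.pop("pending_import", [])
#     ...create the real models from rows that have no errors...
```

Nothing touches the real tables until the user confirms, and abandoned previews disappear when the session expires.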