I have a set of .csv files that I want to process. It would be far easier to process them with SQL queries. I wonder if there is some way to load a .csv file and query it with SQL from a scripting language like Python or Ruby. Loading it with something similar to ActiveRecord would be awesome.
The problem is that I don’t want to have to run a database somewhere prior to running my script. I shouldn’t need any additional installations beyond the scripting language and some modules.
My question is: which language and which modules should I use for this task? I looked around and can’t find anything that suits my needs. Is it even possible?
There’s sqlite3, included with Python. With it you can create a database (in memory), add rows to it, and perform SQL queries.
If you want neat ActiveRecord-like functionality you should add an external ORM, like sqlalchemy. That’s a separate download though.
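For the standard-library route, a minimal sketch might look like the following. It assumes the CSV has a header row and comes from a trusted source (the column names are interpolated into the SQL); the `load_csv` helper, the `people` table name, and the sample rows are hypothetical, chosen only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; in practice this would be open("people.csv", newline="").
SAMPLE_CSV = """name,age,nickname
Alice,34,ally
Bob,27,bobby
Carol,41,caz
"""


def load_csv(conn, csv_file, table="people"):
    """Create `table` from the CSV header and insert every remaining row."""
    reader = csv.reader(csv_file)
    header = next(reader)
    # Assumption: header names are trusted, since they end up in the SQL text.
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()


# ":memory:" keeps the whole database in RAM; nothing is installed or written to disk.
conn = sqlite3.connect(":memory:")
load_csv(conn, io.StringIO(SAMPLE_CSV))

# Values are stored as text, so cast before comparing numerically.
rows = conn.execute(
    "SELECT name FROM people WHERE CAST(age AS INTEGER) > 30 ORDER BY name"
).fetchall()
print(rows)
```

Because the columns are created without types, every value is stored as text, which is why the query casts `age` before comparing it.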
Quick example using sqlalchemy:
Now you can query the database, filtering by any field, etc.
Suppose you run the code above on this csv:
That will create and populate a table in memory with fields name, age, nickname. You can then query the table:
That will automatically create and run a SELECT query and return the correct rows.
Another advantage of using sqlalchemy is that, if you decide to use another, more powerful database in the future, you can do so practically without changing the code.
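As a rough sketch of the sqlalchemy approach described above (not the original snippet), assuming SQLAlchemy 1.4 or later is installed: the `Person` class, the `people` table name, the `load_people` helper, and the sample rows are all hypothetical.

```python
import csv
import io

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Person(Base):
    """Hypothetical mapped class with one attribute per CSV column."""

    __tablename__ = "people"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    age = Column(Integer)
    nickname = Column(String)


def load_people(session, csv_file):
    """Insert one Person per CSV row (expects a name,age,nickname header)."""
    reader = csv.DictReader(csv_file)
    session.add_all(
        [
            Person(name=row["name"], age=int(row["age"]), nickname=row["nickname"])
            for row in reader
        ]
    )
    session.commit()


# Hypothetical sample data standing in for a real .csv file.
SAMPLE_CSV = "name,age,nickname\nAlice,34,ally\nBob,27,bobby\nCarol,41,caz\n"

# In-memory SQLite database: no server to run beforehand.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    load_people(session, io.StringIO(SAMPLE_CSV))
    # The ORM builds and runs the SELECT ... WHERE age > 30 for us.
    older = session.query(Person).filter(Person.age > 30).order_by(Person.name).all()
    names = [person.name for person in older]

print(names)
```

In this sketch, switching to a different database would mostly mean changing the URL passed to `create_engine`, with the model and query code left as they are.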