In a Monte Carlo simulation I store a summary of each run in a data file, in which each column contains either a parameter or one of the result values. I end up with a large data file with up to 40 columns, in which many rows have nothing to do with each other.
Say, for example, the file looks like this:
#param1 param2 result1 result2
1.0 1.0 3.14 6.28
1.0 2.0 6.28 12.56
...
2.0 1.0 1.14 2.28
2.0 2.0 2.28 4.56
Since I often want to study how one of the results depends on one of the parameters, I need to both group by the second parameter and sort by the first one. I might also want to filter out rows depending on any of the parameters.
I started writing my own class for this, but it seems harder than one might guess. Now my question: is there a library that already does this? Or, since I am familiar with SQL, would it be difficult to write an SQL backend for, say, SQLAlchemy, that lets me run simple SQL queries on my data? As far as I know, this would provide everything I need.
Based on the answer of cravoori (or at least the one in the link he/she posted), here is a nice and short solution to my problem:
#!/usr/bin/python2
import numpy as np
import sqlite3 as sql
# number of columns to read in
COLUMNS = 31
# read the file. My columns are always 18 characters wide; the first line holds the names
data = np.genfromtxt('compare.dat',dtype=None,delimiter=18, autostrip=True,
names=True, usecols=range(COLUMNS), comments=None)
# connect to the database in memory
con = sql.connect(":memory:")
# create the table 'data' according to the column names
con.execute("create table data({0})".format(",".join(data.dtype.names)))
# insert the data into the table
con.executemany("insert into data values (%s)" % ",".join(['?']*COLUMNS),
data.tolist())
# make some query and create a numpy array from the result
res = np.array(con.execute("select DOS_Exponent,Temperature,Mobility from data ORDER \
BY DOS_Exponent,Temperature ASC").fetchall())
print res
Since the data is delimited, one option is to import the file into an in-memory SQLite database via the csv module; an example is linked below. SQLite supports most SQL clauses.
Import data into SQLite db
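For illustration, here is a minimal sketch of that approach. It is written for Python 3 and uses the four sample rows from the question. The helper name `load_table`, the table name `data`, and the assumptions that the file is space-delimited and that every column is numeric are mine, not taken from the linked example.

```python
import csv
import io
import sqlite3

# Sample rows copied from the question; a real run would open the data file instead.
SAMPLE = """#param1 param2 result1 result2
1.0 1.0 3.14 6.28
1.0 2.0 6.28 12.56
2.0 1.0 1.14 2.28
2.0 2.0 2.28 4.56
"""


def load_table(handle, table="data"):
    """Load a space-delimited file with a '#'-prefixed header into in-memory SQLite.

    Hypothetical helper: assumes every column holds a number.
    """
    reader = csv.reader(handle, delimiter=" ", skipinitialspace=True)
    # The first line carries the column names; drop the leading '#'.
    names = [name.lstrip("#") for name in next(reader)]
    con = sqlite3.connect(":memory:")
    columns = ", ".join('"{0}" REAL'.format(name) for name in names)
    con.execute('CREATE TABLE "{0}" ({1})'.format(table, columns))
    placeholders = ", ".join("?" * len(names))
    # Skip blank lines and convert every field to float before inserting.
    rows = ([float(value) for value in row] for row in reader if row)
    con.executemany(
        'INSERT INTO "{0}" VALUES ({1})'.format(table, placeholders), rows
    )
    return con


con = load_table(io.StringIO(SAMPLE))

# Filter on param2 and sort by param1.
filtered = con.execute(
    "SELECT param1, result1 FROM data WHERE param2 = ? ORDER BY param1", (1.0,)
).fetchall()
print(filtered)

# Keep rows with equal param2 together, sorted by param1 within each group.
grouped = con.execute(
    "SELECT param2, param1, result1 FROM data ORDER BY param2, param1"
).fetchall()
print(grouped)
```

For a real file, replace `io.StringIO(SAMPLE)` with an open file handle. Fixed-width columns, as in the asker's own data, would need a different parsing step than `csv.reader`.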