How can I write my own aggregate functions with SQLAlchemy? As an easy example I would like to use numpy to calculate the variance. With sqlite it would look like this:
import sqlite3 as sqlite
import numpy as np
class self_written_SQLvar(object):
def __init__(self):
import numpy as np
self.values = []
def step(self, value):
self.values.append(value)
def finalize(self):
return np.array(self.values).var()
cxn = sqlite.connect(':memory:')
cur = cxn.cursor()
cxn.create_aggregate("self_written_SQLvar", 1, self_written_SQLvar)
# Now - how to use it:
cur.execute("CREATE TABLE 'mytable' ('numbers' INTEGER)")
cur.execute("INSERT INTO 'mytable' VALUES (1)")
cur.execute("INSERT INTO 'mytable' VALUES (2)")
cur.execute("INSERT INTO 'mytable' VALUES (3)")
cur.execute("INSERT INTO 'mytable' VALUES (4)")
a = cur.execute("SELECT avg(numbers), self_written_SQLvar(numbers) FROM mytable")
print a.fetchall()
>>> [(2.5, 1.25)]
The creation of new aggregate functions is backend-dependant, and must be done
directly with the API of the underlining connection. SQLAlchemy offers no
facility for creating those.
However after created you can just use them in SQLAlchemy normally.
Example:
Now for the testing:
That prints (with SQLAlchemy statement echo turned on):
Note that I didn’t use SQLAlchemy’s ORM (just the sql expression part of SQLAlchemy was used) but you could use ORM just as well.