I want to use pandas like SQL for a web app: instead of holding the data in pSQL, I would hold it in a pandas DataFrame, since the data is just under 1GB and does not change constantly. If I am doing a lookup based on multiple filters on columns (e.g. age > x, age < y, income > p, income < q), are there any ways to speed up this filtering, or is that already done under the hood? In SQL one would declare an index on age and income to speed up such a query; I am wondering what the pandas way of doing this is, if any.
The “pandas way” of doing this query is:
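A minimal sketch of such a filter, assuming a DataFrame `df` with numeric `age` and `income` columns. The sample data and the bounds `x`, `y`, `p`, `q` are made up for illustration and are not from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real table.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(18, 90, size=1000),
    "income": rng.integers(10_000, 200_000, size=1000),
})

# Hypothetical bounds for the range filters.
x, y = 30, 40
p, q = 50_000, 100_000

# Combine the per-column comparisons into one boolean mask.
# Each comparison needs its own parentheses because & binds tighter than < and >.
mask = (df["age"] > x) & (df["age"] < y) & (df["income"] > p) & (df["income"] < q)
result = df[mask]

# The same filter written with DataFrame.query; @name refers to a local variable.
same = df.query("age > @x and age < @y and income > @p and income < @q")
```

Both forms should select the same rows; which one reads better is mostly a matter of taste.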
Note that pandas does not build SQL-style indexes on columns: each comparison in a filter like this is a vectorized scan over the whole column. There is nothing to declare beforehand about what you are going to be querying.
(I can’t say whether this setup would make sense for your dataset.)