I have a model like this:
class Stock(models.Model):
product = models.ForeignKey(Product)
place = models.ForeignKey(Place)
date = models.DateField()
quantity = models.IntegerField()
I need to get the latest (by date) quantity for every product for every place,
with almost 500 products, 100 places and 350000 stock records on the database.
My current code is like this, it worked on testing but it takes so long with the real data that it’s useless
stocks = Stock.objects.filter(product__in=self.products,
place__in=self.places, date__lt=date_at)
stock_values = {}
for prod in self.products:
for place in self.places:
key = u'%s%s' % (prod.id, place.id)
stock = stocks.filter(product=prod, place=place, date=date_at)
if len(stock) > 0:
stock_values[key] = stock[0].quantity
else:
try:
stock = stocks.filter(product=prod, place=place).order_by('-date')[0]
except IndexError:
stock_values[key] = 0
else:
stock_values[key] = stock.quantity
return stock_values
How would you make it faster?
Edit:
Rewrote the code as this:
stock_values = {}
for product in self.products:
for place in self.places:
try:
stock_value = Stock.objects.filter(product=product, place=place, date__lte=date_at)\
.order_by('-date').values('cant')[0]['cant']
except IndexError:
stock_value = 0
stock_values[u'%s%s' % (product.id, place.id)] = stock_value
return stock_values
It works better (from 256 secs to 64) but still need to improve it. Maybe some custom SQL, I don’t know…
Arthur’s right, the
len(stock)isn’t the most efficient way to do that. You could go further along the “easier to ask for forgiveness than permission” route with something like this inside the inner loop:I’m not sure how much that would improve it compared to just changing the length check, though I think this should at least restrict it to two queries with
LIMIT 1on them (see Limiting QuerySets).Mind you, this is still performing a lot of database hits since you could run through that loop almost 50000 times. Optimize how you’re looping and you’re in a better position still.