I have the following model
class Plugin(models.Model):
    name = models.CharField(max_length=50)
    # more fields
which represents a plugin that can be downloaded from my site. To track downloads, I have
class Download(models.Model):
    plugin = models.ForeignKey(Plugin)
    timestamp = models.DateTimeField(auto_now=True)
So to build a view showing plugins sorted by downloads, I have the following query:
from django.db.models import Count

# pbd = plugins by download (abbreviated to avoid horizontal scrolling)
pbd = Plugin.objects.annotate(dl_total=Count('download')).order_by('-dl_total')
This works, but it is very slow. With only 1,000 plugins, the average response time is 3.6 to 3.9 seconds (devserver with a local PostgreSQL database), whereas a similar view with a much simpler query (sorting by plugin release date) takes about 160 ms.
I’m looking for suggestions on how to optimize this query. I’d strongly prefer that the query return Plugin objects (as opposed to using values), since I share the same template with the other views (plugins by rating, plugins by release date, etc.) and that template expects Plugin objects. I’m also not sure how I would get things like the absolute_url without a reference to the plugin object.
Or, is my whole approach doomed to failure? Is there a better way to track downloads? I ultimately want to provide users some nice download statistics for the plugins they’ve uploaded – like downloads per day/week/month. Will I have to calculate and cache Downloads at some point?
EDIT: In my test dataset, there are between 10 and 20 Download instances per Plugin. In production I expect this number to be much higher for many of the plugins.
That does seem unusually slow. There’s nothing obvious in your query that would cause that slowness, though. I’ve done very similar queries in the past, with larger datasets, and they have executed in milliseconds.
The only suggestion I have for now is to install the Django Debug Toolbar, find the offending query in its SQL tab, and use EXPLAIN there to have the database tell you exactly what it does when it executes the query. If it’s doing subqueries, for example, check that they are using an index; if not, you may need to define one manually in the database. If you like, post the result of EXPLAIN here and I’ll help further if possible.
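To illustrate what "check that it is using an index" looks like in practice, here is a minimal sketch. It makes several assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (so the plan output format differs from what PostgreSQL's EXPLAIN prints), the table and column names are hypothetical simplifications of what Django would generate for the two models, and the SQL is only an approximation of what the annotate/Count query produces. Note also that Django normally creates an index on ForeignKey columns by default, so this index may well already exist in your database.

```python
import sqlite3

# In-memory SQLite database standing in for the real PostgreSQL one (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE plugin (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE download (
        id INTEGER PRIMARY KEY,
        plugin_id INTEGER REFERENCES plugin(id),
        timestamp TEXT
    );
""")

# Rough approximation of the SQL behind annotate(dl_total=Count('download')).
QUERY = """
    SELECT plugin.id, plugin.name, COUNT(download.id) AS dl_total
    FROM plugin
    LEFT OUTER JOIN download ON download.plugin_id = plugin.id
    GROUP BY plugin.id
    ORDER BY dl_total DESC
"""


def plan(connection, sql):
    """Return the human-readable steps of SQLite's query plan for sql."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the textual description of the step.
    return [row[-1] for row in rows]


# Plan without an explicit index on the join column.
plan_before = plan(conn, QUERY)

# Define the index manually (hypothetical index name), then ask again.
conn.execute("CREATE INDEX download_plugin_id_idx ON download (plugin_id)")
plan_after = plan(conn, QUERY)

for step in plan_before:
    print("before:", step)
for step in plan_after:
    print("after: ", step)
```

In the Django project itself, print(pbd.query) shows the SQL that is actually sent to the database, and recent Django versions (2.1 and later) also offer pbd.explain(), so the same check can be run without the toolbar.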