I have a Django query and some Python code that I’m trying to optimize because 1) it’s ugly and it’s not as performant as some SQL I could use to write it, and 2) because the hierarchical regrouping of the data looks messy to me.
So,
1. Is it possible to improve this to be a single query?
2. How can I improve my Python code to be more Pythonic?
Background
This is for a photo gallery system. The particular view is attempting to display the thumbnails for all photos in a gallery. Each photo is statically sized several times to avoid dynamic resizing, and I would like to also retrieve the URLs and “Size Type” (e.g. Thumbnail, Medium, Large) of each sizing so that I can Lightbox the alternate sizes without hitting the database again.
Entities
I have 5 models that are of relevance:
class Gallery(models.Model):
Photos = models.ManyToManyField('Photo', through = 'GalleryPhoto', blank = True, null = True)
class GalleryPhoto(models.Model):
Gallery = models.ForeignKey('Gallery')
Photo = models.ForeignKey('Photo')
Order = models.PositiveIntegerField(default = 1)
class Photo(models.Model):
GUID = models.CharField(max_length = 32)
class PhotoSize(models.Model):
Photo = models.ForeignKey('Photo')
PhotoSizing = models.ForeignKey('PhotoSizing')
PhotoURL = models.CharField(max_length = 1000)
class PhotoSizing(models.Model):
SizeName = models.CharField(max_length = 20)
Width = models.IntegerField(default = 0, null = True, blank = True)
Height = models.IntegerField(default = 0, null = True, blank = True)
Type = models.CharField(max_length = 10, null = True, blank = True)
So, the rough idea is that I would like to get all Photos in a Gallery through GalleryPhoto, and for each Photo, I want to get all the PhotoSizes, and I would like to be able to loop through and access all this data through a dictionary.
A rough sketch of the SQL might look like this:
Select PhotoSize.PhotoURL
From PhotoSize
Inner Join Photo On Photo.id = PhotoSize.Photo_id
Inner Join GalleryPhoto On GalleryPhoto.Photo_id = Photo.id
Inner Join Gallery On Gallery.id = GalleryPhoto.Gallery_id
Where Gallery.id = 5
Order By GalleryPhoto.Order Asc
I would like to turn this into a list that has a schema like this:
(
photo: {
'guid': 'abcdefg',
'sizes': {
'Thumbnail': 'http://mysite/image1_thumb.jpg',
'Large': 'http://mysite/image1_full.jpg',
more sizes...
}
},
more photos...
)
I currently have the following Python code (it doesn’t exactly mimic the schema above, but it’ll do for an example).
gallery_photos = [(photo.Photo_id, photo.Order) for photo in GalleryPhoto.objects.filter(Gallery = gallery)]
photo_list = list(PhotoSize.objects.select_related('Photo', 'PhotoSizing').filter(Photo__id__in=[gallery_photo[0] for gallery_photo in gallery_photos]))
photos = {}
for photo in photo_list:
order = 1
for gallery_photo in gallery_photos:
if gallery_photo[0] == photo.Photo.id:
order = gallery_photo[1] //this gets the order column value
guid = photo.Photo.GUID
if not guid in photos:
photos[guid] = { 'Photo': photo.Photo, 'Thumbnail': None, 'Sizes': [], 'Order': order }
photos[guid]['Sizes'].append(photo)
sorted_photos = sorted(photos.values(), key=operator.itemgetter('Order'))
The Actual Question, Part 1
So, my question is first of all whether I can do my many-to-many query better so that I don’t have to do the double query for both gallery_photos and photo_list.
The Actual Question, Part 2
I look at this code and I’m not too thrilled with the way it looks. I sure hope there’s a better way to group up a hierarchical queryset result by a column name into a dictionary. Is there?
When you have sql query, that is hard to write using orm – you can use postgresql views. Not sure about mysql. In this case you will have:
Raw SQL like:
Django model like:
ORM Queryset like:
Hope it will help.