I’m trying to design a REST API over HTTP. I am totally new to this, so please tell me if any of my assumptions or ideas are just plain wrong.
The domain is minimalistic. I have a database of Products and for each Product there is an associated Image. As I see it, I can design my API in one of two ways:
- I can bundle each image with its product and represent them as one resource. The downside of this API would be that every time you PUT or GET a product, you have to send the image over the wire, even if you don’t need to read or change the image. As I understand it, it would not be RESTful to PUT or GET anything less than a complete representation of a resource. Also, client-side caching of images would be of no use in this scenario.
- I can model Products and Images as two different resources. When you GET a Product, it will contain an image_id which can be used to GET an Image. This model would require two HTTP requests: one to GET the Product and one to GET its corresponding Image. Not so bad maybe, but what if I want to display a list of all Products along with their Images? Then I suddenly have a bunch of HTTP requests. Over SSL, I guess this could create a performance issue. The upside, though, is that the consumer of my API could choose to cache images client-side.
So, how can I model my API to be both RESTful and efficient?
It’s good that you’re thinking about the data model.
Related to that, REST doesn’t specify or imply that the data model must be completely de-normalized.
Typically, when GETting a resource, you’d receive a packet of information that also includes URL references to other related resources, like a product image. It could also include a reference to a product category, a product manufacturer, and so on. Each of these might be a URL, or an ID from which you could derive a URL. A message like this:
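As a rough sketch (the field names and ID values below are assumptions made up for illustration, not part of any real API), such a message could be built like this:

```python
import json

# Hypothetical product representation: related resources are referenced
# by ID instead of being embedded in full.
product = {
    "id": 1234,
    "name": "Example product",
    "image_id": 5678,
    "category_id": 12,
    "manufacturer_id": 90,
}

# Serialize to the JSON body a GET on the product might return.
message = json.dumps(product, indent=2)
print(message)
```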
…might imply URLs like this:
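For instance, assuming a hypothetical base URL and a simple `/collection/id` path layout (both are illustrative choices, not something REST prescribes), a client could derive the related URLs from the IDs:

```python
BASE_URL = "https://api.example.com"  # hypothetical host


def related_urls(product: dict) -> dict:
    """Derive the URLs of a product and its linked resources from their IDs."""
    return {
        "self": f"{BASE_URL}/products/{product['id']}",
        "image": f"{BASE_URL}/images/{product['image_id']}",
        "category": f"{BASE_URL}/categories/{product['category_id']}",
        "manufacturer": f"{BASE_URL}/manufacturers/{product['manufacturer_id']}",
    }


# Made-up IDs, used only to show the shape of the result.
urls = related_urls(
    {"id": 1234, "image_id": 5678, "category_id": 12, "manufacturer_id": 90}
)
for name, url in urls.items():
    print(name, url)
```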
…and note that the full representation of the linked-to resources, like category, manufacturer and so on, is not transmitted with the original resource. This is a partially de-normalized data model.
In regard to your comments on PUT:
This is a matter of opinion, but… for many developers it’s completely acceptable to allow partial update via PUT. So you could update a resource without specifying everything; fields you leave out would remain unchanged. If you choose this behavior, it can complicate your (server-side) code when dealing with edge cases. For example, how does a client indicate that it wants to erase or delete a field? (Passing null may work, but for some data, null is a meaningful value.)
Why worry about PUT? If you want partial update, it’s easy to use POST, with a verb (e.g., “partialUpdate”) in the query params. Actually this is what Roy Fielding advocates, and it makes sense to me.
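To make the edge case concrete, here is a minimal sketch of server-side merge logic. The `DELETE` sentinel and the `apply_partial_update` helper are hypothetical names invented for this illustration; over the wire you would still need to agree on some convention for representing "delete this field".

```python
# Sentinel meaning "remove this field". It is distinct from None, so None
# can still be stored as a meaningful value.
DELETE = object()


def apply_partial_update(existing: dict, changes: dict) -> dict:
    """Return a copy of `existing` with `changes` merged in.

    Fields missing from `changes` stay unchanged; fields set to DELETE
    are removed; any other value (including None) overwrites the field.
    """
    updated = dict(existing)
    for key, value in changes.items():
        if value is DELETE:
            updated.pop(key, None)
        else:
            updated[key] = value
    return updated


stored = {"name": "Widget", "discount": None, "image_id": 5678}
result = apply_partial_update(stored, {"name": "Gadget", "image_id": DELETE})
print(result)
```

Here `discount` keeps its `None` value because it was not mentioned in the update, while `image_id` is removed explicitly.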
A partial update would then be something like this:
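As a hedged sketch, the request below is built with Python’s standard `urllib.request` but never sent. The host, the `/products/{id}` path, and the `action` query parameter name are assumptions chosen for illustration; only the “partialUpdate” verb comes from the text above.

```python
import json
import urllib.request


def build_partial_update(product_id: int, changes: dict) -> urllib.request.Request:
    """Build (but do not send) a POST that updates only the given fields."""
    # Hypothetical URL layout; the verb travels in the query string.
    url = f"https://api.example.com/products/{product_id}?action=partialUpdate"
    body = json.dumps(changes).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Only the name is sent; the server is expected to leave other fields alone.
request = build_partial_update(1234, {"name": "New product name"})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the result to `urllib.request.urlopen` would actually send it; that step is left out because the endpoint here is made up.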