So I’ve got this server application running on Google App Engine that I want to migrate to node.js.
My main concern is building an API similar to GAE’s powerful blobstore/image service combination to host and serve the pictures uploaded by users.
I’m really new to node.js: so far I’ve only read some documentation, watched a few videos and played with node a bit. I’d like some insight into the best solutions to:
- host the pictures
- serve them through an efficient cache system
At the moment, I’m leaning toward Redis to store and cache the pictures. I haven’t looked at all the node modules yet (there are quite a lot!).
What would be your architecture of choice for such an application?
Regarding the node.js infrastructure, you will probably want to use the Express framework to build the application. I don’t know if you also need image processing capabilities, but there is a node module that wraps the ImageMagick library. It looks like it is still under development though, so I’m not sure it is usable yet. In any case, you absolutely need to avoid running any image processing code in node’s single-threaded event loop: it is CPU-bound work, and every other request is stalled while it runs.
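One way to keep that work out of the event loop is to hand it to a separate process. Here is a minimal sketch, assuming the ImageMagick `convert` command-line tool is installed and on the PATH. It swaps the wrapper module mentioned above for a plain child process, and `buildResizeArgs` and `resizeImage` are hypothetical helper names, not part of any library.

```javascript
'use strict';
const { execFile } = require('child_process');

// Hypothetical helper: build the argument list for ImageMagick's `convert`,
// i.e. `convert <input> -resize <width>x<height> <output>`.
function buildResizeArgs(inputPath, outputPath, width, height) {
  return [inputPath, '-resize', width + 'x' + height, outputPath];
}

// Hypothetical helper: run the resize in a child process, so the CPU-heavy
// work happens outside the node event loop.
// Assumption: the `convert` binary is available on the PATH.
function resizeImage(inputPath, outputPath, width, height, callback) {
  const args = buildResizeArgs(inputPath, outputPath, width, height);
  execFile('convert', args, function (err) {
    if (err) {
      callback(err);
      return;
    }
    callback(null, outputPath);
  });
}
```

You would call it as `resizeImage('in.jpg', 'thumb.jpg', 200, 200, cb)`. Under load you would probably also want to limit how many of these processes run at once.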
For the upload part of your application, I would start by looking at node-formidable or this question.
Now for the storage itself: Redis is very efficient at storing many small objects, but it is not really designed to store large ones. Recent Redis versions are based on jemalloc, which is a good general-purpose memory allocator, but storing large objects in Redis will still generate some internal fragmentation. There is no memcached-like slab allocator in Redis.
So I would not store the images themselves in Redis, but only their metadata and the associated indexes (file path, owner, size, tags, etc.). The images would be better stored directly on the filesystem, IMO. There is an excellent node-redis module to access Redis from node.js.
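To make that split concrete, here is a rough sketch of what I have in mind: the image bytes go to a file, and only a small metadata record is kept for Redis. The content-hash directory layout and the names `imagePathFor`, `buildMetadata` and `storeImage` are my own assumptions for illustration. The Redis write itself is left out, since it depends on the client version you use.

```javascript
'use strict';
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

// Hypothetical layout: shard files by the first hex digits of their SHA-1
// hash so that no single directory grows too large.
function imagePathFor(rootDir, buffer, ext) {
  const hash = crypto.createHash('sha1').update(buffer).digest('hex');
  return path.join(rootDir, hash.slice(0, 2), hash.slice(2, 4), hash + ext);
}

// The small record that would go into Redis (for example as a hash),
// instead of the image content itself.
function buildMetadata(filePath, buffer, owner, tags) {
  return {
    path: filePath,
    owner: owner,
    size: buffer.length,
    tags: tags.join(',')
  };
}

// Write the image to disk asynchronously and resolve with its metadata.
// Saving the metadata with your Redis client is left to the caller.
async function storeImage(rootDir, buffer, ext, owner, tags) {
  const filePath = imagePathFor(rootDir, buffer, ext);
  await fs.promises.mkdir(path.dirname(filePath), { recursive: true });
  await fs.promises.writeFile(filePath, buffer);
  return buildMetadata(filePath, buffer, owner, tags);
}
```

Naming files by content hash also means the same picture uploaded twice lands on the same path. That is a side effect of this particular layout, not something you need.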
I’m not sure a cache is really required. If the images are stored on the filesystem, you can probably rely on the operating system’s filesystem cache to avoid most of the disk I/O. Node is quite good at delegating file operations to a thread pool (libeio originally, libuv in current versions) to avoid blocking the main event loop. I’m not really convinced that caching image content in Redis would bring any benefit here. I would try the filesystem cache first, and investigate more complex caching only if needed.
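As an illustration of serving straight from the filesystem, here is a minimal sketch using only node’s built-in `http` and `fs` modules (with Express you would more likely use its static file middleware). The names `resolveImagePath` and `createImageServer`, the content-type table and the `Cache-Control` value are assumptions of mine, not requirements.

```javascript
'use strict';
const fs = require('fs');
const http = require('http');
const path = require('path');

const CONTENT_TYPES = {
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.png': 'image/png',
  '.gif': 'image/gif'
};

// Map a URL path to a file under `root`. Returns null when the path is
// malformed or would escape the root directory (e.g. "/../etc/passwd").
function resolveImagePath(root, urlPath) {
  let decoded;
  try {
    decoded = decodeURIComponent(urlPath);
  } catch (err) {
    return null; // malformed percent-encoding
  }
  if (decoded.includes('\0')) {
    return null;
  }
  const base = path.resolve(root);
  const resolved = path.join(base, decoded);
  return resolved.startsWith(base + path.sep) ? resolved : null;
}

// Build an HTTP server that streams images from disk. The file content is
// never buffered by the application: reads go through the OS filesystem cache.
function createImageServer(root) {
  return http.createServer(function (req, res) {
    const filePath = resolveImagePath(root, req.url.split('?')[0]);
    const type = filePath && CONTENT_TYPES[path.extname(filePath).toLowerCase()];
    if (!type) {
      res.writeHead(404);
      res.end();
      return;
    }
    const stream = fs.createReadStream(filePath);
    stream.once('open', function () {
      res.writeHead(200, {
        'Content-Type': type,
        // Assumed policy: let browsers and proxies cache images for one day.
        'Cache-Control': 'public, max-age=86400'
      });
      stream.pipe(res);
    });
    stream.once('error', function () {
      if (!res.headersSent) {
        res.writeHead(404);
      }
      res.end();
    });
  });
}
```

You would start it with something like `createImageServer('./images').listen(3000)`. If this turns out to be too slow, that is the point where I would measure before adding another cache layer.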