Amazon Integration
I have my own CMS which has a file manager. A lot of the files and formats that people can create are stored locally in a database. These are simple things like CSS files, basic content etc.
The file manager can do everything that docs.google.com does. I actually based the entire methodology and design on the Google Docs browser.
Now, I am adding Amazon S3, so that my file manager will also display files uploaded to Amazon S3.
I have a few logistical questions.
All of my files and their hierarchical structure are stored in the assets and folders tables in my MySQL database. If I add Amazon S3, files will be uploaded to Amazon and I want to know how I should integrate them.
I can do one of two things.
1. Going to Amazon every time
Whenever the user browses a particular folder, my script can also go off to Amazon and do something like:
$s3->listObjects();
Then I can merge the results of my database query with the S3 listing. I could even cache the listing to reduce the performance hit.
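To make the merge step concrete, here is a minimal sketch. It assumes the database rows carry a `name` field and that the S3 listing is an array of entries with a `Key` field (as in the `Contents` part of a listing result). The function name `mergeListings` and the `source` field are hypothetical, and the choice to let the database entry win on a name clash is only one possible rule.

```php
<?php
// Hypothetical helper: combine rows from the local assets table with the
// objects returned by an S3 listing for the same folder.
function mergeListings(array $dbRows, array $s3Objects): array
{
    $merged = [];
    // Index the database rows by file name first.
    foreach ($dbRows as $row) {
        $merged[$row['name']] = ['name' => $row['name'], 'source' => 'database'];
    }
    // Add S3 objects; a name already known to the database keeps its database entry.
    foreach ($s3Objects as $object) {
        $name = basename($object['Key']);
        if (!isset($merged[$name])) {
            $merged[$name] = ['name' => $name, 'source' => 's3'];
        }
    }
    ksort($merged); // alphabetical order for display
    return array_values($merged);
}

// Example data (made up for illustration).
$dbRows = [['name' => 'style.css'], ['name' => 'about.html']];
$s3Objects = [['Key' => 'site/logo.png'], ['Key' => 'site/style.css']];
$listing = mergeListings($dbRows, $s3Objects);
```

With the sample data, `$listing` holds three entries, and only `logo.png` is marked as coming from S3.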
2. Going to my database locally every time.
Alternatively, since I am following this structure for uploads:
Client -> Server -> Amazon
every file passes through my server, where I need to process it anyway. This means that I can store a lot of the details in my database at upload time. There would be very little need to go to Amazon to list the structure because I can look it up locally.
What do you think is the best option?
I think the second option.
This has a few benefits.
Database Benefits
- I am not querying Amazon constantly. (This should be cheaper, as I think you pay for API usage per 1000 requests.)
- It will be faster
- I do not have to merge the structure
Database Cons
- I need to make sure that my database is always an exact copy of what is on Amazon. Could be difficult?
- I need to create a synchronise script. This shouldn’t be too hard?
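As a rough idea of what the core of such a synchronise script could look like, the sketch below compares two plain lists of keys: one read from the database and one read from the bucket. The function name `diffKeys` and the result labels are hypothetical, and how the two lists are fetched is left out.

```php
<?php
// Compare the keys recorded in the database with the keys present in the bucket.
function diffKeys(array $dbKeys, array $s3Keys): array
{
    return [
        // In the bucket but not recorded locally: insert rows for these.
        'missing_in_db' => array_values(array_diff($s3Keys, $dbKeys)),
        // Recorded locally but gone from the bucket: delete or flag these rows.
        'missing_in_s3' => array_values(array_diff($dbKeys, $s3Keys)),
    ];
}

// Example data (made up for illustration).
$report = diffKeys(
    ['site/style.css', 'site/old.png'],
    ['site/style.css', 'site/logo.png']
);
```

Here `$report['missing_in_db']` contains `site/logo.png` and `$report['missing_in_s3']` contains `site/old.png`. Run on a schedule, a report like this would show how far the two have drifted.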
I have a fair bit of experience using Amazon S3 for file storage for a website and you’ll definitely want to go the database route.
S3 is way too slow to query all the time, and as you mentioned you’ll have the additional costs (albeit small). Speed becomes more of an issue the more files you have stored in a bucket, as listObjects() returns at most 1000 objects per request. The performance issues are easy to see simply by using any of the S3 tools (e.g. Bucket Explorer, Cloudberry, or even Amazon’s own tools) to browse a bucket with lots of files.
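To illustrate why the 1000-per-request limit hurts, here is a small sketch of the paging loop a full listing needs. `$fetchPage` is a hypothetical stand-in for the real S3 call, and the fake bucket below just slices an in-memory array, so the numbers only show how the request count grows.

```php
<?php
// Collect every key by requesting pages until no continuation marker comes back.
// $fetchPage is a hypothetical stand-in for the real S3 listing call.
function listAllKeys(callable $fetchPage): array
{
    $keys = [];
    $requests = 0;
    $marker = null;
    do {
        $page = $fetchPage($marker);
        $requests++;
        $keys = array_merge($keys, $page['keys']);
        $marker = $page['next'];
    } while ($marker !== null);
    return ['keys' => $keys, 'requests' => $requests];
}

// Fake bucket with 2500 keys, served at most 1000 per page.
$allKeys = array_map(function ($i) {
    return sprintf('file-%04d.txt', $i);
}, range(1, 2500));

$fakeS3 = function (?int $offset) use ($allKeys): array {
    $offset = $offset ?? 0;
    $slice = array_slice($allKeys, $offset, 1000);
    $next = $offset + 1000 < count($allKeys) ? $offset + 1000 : null;
    return ['keys' => $slice, 'next' => $next];
};

$result = listAllKeys($fakeS3);
```

For the 2500 fake keys, `$result['requests']` is 3. Each of those would be a separate round trip to Amazon, which a local database lookup avoids.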
The extra effort required to ensure your database stays in sync with S3 is well worth it.