Currently, we are uploading all of our user-generated content to a medium-sized EC2 instance, and from there we run a cron job to sync all of the uploaded content to S3. We have some code that runs on the backend (every time you need to access any uploaded file) that checks whether the resource has been moved to S3 or is only available on our uploads instance.
This seems a little wasteful, but it does provide redundancy: if S3 is down, we have some JavaScript code in place that forces the files to be served from our upload box. The actual file uploads are stored in EBS, not on the instance.
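A minimal sketch of that lookup, to make the read path concrete. It is written in Python purely for illustration and is not the actual backend code; `resolve_url`, `in_s3`, and the base URLs are hypothetical names. The `in_s3` check is assumed to be something like a lookup in a table of synced keys, or a wrapper around an S3 HEAD request.

```python
def resolve_url(key, in_s3, s3_base, uploads_base):
    """Return the URL an uploaded file should be served from.

    `in_s3` is a hypothetical callable that reports whether the cron
    job has already synced `key` to the bucket.
    """
    try:
        if in_s3(key):
            return f"{s3_base}/{key}"
    except Exception:
        # If the check itself fails (for example, S3 is unreachable),
        # fall through and serve from the uploads instance.
        pass
    return f"{uploads_base}/{key}"


if __name__ == "__main__":
    synced = {"avatars/1.jpg"}
    print(resolve_url("avatars/1.jpg", synced.__contains__,
                      "https://bucket.example.com",
                      "https://uploads.example.com"))
```

Note that every file access pays for this check, which is part of why the setup feels wasteful.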
We’ve got about 150 GB worth of files in the S3 bucket right now, which makes performing a separate backup of the bucket extremely time-consuming and nearly impossible to run on any sort of regular basis.
So, my question is: is this even necessary? Can anyone point me to some uptime statistics comparing S3 and EC2? Does it ever happen that S3 is down but EC2 is available? It seems like it might be simpler to just upload everything directly to S3 and trust that it is up. On the other hand, we could just store everything in EBS and forget S3 completely, which seems like it makes more sense.
It’s much more likely that your EC2 instance will be down than S3 will be down. For one, you have a single instance running on a single host with a single network connection in a single availability zone. Past that, on a platform level, EC2 (particularly involving EBS) has had several protracted outages, whereas S3 has not had a significant availability event since 2008.
S3 is a distributed system spread all across your region of choice. Operating at the object level with eventual consistency guarantees is frankly a lot simpler than the problems addressed by EBS and EC2, both of which add consistency guarantees (and thus ways to fail) by design.
I generally make upload processes treat S3 as a backing store (upload to S3 directly, or upload via an EC2 instance in a write-through fashion) and accept that if S3 is down, I can’t handle uploads. This introduces a failure mode where your app is running but S3 is not; in exchange, it significantly reduces the potential for data loss, which is usually a more serious problem than unavailability. It also lets you handle uploads simultaneously via different EC2 instances in different availability zones, hedging against EC2 failures, as well as via instance-store instances, hedging against EBS failures.
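As a rough sketch of the write-through variant (Python chosen for illustration; `handle_upload` and `UploadUnavailable` are hypothetical names): the upload is acknowledged only after S3 has accepted the object, so no instance ever holds the only copy. The `s3_client` argument is assumed to expose `put_object(Bucket=..., Key=..., Body=...)`, which is the shape of boto3's S3 client; any object with that method works.

```python
class UploadUnavailable(Exception):
    """Raised when the backing store cannot accept the upload."""


def handle_upload(s3_client, bucket, key, data):
    """Write-through upload: succeed only if S3 stores the object.

    `s3_client` is assumed to provide put_object(Bucket, Key, Body),
    as boto3's S3 client does.
    """
    try:
        s3_client.put_object(Bucket=bucket, Key=key, Body=data)
    except Exception as exc:
        # S3 is the source of truth, so reject the upload instead of
        # keeping the only copy on a single instance or EBS volume.
        raise UploadUnavailable(f"could not store {key!r}") from exc
    return key
```

Because the handler keeps no local state, the same code can run on several instances in different availability zones, and the caller can turn `UploadUnavailable` into a "try again later" response.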