I can appreciate that seeing “basic auth” and “safe enough” in the same sentence is a lot like reading “Is parachuting without a parachute still safe?”, so I’ll do my best to clarify what I am getting at.
From what I’ve seen online, people typically describe basic HTTP auth as being unsecured due to the credentials being passed in plain text from the client to the server; this leaves you open to having your credentials sniffed by a nefarious person or man-in-the-middle in a network configuration where your traffic may be passing through an untrusted point of access (e.g. an open AP at a coffee shop).
To keep the conversation between you and the server secure, the solution is to typically use an SSL-based connection, where your credentials might be sent in plain text, but the communication channel between you and the server is itself secured.
So, onto my question…
In the situation of replicating one CouchDB instance from an EC2 instance in one region (e.g. us-west) to another CouchDB instance in another region (e.g. singapore) the network traffic will be traveling across a path of what I would consider “trusted” backbone servers.
Given that (assuming I am not replicating highly sensitive data) would anyone/everyone consider basic HTTP auth for CouchDB replication sufficiently secure?
If not, please clarify what scenarios I am missing here that would make this setup unacceptable. I do understand for sensitive data this is not appropriate, I just want to better understand the ins and outs for non-sensitive data replicated over a relatively-trusted network.
Bob is right, it is better to err on the side of caution, but I disagree. Bob could be right in this case (see details below), but the problem with his general approach is that it ignores the cost of paranoia. It leaves “peace dividend” money on the table. I prefer Bruce Schneier’s assessment that it is a trade-off.
Short answer
Start replicating now! Do not worry about HTTPS.
The greatest risk is not wire sniffing, but your own human error, followed by software bugs, which could destroy or corrupt your data. Make a replica!. If you will replicate regularly, plan to move to HTTPS or something equivalent (SSH tunnel, stunnel, VPN).
Rationale
Is HTTPS is easy with CouchDB 1.1? It is as easy as HTTPS can possibly be, or in other words, no, it is not easy.
You have to make an SSL key pair, purchase a certificate or run your own certificate authority—you’re not foolish enough to self-sign, of course! The user’s hashed password is plainly visible from your remote couch! To protect against cracking, will you implement bi-directional SSL authentication? Does CouchDB support that? Maybe you need a VPN instead? What about the security of your key files? Don’t check them into Subversion! And don’t bundle them into your EC2 AMI! That defeats the purpose. You have to keep them separate and safe. When you deploy or restore from backup, copy them manually. Also, password-protect them so if somebody gets the files, they can’t steal (or worse, modify!) your data. When you start CouchDB or replicate, you must manually input the password before replication will work.
In a nutshell, every security decision has a cost.
A similar question is, “should I lock my house at night? It depends. Your profile says you are in Tuscon, so you know that some neighborhoods are safe, while others are not. Yes, it is always safer to always lock all of your doors all of the time. But what is the cost to your time and mental health? The analogy breaks down a bit because time invested in worst-case security preparedness is much greater than twisting a bolt lock.
Amazon EC2 is a moderately safe neighborhood. The major risks are opportunistic, broad-spectrum scans for common errors. Basically, organized crime is scanning for common SSH accounts and web apps like WordPress, so they can a credit card or other database.
You are a small fish in a gigantic ocean. Nobody cares about you specifically. Unless you are specifically targeted by a government or organized crime, or somebody with resources and motivation (hey, it’s CouchDB—that happens!), then it’s inefficient to worry about the boogeyman. Your adversaries are casting broad nets to get the biggest catch. Nobody is trying to spear-fish you.
I look at it like high-school integral calculus: measuring the area under the curve. Time goes to the right (x-axis). Risky behavior goes up (y-axis). When you do something risky you saved time and effort, but the the graph spikes upward. When you do something the safe way, it costs time and effort, but the graph moves down. Your goal is to minimize the long-term area under the curve, but each decision is case-by-case. Every day, most Americans ride in automobiles: the single most risky behavior in American life. We intuitively understand the risk-benefit trade-off. Activity on the Internet is the same.