I’ve been using fabric and boto to start up new ec2 hosts for some temporary processing but I’ve always had trouble knowing when I can connect to the host. The problem is that I can ask ec2 when something is ready but it’s never really ready.
This is the process that I’ve noticed works best (though it still sucks):
- Poll ec2 until it says that the host is “active”
- Poll ec2 until it has a public_dns_name
- Try to connect to the new host in a loop until it accepts the connection
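For reference, the first two steps look roughly like the sketch below. It assumes a boto-style instance object with an `update()` method and `state` and `public_dns_name` attributes. The function name, the timeouts, and the check for the state string `"running"` are my own choices for illustration, so adjust them to whatever your setup reports.

```python
import time


def wait_for_public_dns(instance, poll_interval=5.0, max_wait=300.0):
    """Poll until the instance is running and has a public DNS name.

    `instance` is assumed to behave like a boto instance object:
    update() refreshes its attributes from EC2.
    Returns the DNS name, or None if max_wait seconds pass first.
    """
    deadline = time.monotonic() + max_wait
    while True:
        instance.update()  # refresh state and addresses
        if instance.state == "running" and instance.public_dns_name:
            return instance.public_dns_name
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)
```

Even when this returns a name, the host may still refuse connections, which is why the third step is needed.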
But sometimes it accepts the connection seemingly before it knows about the ssh key pair that I’ve associated it with and then asks for a password.
Is there a better way to decide when I can start connecting to my ec2 hosts after they’ve started up? Has anyone written a library that does this nicely and efficiently?
I do the same for #1 and #2, but for #3 I have a code loop that attempts to make a simple TCP connection to the ssh port (22) with short timeouts and retries. When it finally succeeds, it waits five more seconds and then runs the ssh command.
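A minimal sketch of that loop in Python, using only the standard library. The function name and the default timeout values are illustrative, not taken from my actual script. The five-second pause after the first successful connection is the settle delay mentioned above.

```python
import socket
import time


def wait_for_ssh_port(host, port=22, connect_timeout=2.0, retry_delay=1.0,
                      max_wait=300.0, settle_delay=5.0):
    """Retry a plain TCP connection to host:port until it succeeds.

    Returns True after the port accepts a connection and settle_delay
    seconds have passed, or False if max_wait seconds elapse first.
    """
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        try:
            sock = socket.create_connection((host, port),
                                            timeout=connect_timeout)
        except OSError:
            # Refused, unreachable, or timed out: sshd is not up yet.
            time.sleep(retry_delay)
            continue
        sock.close()
        # Give the instance a moment to finish its key setup.
        time.sleep(settle_delay)
        return True
    return False
```

Call it with the instance's address, for example `wait_for_ssh_port(address)`, and only run the ssh command if it returns `True`. Note that this only proves the port is open. It does not prove the key is installed, so the settle delay is a heuristic.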
The timing and order in which sshd is started and the public ssh key is added to .ssh/authorized_keys may vary depending on the AMI you are running.
Note: I mildly recommend using the public IP address directly instead of the DNS name. The IP address is encoded in the DNS name, so there’s no benefit to adding DNS lookups into the process.
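To illustrate the encoding: a public name such as `ec2-203-0-113-25.compute-1.amazonaws.com` carries the address 203.0.113.25 in its first label. The helper below is a hypothetical sketch that extracts it, assuming the name follows that `ec2-a-b-c-d.` pattern. If your client library already exposes the public IP on the instance object, use that instead of parsing.

```python
import re

# Matches the leading "ec2-a-b-c-d." label of an EC2 public DNS name.
_EC2_DNS_RE = re.compile(r"^ec2-(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})\.")


def ip_from_ec2_dns_name(public_dns_name):
    """Return the dotted IPv4 address encoded in the name, or None."""
    match = _EC2_DNS_RE.match(public_dns_name)
    if match is None:
        return None
    return ".".join(match.groups())
```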