Is it possible to provide a service to multiple clients whereby if the server

Question


Asked: June 6, 2026 (2026-06-06T15:45:09+00:00)


Is it possible to provide a service to multiple clients such that, if the server providing the service goes down, another one takes its place, without some sort of centralised “control” that detects whether the main server has gone down and redirects the clients to the new server?

Is it possible to do without having a centralised interface/gateway?

In other words, it’s a bit like asking: can you design a node balancer without a centralised control to direct clients?


1 Answer

Editorial Team · Answer 1 · 2026-06-06T15:45:10+00:00

This answer is a general overview of high availability for networked applications, not specific to Erlang. I don’t know much about what is available in the OTP framework yet because I am new to the language.

There are a few different problems here:

1. Client connection must be moved to the backup machine
2. The session may contain state data
3. How to detect a crash

Problem 1 – Moving client connection
This may be solved in many different ways and on different layers of the network architecture. The easiest thing is to code it right into the client, so that when a connection is lost it reconnects to another machine.
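
A minimal sketch of the client-side approach, written in Python purely for illustration. The function name `connect_with_failover` and the `connect` callable are hypothetical; in a real client `connect` would open a socket to the given address.

```python
from typing import Callable, Sequence


def connect_with_failover(endpoints: Sequence[str], connect: Callable[[str], object]):
    """Try each endpoint in order; return (endpoint, connection) for the first that works.

    `connect` is a caller-supplied function (an assumption of this sketch) that
    opens a connection or raises OSError, as socket operations do on failure.
    """
    last_error = None
    for endpoint in endpoints:
        try:
            return endpoint, connect(endpoint)
        except OSError as exc:  # refused connections and timeouts are OSErrors
            last_error = exc
    # Every endpoint failed, so there is nothing left to fall back to.
    raise ConnectionError("no endpoint reachable") from last_error
```

The client only needs a list of known machines, so no central component has to redirect it. The cost is that every client must carry that list and the retry logic.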

If you need network transparency, you may use some technology to sync TCP state between different machines and then reroute all traffic to the new machine, which can be entirely invisible to the client. This is much harder to do than the first suggestion.

I’m sure there are plenty of options in between these two.

Problem 2 – State data
You obviously need to transfer the session state from the crashed machine onto the backup machine. This is really hard to do reliably, and you may lose the last few transactions because the crashed machine may not be able to send its latest state before the crash. To be really sure about not losing state, you can use a synchronous call like this:

1. Transaction/message comes from the client into the main machine.
2. Main machine updates some state.
3. New state is sent to backup machine.
4. Backup machine confirms arrival of the new state.
5. Main machine confirms success to the client.
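
The five steps above can be sketched with two in-memory objects standing in for the machines. This is a simplified Python illustration, not a real replication protocol: the class names, the `reachable` flag, and the rollback on failure are all assumptions made for the example.

```python
class Backup:
    """Stands in for the backup machine; stores a copy of the state."""

    def __init__(self):
        self.state = {}
        self.reachable = True  # set to False to simulate a lost link

    def replicate(self, key, value):
        if not self.reachable:
            raise ConnectionError("backup unreachable")
        self.state[key] = value
        return True  # step 4: confirm arrival of the new state


class Primary:
    """Stands in for the main machine."""

    def __init__(self, backup):
        self.state = {}
        self.backup = backup

    def handle(self, key, value):
        """Step 1: a transaction arrives from the client."""
        had_key = key in self.state
        previous = self.state.get(key)
        self.state[key] = value  # step 2: update local state
        try:
            self.backup.replicate(key, value)  # steps 3 and 4: send, wait for ack
        except ConnectionError:
            # Assumption of this sketch: undo the local change so the two
            # copies stay identical, and tell the client it did not succeed.
            if had_key:
                self.state[key] = previous
            else:
                del self.state[key]
            return "error"
        return "ok"  # step 5: confirm success only after the backup has it
```

The client hears "ok" only once both copies agree, which is exactly why the main machine's response time now includes the round trip to the backup.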

This can be expensive (or at least not responsive enough) in some scenarios, since you depend on the backup machine and the connection to it, including its latency, before confirming anything to the client. To make it perform better, you can let the client ask the backup machine, upon connecting, which transactions it has received and then resend the lost ones, making it the client’s responsibility to queue the work.
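
One way to sketch the client-side resend is with sequence numbers, so the backup can say how far it got and ignore duplicates. This is a Python illustration under that assumption; the class and method names are hypothetical.

```python
class BackupServer:
    """Stands in for the backup machine; remembers the highest sequence number seen."""

    def __init__(self):
        self.applied = []  # (seq, payload) pairs in the order they were applied
        self.last_seq = 0

    def last_received(self):
        return self.last_seq

    def submit(self, seq, payload):
        if seq <= self.last_seq:
            return  # already have it, so a resend is harmless
        self.applied.append((seq, payload))
        self.last_seq = seq


class Client:
    """Keeps every sent transaction queued until it is known to be safe."""

    def __init__(self):
        self.next_seq = 1
        self.pending = []

    def record_sent(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.pending.append((seq, payload))
        return seq

    def recover_on(self, backup):
        """After failover: ask what the backup has, resend everything newer."""
        last = backup.last_received()
        resent = []
        for seq, payload in self.pending:
            if seq > last:
                backup.submit(seq, payload)
                resent.append(seq)
        self.pending = []
        return resent
```

For example, if the client sent transactions 1 to 3 and the main machine crashed after forwarding only 1 and 2, `recover_on` resends just transaction 3.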

Problem 3 – Detecting a crash
This is an interesting problem because a crash is not always well-defined. Did something really crash? Consider a network problem that closes the connection between the client and the server while both are still up and connected to the network. Or worse, one that makes the client disconnect from the server without the server noticing. Here are some questions to think about:

Should the client connect to the backup machine?
What if the main server updates some state and sends it to the backup machine while the backup has the real client connected – will there be a data race?
Can both the main and backup machine be up at the same time or do you need to shut down work on one of them and move all sessions?
Do you need some sort of authority on this matter, some protocol to decide which one is master and which one is slave? Who is that authority? How do you decentralise it?
What if your nodes lose the connection between them but both continue to work as expected (called a network partition)?
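
A common building block here is a heartbeat with a timeout. The sketch below is a Python illustration with hypothetical names; the injectable `clock` parameter is an assumption added so the behaviour can be shown without waiting in real time. Note that it can only ever say a node is suspected, never that it has crashed.

```python
import time


class HeartbeatDetector:
    """Suspects a node when no heartbeat has arrived within `timeout` seconds."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_seen = {}  # node name -> time of its latest heartbeat

    def heartbeat(self, node):
        self.last_seen[node] = self.clock()

    def suspected(self, node):
        seen = self.last_seen.get(node)
        if seen is None:
            return True  # never heard from this node at all
        return self.clock() - seen > self.timeout
```

A silent node looks the same whether it crashed, is merely slow, or sits on the far side of a broken link, so the questions above still have to be answered by whatever acts on the suspicion.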
