This is a very general question. I am a bit confused by the term “state”. I would like to know what people mean by the “state of an application”. Why do they call a webserver “stateless” and a database “stateful”?
How is the state of an application (in a VM) transferred when the VM’s memory is moved from one machine to another during live migration?
Is transferring the memory, caches and register values of a system enough to transfer the state of the running application?
You’ve definitely asked a mouthful — it’s unfortunate that the word state is used in so many different contexts, but each one is a valid use of the word.
State of an application
An application’s state is roughly the entire contents of its memory. This can be a difficult concept to get behind until you’ve seen something like Erlang’s server loops, which explicitly pass all the state of the application in a variable from one invocation of the function to the next. In more “normal” programming languages, the “state” of the program is all its global variables, static variables, objects allocated on the heap, objects allocated on the stack, registers, open file descriptors and file offsets, open network sockets and associated kernel buffers, and so forth.
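To make the server-loop idea concrete, here is a minimal sketch in Python (not Erlang) of a loop that hands its entire state from one invocation to the next. The message format and the names `handle` and `server_loop` are made up for illustration; a real Erlang process would receive messages from a mailbox rather than from a list.

```python
def handle(state, message):
    """Return the new state after processing one message."""
    kind, amount = message
    if kind == "add":
        return state + amount
    if kind == "reset":
        return 0
    return state  # unknown messages leave the state unchanged


def server_loop(state, inbox):
    """Process the inbox one message at a time, passing the state along explicitly."""
    if not inbox:
        return state
    first, rest = inbox[0], inbox[1:]
    # The only "memory" the next invocation has is the state argument.
    return server_loop(handle(state, first), rest)


final_state = server_loop(0, [("add", 5), ("add", 2), ("reset", 0), ("add", 3)])
print(final_state)  # 3
```

Nothing is hidden in globals or objects here, so the value of `state` at any point really is the whole state of this tiny "application".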
You can actually save that state and resume execution of the process elsewhere. The BLCR checkpoint tools for Linux do exactly this. (Though it is an extremely uncommon task to perform.)
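As a rough application-level analogy (this is not how BLCR works; it checkpoints a whole process, including things the program never sees, such as registers and file descriptors), a program that keeps all of its state in one place can save it, stop, and carry on from the saved copy. The job structure and the names `step` and `done` below are hypothetical.

```python
import json


def step(state):
    """Advance the job by one item, mutating the state in place."""
    state["total"] += state["items"][state["position"]]
    state["position"] += 1


def done(state):
    return state["position"] >= len(state["items"])


# Start a job and run it part of the way.
job = {"items": [1, 2, 3, 4], "position": 0, "total": 0}
step(job)
step(job)

# "Checkpoint": serialize everything the job needs in order to continue.
checkpoint = json.dumps(job)

# "Resume elsewhere": rebuild the state from the checkpoint and finish the work.
resumed = json.loads(checkpoint)
while not done(resumed):
    step(resumed)

print(resumed["total"])  # 10
```

This only works because the program's state is small and explicit. A general-purpose checkpoint tool has to capture the much larger list from the previous paragraph.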
State of a protocol
The state of a protocol is a different sort of meaning. HTTP is stateless in the sense that the protocol carries no memory from one request to the next: every request a browser sends to a webserver essentially starts over from scratch. To “fake” some amount of a “session” for the user’s sake, the browser re-transmits its cookies with every request, and the server can set or update them in its responses. The server doesn’t have to hold any resources open for a given client across requests.
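The cookie trick can be sketched without any real networking. In the toy model below, requests and responses are plain dictionaries, and the names `handle_request` and `browser_round_trip` and the `visits` cookie are all invented for illustration. The point is only that the handler remembers nothing between calls; the client carries the "session" back and forth.

```python
def handle_request(request):
    """Build a response using only what arrives in this one request."""
    cookies = request.get("cookies", {})
    visits = int(cookies.get("visits", "0")) + 1
    return {
        "body": f"visit number {visits}",
        "set_cookies": {"visits": str(visits)},
    }


def browser_round_trip(cookie_jar, path):
    """Send every stored cookie with the request, then store what the server sets."""
    response = handle_request({"path": path, "cookies": dict(cookie_jar)})
    cookie_jar.update(response["set_cookies"])
    return response["body"]


jar = {}
first = browser_round_trip(jar, "/")
second = browser_round_trip(jar, "/")
print(first)   # visit number 1
print(second)  # visit number 2
```

If the browser threw its cookie jar away, the server would have no way to tell the second visit from the first.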
Networked filesystems might also be stateless (earlier versions of NFS, through NFSv3) or stateful (NFSv4 and later). The earlier versions assumed every individual read, write, or metadata request would be committed as it arrived, and every time a specific byte was needed from a file, it would be re-requested. This allowed the servers to be very simple: they would do what the client requests told them to do, and no effort was required to bring servers and clients back to consistency if a server rebooted or routers disappeared. However, this was bad for performance, since every client requested the same static data hundreds or thousands of times each day. So newer versions of NFS allowed some amount of data caching on the clients and persistent file handles between servers and clients. The servers had to keep track of the state of the clients that were connected, and vice versa: the clients also had to know what promises they had made to the servers.
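The trade-off shows up even in a toy file server. The sketch below is not the NFS protocol; `FILES`, `stateless_read`, and `StatefulServer` are invented names, and the "reboot" is just a fresh object. It contrasts a request that names everything it needs with one that relies on a handle the server has to remember.

```python
FILES = {"motd": b"hello, world"}


def stateless_read(name, offset, length):
    """Every request carries the file name, offset, and length; nothing is remembered."""
    return FILES[name][offset:offset + length]


class StatefulServer:
    """Remembers open handles and the current offset for each one."""

    def __init__(self):
        self.open_handles = {}
        self.next_handle = 1

    def open(self, name):
        handle = self.next_handle
        self.next_handle += 1
        self.open_handles[handle] = {"name": name, "offset": 0}
        return handle

    def read(self, handle, length):
        entry = self.open_handles[handle]  # KeyError if the server forgot the handle
        start = entry["offset"]
        data = FILES[entry["name"]][start:start + length]
        entry["offset"] += len(data)
        return data


server = StatefulServer()
h = server.open("motd")
part_one = server.read(h, 5)
part_two = server.read(h, 7)

# Simulate a reboot: the new server instance has lost all per-client state.
server = StatefulServer()
try:
    server.read(h, 5)
    handle_survived = True
except KeyError:
    handle_survived = False

# The stateless request works no matter how many times the server restarted.
after_reboot = stateless_read("motd", 7, 5)
print(part_one, part_two, handle_survived, after_reboot)
```

The stateful server saves the client from repeating itself on every call, but after a restart the two sides disagree until something re-establishes the handle, which is the consistency work the stateless design avoided.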
A stateful firewall keeps track of active TCP sessions. It knows which sessions the system administrators want to allow through, so it looks specifically for the initial packets of those sessions. Once a session is set up, it tracks the established connection as an entity in its own right. (This was a real advancement over earlier stateless firewalls, which considered packets in isolation. Their rulesets had to be much more permissive to achieve the same level of functionality, and so allowed through far too many malicious packets that pretended a session was already active.)
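A heavily simplified model shows the difference. Packets here are dictionaries with made-up fields, the allowed ports are arbitrary, and the stateful version skips everything a real firewall would also track (the full handshake, both directions, timeouts, connection teardown). The stateless rule "let through anything with ACK set" stands in for the permissive rulesets described above.

```python
ALLOWED_PORTS = {22, 443}  # assumption: the services the administrators want reachable


def make_packet(src, src_port, dst_port, flags):
    return {"src": src, "src_port": src_port, "dst_port": dst_port,
            "flags": frozenset(flags)}


def stateless_allows(packet):
    """Judge each packet in isolation."""
    flags = packet["flags"]
    if flags == {"SYN"}:
        return packet["dst_port"] in ALLOWED_PORTS
    # Anything claiming to belong to an existing session is waved through.
    return "ACK" in flags


class StatefulFirewall:
    """Remember which connections were actually opened through the firewall."""

    def __init__(self):
        self.established = set()

    def allows(self, packet):
        key = (packet["src"], packet["src_port"], packet["dst_port"])
        if packet["flags"] == {"SYN"}:
            if packet["dst_port"] in ALLOWED_PORTS:
                self.established.add(key)
                return True
            return False
        # Non-SYN packets pass only if their connection is already known.
        return key in self.established


syn = make_packet("10.0.0.5", 40000, 443, {"SYN"})
ack = make_packet("10.0.0.5", 40000, 443, {"ACK"})
forged = make_packet("203.0.113.9", 50000, 8080, {"ACK"})

firewall = StatefulFirewall()
print(firewall.allows(syn), firewall.allows(ack), firewall.allows(forged))
print(stateless_allows(forged))
```

The forged packet never opened a session, so the stateful firewall has no record of it and drops it, while the stateless rule has no way to tell it apart from legitimate traffic.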