I want to develop a text protocol based on XML and transmitted via TCP/IP sockets.
Let’s say I have a simple request/response mechanism to be sent over a persistent
TCP/IP connection between client and server like this:
<?xml version="1.0" encoding="UTF-8"?>
<request id="1" command="get.answer">
<value type="string">Answer to the Ultimate Question of Life, the Universe, and Everything</value>
</request>
<?xml version="1.0" encoding="UTF-8"?>
<response id="1" command="get.answer">
<value type="int32">42</value>
</response>
When should each side start processing the incoming data? In other words,
how does the server know that the incoming client data has been fully transferred
and can be processed to create a response?
Of course I did some research on this topic:
I found this answer which points in the right direction based on an HTTP example:
So using a kind of ‘Transfer Protocol’ on top of the XML messages would certainly help.
But I also looked at the purely XML-based XMPP protocol, which doesn’t use any
‘Transfer Protocol’ like HTTP, at least as far as I have seen.
RFC 6120, section 2.4 “Structured Data”, reads:
The basic protocol data unit in XMPP is not an XML stream (which
simply provides the transport for point-to-point communication) but
an XML “stanza”, which is essentially a fragment of XML that is sent
over a stream. The root element of a stanza includes routing
attributes (such as “from” and “to” addresses), and the child
elements of the stanza contain a payload for delivery to the intended
recipient.
So they basically send small XML chunks over TCP/IP without a ‘Transfer Protocol’, and from
my Wireshark traces I can see that there is also no special End-Of-Transmission marker
at the end of each XML stanza, such as a double \r\n or something like that.
So how do they know about the end of a message (stanza)?
Actually, XMPP does use an XML stream to transfer data. The data unit you are referring to, the stanza, is the exchange of an individual message, but all stanzas are contained within an XML stream that defines the start and end of the communication for an XMPP session.
Closing that stream is where the End Of Transmission occurs, as in the end of all transmission. Within the stream, there are three defined stanza types (IQ, Message and Presence) whose opening and closing tags mark the start and end of individual messages (for client-to-server communication). Each stanza is a direct child of the stream's root element, so the receiver knows a stanza is complete as soon as its parser sees the closing tag at that depth.
Although the basic case runs over a TCP connection, there are extensions that support other transports as well, such as HTTP, which is useful for getting XMPP through a firewall.
If you want to do something similar, you can follow the same approach: open your XML stream when the connection is established and close it when the connection is dropped. Then you only need to define the individual message types, so your endpoints know what constitutes a complete message.
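As a rough illustration of that approach, here is a minimal Python sketch that feeds raw socket data into an incremental parser and reports every element that closes directly below the stream root. The class name `StanzaReader` and the `<stream>` root element used in the usage note are made up for this example; this is not how any particular XMPP library does it.

```python
import xml.etree.ElementTree as ET


class StanzaReader:
    """Splits one long-lived XML stream into its top-level child elements."""

    def __init__(self):
        self._parser = ET.XMLPullParser(events=("start", "end"))
        self._depth = 0  # 0 = outside the root, 1 = inside the root, ...

    def feed(self, chunk):
        """Feed bytes as they arrive; return the messages completed so far."""
        self._parser.feed(chunk)
        # Newer Python/Expat versions may defer parsing of buffered data;
        # flush(), where it exists, asks the parser to process what it has.
        flush = getattr(self._parser, "flush", None)
        if flush is not None:
            flush()
        completed = []
        for event, elem in self._parser.read_events():
            if event == "start":
                self._depth += 1
            else:
                self._depth -= 1
                if self._depth == 1:
                    # A direct child of the root just closed: one full message.
                    completed.append(elem)
        return completed
```

With this, the server would send `<stream>` once after connecting and then call `reader.feed(sock.recv(4096))` in a loop, handling each returned element as one request or response, regardless of how TCP happened to split the bytes. Note that an XML declaration is only allowed at the very start of a document, so the per-message `<?xml ...?>` lines from the question would have to go: at most one declaration at the start of the stream. A long-running connection would also want to detach processed elements from the root so they do not accumulate in memory.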
Or you could just use XMPP, which seems to fit your use case perfectly.
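For comparison, the ‘Transfer Protocol’ option mentioned in the question does not have to be as heavy as HTTP. A minimal sketch, assuming a made-up wire format in which every XML document is preceded by a 4-byte big-endian length (similar in spirit to HTTP's Content-Length header); the names `frame` and `FrameDecoder` are hypothetical:

```python
import struct

# Assumed header format: payload length as unsigned 32-bit, network byte order.
HEADER = struct.Struct("!I")


def frame(payload):
    """Prefix one complete XML document (as bytes) with its length."""
    return HEADER.pack(len(payload)) + payload


class FrameDecoder:
    """Reassembles length-prefixed messages from arbitrary TCP chunks."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk):
        """Add received bytes; return every message that is now complete."""
        self._buf += chunk
        messages = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # body not fully received yet
            messages.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return messages
```

Here each message is a standalone XML document, so keeping the `<?xml ...?>` declaration on every request and response is fine, and the receiver can hand each returned byte string to an ordinary, non-streaming XML parser. A real protocol would also want an upper bound on the accepted length.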