I’ve been working for a while now on a system that transmits XML over sockets, and I have never really understood what the real advantage of XML over sockets is compared with a custom protocol.
But I do see a lot of developers (especially those who started out as web developers) setting up this sort of implementation (XML over sockets).
I do understand that it is more “human-readable” (that’s what I keep hearing).
But:
- XML carries an awful lot of characters, leading to huge messages when the content is in fact really small and simple.
- Message size varies, so you need to guarantee that you terminate each message with a specific character or string pattern.
- There is more overhead when parsing XML.
For all these reasons I remain skeptical about choosing XML over sockets for new systems when I could build them on a custom protocol with fixed-size messages, avoiding both the huge messages on the wire and the performance hit of parsing XML on the client side.
Am I wrong to think this way? What’s “best” in terms of system architecture?
Regards
Design decisions are all about trade-offs. You have enumerated what XML gives you: readability and self-description. It also comes with a schema language (XSD), is extremely portable, and so on.
But these advantages come with the disadvantages you mention. So let’s tackle them one by one:
Verbosity
Yes, XML is verbose, being both self-descriptive and text-based. This is only really a problem if bandwidth or performance is a concern. Is it? What about the positive trade-offs?
Note that a reasonable alternative here is JSON, which is just as readable but typically much more compact.
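To put rough numbers on the verbosity point, here is a minimal sketch that encodes the same small message three ways and compares the byte counts. The message (a sensor reading with an `id` and a `temp`), the element names and the binary layout are all made up for illustration; your own messages will give different ratios.

```python
import json
import struct
import xml.etree.ElementTree as ET

# Hypothetical message, invented for this comparison.
reading = {"id": 42, "temp": 21.5}

def to_xml(msg):
    # One element per field, wrapped in a <reading> root.
    root = ET.Element("reading")
    ET.SubElement(root, "id").text = str(msg["id"])
    ET.SubElement(root, "temp").text = str(msg["temp"])
    return ET.tostring(root)  # returns bytes

def to_json(msg):
    # Compact separators so no bytes are spent on whitespace.
    return json.dumps(msg, separators=(",", ":")).encode("utf-8")

def to_binary(msg):
    # Fixed-size layout: unsigned 32-bit int + 64-bit float,
    # network byte order, no padding.
    return struct.pack("!Id", msg["id"], msg["temp"])

encoders = [("xml", to_xml), ("json", to_json), ("binary", to_binary)]
sizes = {name: len(encode(reading)) for name, encode in encoders}
print(sizes)
```

The fixed-size binary form is the smallest, but it is also the only one of the three that a reader cannot make sense of without knowing the layout in advance, which is exactly the trade-off being discussed.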
Varying Size
Yes, but this depends on the connectivity layer. If you do not have a persistent connection (for example, HTTP), or you are using a protocol that provides its own ‘framing’ (such as AMQP or JMS), then this is not an issue: the transport layer takes care of it. If you are planning to reinvent this wheel then yes, varying payloads make the protocol harder. But the protocol (especially with all its edge cases) is hard enough already.
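If you do end up framing messages yourself on a raw socket, one common alternative to a terminator character is a length prefix: send the payload size first, then the payload. The sketch below assumes a 4-byte big-endian header; the helper names (`send_msg`, `recv_exact`, `recv_msg`) are my own, and `socket.socketpair()` merely stands in for a real client/server connection.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # assumed header: 4-byte big-endian payload length

def send_msg(sock, payload: bytes) -> None:
    # Prefix every payload with its length so the receiver knows where it ends.
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_exact(sock, n: int) -> bytes:
    # recv() may return fewer bytes than requested, so loop until we have n.
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        buf.extend(chunk)
    return bytes(buf)

def recv_msg(sock) -> bytes:
    # Read the fixed-size header first, then exactly that many payload bytes.
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)

# Demo over a connected socket pair: two variable-size XML messages sent
# back to back come out as two separate messages.
a, b = socket.socketpair()
send_msg(a, b"<ping/>")
send_msg(a, b"<reading><id>42</id></reading>")
first = recv_msg(b)
second = recv_msg(b)
a.close()
b.close()
print(first, second)
```

With a length prefix the payload can contain any bytes at all, whereas a terminator has to be escaped or forbidden inside the message. A real protocol would still need the edge cases mentioned above, such as a maximum message size and timeouts.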
Parsing Overhead
This is directly related to verbosity: the more text there is on the wire, the more the parser has to scan.
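Rather than guess at the cost, you can measure it for your own messages. Here is a rough sketch using `timeit` to compare parsing a small XML message with unpacking an equivalent fixed-size binary one. The message and its layout are invented for illustration, and the timings will vary by machine and parser, so treat the result as an indication only.

```python
import struct
import timeit
import xml.etree.ElementTree as ET

# Hypothetical message in both encodings.
XML_MSG = b"<reading><id>42</id><temp>21.5</temp></reading>"
BIN_MSG = struct.pack("!Id", 42, 21.5)

def parse_xml(data):
    # Build the element tree, then convert the text fields to numbers.
    root = ET.fromstring(data)
    return int(root.findtext("id")), float(root.findtext("temp"))

def parse_binary(data):
    # Fixed layout: unsigned 32-bit int + 64-bit float, network byte order.
    return struct.unpack("!Id", data)

# Both parsers should recover the same values.
xml_result = parse_xml(XML_MSG)
bin_result = parse_binary(BIN_MSG)

runs = 10_000
xml_time = timeit.timeit(lambda: parse_xml(XML_MSG), number=runs)
bin_time = timeit.timeit(lambda: parse_binary(BIN_MSG), number=runs)
print(f"xml: {xml_time:.4f}s  binary: {bin_time:.4f}s  ({runs} parses each)")
```

Whether the difference matters depends on how many messages per second you actually handle, which brings it back to the question of whether performance is really a concern.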