I am currently evaluating Protocol Buffers for use in a project (no code written as of yet). One of the things I’m unclear on is how you would read part of an encoded message. For example, say I have a common header:
message Header {
  // protobuf has no 16-bit integer type; uint32 is the smallest unsigned one
  required uint32 msg_type = 1;
  required uint32 length = 2;
}
And say I deliver multiple different messages to a queue. How would the consumer work out how much data to read per message and what message type it should be constructed as?
There should be no need for a Header message here; the most common approach is to follow the “streaming” advice from here. Within that, you could either treat it as a sequence of identical union-type messages, or (my preference) when writing, instead of just writing a length prefix before each message, write a varint that indicates the message type, then the length (also as a varint). The number that indicates the message type is an arbitrary map you invent (so 1 = Foo, 2 = Bar, 3 = Blap, etc.). If you left-shift the message type by 3 bits and then “or” it with 2 (the length-delimited wire type), the result is also a well-formed protobuf stream itself, 100% identical to a repeated YourUnionType.
Basically, this is exactly the same as this answer, but instead of being field 1 each time, the number varies per message type. Most implementations have a reader/writer API that makes it possible to read and write raw varints, and to length-restrict the reader API. Some implementations have helper mechanisms to support streams of heterogeneous messages directly (basically, doing all the above for you).
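As a rough illustration of the type-then-length framing described above, here is a minimal sketch in plain Python that does not use any protobuf library. The helper names (write_varint, read_varint, write_frame, read_frames) are made up for this example, and the payloads are treated as opaque bytes; in practice they would be whatever your implementation produces when it serializes a message.

```python
import io

LENGTH_DELIMITED = 2  # protobuf wire type for length-prefixed payloads


def write_varint(out, value):
    """Write a non-negative integer as a base-128 varint."""
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.write(bytes([byte | 0x80]))  # more bytes follow
        else:
            out.write(bytes([byte]))
            return


def read_varint(stream):
    """Read one varint; return None if the stream ends cleanly before it."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            if shift == 0:
                return None
            raise EOFError("stream ended inside a varint")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def write_frame(out, msg_type, payload):
    """Write (msg_type << 3 | 2), then the length, then the payload bytes."""
    write_varint(out, (msg_type << 3) | LENGTH_DELIMITED)
    write_varint(out, len(payload))
    out.write(payload)


def read_frames(stream):
    """Yield (msg_type, payload) pairs until the stream is exhausted."""
    while True:
        tag = read_varint(stream)
        if tag is None:
            return
        if tag & 0x07 != LENGTH_DELIMITED:
            raise ValueError("unexpected wire type in frame header")
        length = read_varint(stream)
        if length is None:
            raise EOFError("stream ended before the length prefix")
        payload = stream.read(length)
        if len(payload) != length:
            raise EOFError("truncated payload")
        yield tag >> 3, payload


# Usage: 1 = Foo, 2 = Bar, 3 = Blap is the invented type map from the answer.
buf = io.BytesIO()
write_frame(buf, 1, b"foo-bytes")
write_frame(buf, 3, b"blap-bytes")
buf.seek(0)
for msg_type, payload in read_frames(buf):
    print(msg_type, payload)
```

The consumer would then look up msg_type in its own map and hand the payload to the matching message class to parse. Because the first varint has the same layout as a protobuf field tag, the byte stream written here should also be readable as a single message with one length-delimited field per message type.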