As part of my protobuf protocol I require the ability to send data of a dynamic type, a little bit like VARIANT. Roughly, I require the data to be an integer, string, boolean or “other”, where “other” (e.g. DateTime) is serialized as a string. I need to be able to use these as a single field and in lists in a number of different locations in the protocol.
How can this best be implemented while keeping message size minimal and performance optimal?
I’m using protobuf-net with C#.
EDIT:
I’ve posted a proposed answer below which uses what I think is the minimum of memory required.
EDIT2:
Created a github.com project at http://github.com/pvginkel/ProtoVariant with a complete implementation.
Jon’s multiple optionals covers the simplest setup, especially if you need cross-platform support. On the .NET side (to ensure you don’t serialize unnecessary values), simply return null from any property that isn’t a match. You can also do the same using the bool ShouldSerialize*() pattern if you don’t like the nulls.
Wrap that up in a class and you should be fine to use that at either the field level or list level. You mention optimal performance; the only additional thing I can suggest there is to consider encoding it as a “group” rather than a “submessage”, as this is easier to encode (and just as easy to decode, as long as you expect the data). To do that, use the Grouped data-format via [ProtoMember]. The difference can be minimal, but it avoids some back-tracking in the output stream to fix up the lengths.
Either way, in terms of overheads, a “submessage” takes at least 2 bytes: at least one for the field-header (more if the field number is large, say 1234567 rather than 12) and at least one for the length, which gets bigger for longer messages. A group takes two field-headers (start and end), so with low field numbers this is 2 bytes regardless of the length of the encapsulated data (it could be 5MB of binary).
A separate trick, useful for more complex scenarios but not as interoperable, is generic inheritance, i.e. an abstract base class that has ConcreteType<int>, ConcreteType<string> etc. listed as subtypes. This, however, takes an extra 2 bytes (typically), so it is not as frugal.
Taking another step further away from the core spec: if you genuinely can’t tell what types you need to support, and don’t need interoperability, there is some support for including (optimized) type information in the data; see the DynamicType option on ProtoMember. This takes more space than the other two options.
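The null-returning properties and the Grouped data-format described above could look roughly like the sketch below. It assumes the protobuf-net NuGet package and its attribute API; the Variant and Message classes and all member names are hypothetical, chosen only for illustration, and are not taken from the ProtoVariant project. An “other” value could be added as a further string member in the same way.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using ProtoBuf; // protobuf-net (assumed to be referenced via NuGet)

// Hypothetical variant wrapper: every member is nullable, so members that
// do not match the stored value return null and are not written.
[ProtoContract]
public class Variant
{
    private object data; // the boxed int, string or bool

    public Variant() { } // parameterless constructor for deserialization
    public Variant(object data) { this.data = data; }

    [ProtoMember(1)]
    public int? Int32Value
    {
        get { return data as int?; } // null unless data is an int
        set { if (value.HasValue) data = value.Value; }
    }

    [ProtoMember(2)]
    public string StringValue
    {
        get { return data as string; } // null unless data is a string
        set { if (value != null) data = value; }
    }

    [ProtoMember(3)]
    public bool? BooleanValue
    {
        get { return data as bool?; } // null unless data is a bool
        set { if (value.HasValue) data = value.Value; }
    }
}

// Hypothetical container using the variant as a single field and as a list,
// both encoded as groups rather than length-prefixed sub-messages.
[ProtoContract]
public class Message
{
    [ProtoMember(1, DataFormat = DataFormat.Grouped)]
    public Variant Single { get; set; }

    [ProtoMember(2, DataFormat = DataFormat.Grouped)]
    public List<Variant> Many { get; set; } = new List<Variant>();
}

public static class Program
{
    public static void Main()
    {
        var message = new Message { Single = new Variant(42) };
        message.Many.Add(new Variant("hello"));
        message.Many.Add(new Variant(true));

        using (var stream = new MemoryStream())
        {
            // Round-trip through protobuf-net to check the values survive.
            Serializer.Serialize(stream, message);
            stream.Position = 0;
            var copy = Serializer.Deserialize<Message>(stream);
            Console.WriteLine(copy.Single.Int32Value);
            Console.WriteLine(copy.Many[0].StringValue);
            Console.WriteLine(copy.Many[1].BooleanValue);
        }
    }
}
```

If the nulls are unwelcome, the same members could instead be non-nullable and paired with ShouldSerialize methods (for example a hypothetical `bool ShouldSerializeStringValue()`), as the answer suggests.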
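The overhead figures quoted above follow from the protobuf wire format, where a field-header is the varint of (field number << 3 | wire type) and a sub-message carries a varint length prefix. The sketch below only does that arithmetic; it uses no protobuf library, and the WireOverhead class and its method names are hypothetical.

```csharp
using System;

public static class WireOverhead
{
    // Number of bytes a base-128 varint needs for the given value.
    public static int VarintSize(ulong value)
    {
        int size = 1;
        while (value >= 0x80) { value >>= 7; size++; }
        return size;
    }

    // The wire type lives in the low 3 bits, so it does not affect the size.
    public static int HeaderSize(int fieldNumber)
    {
        return VarintSize((ulong)fieldNumber << 3);
    }

    // Sub-message: one field-header plus a varint length prefix.
    public static int SubMessageOverhead(int fieldNumber, ulong payloadLength)
    {
        return HeaderSize(fieldNumber) + VarintSize(payloadLength);
    }

    // Group: a start header and an end header, no length prefix.
    public static int GroupOverhead(int fieldNumber)
    {
        return 2 * HeaderSize(fieldNumber);
    }

    public static void Main()
    {
        const ulong fiveMegabytes = 5UL * 1024 * 1024;
        Console.WriteLine(SubMessageOverhead(12, 10));            // 2
        Console.WriteLine(GroupOverhead(12));                     // 2
        Console.WriteLine(SubMessageOverhead(12, fiveMegabytes)); // 5
        Console.WriteLine(GroupOverhead(1234567));                // 8
    }
}
```

For small payloads with low field numbers the two encodings cost the same, which matches the remark that the difference can be minimal. The group only pulls ahead as the payload grows, and it loses once the field number is large enough for the header to need several bytes.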