I have a protocol buffer setup like this: [ProtoContract] Foo { [ProtoMember(1)] Bar[] Bars;

Asked: May 20, 2026

I have a protocol buffer setup like this:

[ProtoContract]
public class Foo
{
    [ProtoMember(1)]
    public Bar[] Bars;
}

A single Bar gets encoded to a 67-byte protocol buffer. This sounds about right, because I know that a Bar is pretty much just a 64-byte array, plus 3 bytes of overhead for length prefixing.

However, when I encode a Foo with an array of 20 Bars it takes 1362 bytes. 20 * 67 is 1340, so there are 22 bytes of overhead just for encoding an array!

Why does this take up so much space? And is there anything I can do to reduce it?


1 Answer

Editorial Team · Answer 1 · 2026-05-20T10:12:00+00:00

This overhead is simply the information needed to mark where each of the 20 objects starts and ends. There is nothing I can do differently here without breaking the format (i.e. doing something contrary to the spec).

If you really want the gory details:

An array or list is (if we exclude “packed”, which doesn’t apply here) simply a repeated block of sub-messages. There are two layouts available for sub-messages: strings and groups. With a string, the layout is:

[header][length][data]

where header is the varint-encoded mash of the wire-type and field-number (hex 0A in this case: field 1 with the length-delimited wire-type, 2), length is the varint-encoded size of data, and data is the sub-object itself. For small objects (data less than 128 bytes) this often means 2 bytes of overhead per object, depending on (a) the field number (fields above 15 need a longer header) and (b) the size of the data (128 bytes or more needs a longer length prefix).
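To make the byte counts concrete, here is a minimal sketch of the arithmetic in Python. It is not protobuf-net code; the helper names (`encode_varint`, `make_tag`, `length_prefixed_overhead`) are made up for illustration, and it only assumes the standard protobuf rules that a header is `(field_number << 3) | wire_type` written as a varint and that the length-delimited wire-type is 2.

```python
WIRE_LENGTH_DELIMITED = 2  # wire-type used for strings, bytes and sub-messages


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def make_tag(field_number: int, wire_type: int) -> bytes:
    """Header: field number and wire-type packed into one varint."""
    return encode_varint((field_number << 3) | wire_type)


def length_prefixed_overhead(field_number: int, data_size: int) -> int:
    """Bytes spent on [header][length] for one length-prefixed sub-message."""
    header = make_tag(field_number, WIRE_LENGTH_DELIMITED)
    length = encode_varint(data_size)
    return len(header) + len(length)


if __name__ == "__main__":
    print(make_tag(1, WIRE_LENGTH_DELIMITED).hex())  # header for field 1
    print(length_prefixed_overhead(1, 67))           # small item, small field
    print(length_prefixed_overhead(1, 128))          # length needs 2 bytes
    print(length_prefixed_overhead(16, 67))          # header needs 2 bytes
```

The last three calls show the two ways the per-item cost grows past 2 bytes: a payload of 128 bytes or more, or a field number of 16 or more.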

With a group, the layout is:

[header][data][footer]

where header is the varint-encoded mash of the wire-type and field-number (hex 0B in this case with field 1), data is the sub-object, and footer is another varint mash to indicate the end of the object (hex 0C in this case with field 1).

Groups are generally less favored, but they have the advantage that they don’t incur any extra overhead as data grows in size. For small field-numbers (less than 16) the overhead is again 2 bytes per object. The trade-off is that you pay double for large field-numbers, because the field number appears in both the header and the footer.
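The trade-off between the two layouts can be sketched with the same kind of arithmetic. This is an illustrative Python sketch rather than protobuf-net code: the function names are hypothetical, and it assumes only the standard wire-type numbers (2 for length-delimited, 3 for start-group, 4 for end-group).

```python
WIRE_LENGTH_DELIMITED = 2
WIRE_START_GROUP = 3
WIRE_END_GROUP = 4


def varint_size(value: int) -> int:
    """Number of bytes a non-negative integer occupies as a varint."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def tag_value(field_number: int, wire_type: int) -> int:
    """Integer that gets varint-encoded as a header or footer."""
    return (field_number << 3) | wire_type


def group_overhead(field_number: int) -> int:
    """[header] + [footer]; independent of how large the data is."""
    return (varint_size(tag_value(field_number, WIRE_START_GROUP))
            + varint_size(tag_value(field_number, WIRE_END_GROUP)))


def length_prefixed_overhead(field_number: int, data_size: int) -> int:
    """[header] + [length]; grows with the size of the data."""
    return (varint_size(tag_value(field_number, WIRE_LENGTH_DELIMITED))
            + varint_size(data_size))


if __name__ == "__main__":
    for field, size in [(1, 67), (1, 200), (16, 67), (16, 200)]:
        print(field, size,
              "string:", length_prefixed_overhead(field, size),
              "group:", group_overhead(field))
```

For field 1 the two layouts tie at 2 bytes until the payload reaches 128 bytes, after which the group is smaller; for field 16 the group costs 4 bytes regardless of payload size, so the string layout wins for small items.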
