Whenever I look at real code or example socket code in books, man pages and websites, I almost always see something like:
struct sockaddr_in foo;
memset(&foo, 0, sizeof foo);
/* or bzero(), which POSIX marks as LEGACY, and is not in standard C */
foo.sin_port = htons(42);
instead of:
struct sockaddr_in foo = { 0 };
/* if at least one member is initialized, all others are set to
zero (as though they had static storage duration) as per
ISO/IEC 9899:1999 6.7.8 Initialization */
foo.sin_port = htons(42);
or:
struct sockaddr_in foo = { .sin_port = htons(42) }; /* New in C99 */
or:
static struct sockaddr_in foo;
/* static storage duration will also behave as if
all members are explicitly assigned 0 */
foo.sin_port = htons(42);
The same can also be found for setting struct addrinfo hints to zero before passing it to getaddrinfo, for example.
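For the getaddrinfo case, a designated initializer might look like the sketch below. The helper name resolve_numeric_ipv4 is hypothetical (it does not come from any book or man page), and the AI_NUMERICHOST | AI_NUMERICSERV flags are an assumption chosen only so that the example needs no DNS lookup.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* assumption: expose getaddrinfo() under strict -std=c99 */
#endif
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <string.h>

/* Hypothetical helper: convert a numeric IPv4 address and numeric port
   into a struct sockaddr_in. Returns 0 on success, or a nonzero
   getaddrinfo() error code on failure. */
int resolve_numeric_ipv4(const char *host, const char *port,
                         struct sockaddr_in *out)
{
    /* Members not named here are initialized to zero or null pointers,
       so no memset() of hints is needed. */
    struct addrinfo hints = {
        .ai_family   = AF_INET,
        .ai_socktype = SOCK_STREAM,
        .ai_flags    = AI_NUMERICHOST | AI_NUMERICSERV,
    };
    struct addrinfo *res = NULL;

    int rc = getaddrinfo(host, port, &hints, &res);
    if (rc != 0)
        return rc;

    if (res->ai_addrlen != sizeof *out) {
        freeaddrinfo(res);
        return EAI_FAMILY;
    }
    memcpy(out, res->ai_addr, sizeof *out);
    freeaddrinfo(res);
    return 0;
}
```

Calling resolve_numeric_ipv4("127.0.0.1", "42", &sa) should fill sa with the loopback address and port 42 in network byte order.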
Why is this? As far as I understand, the examples that do not use memset are likely to be equivalent to the one that does, if not better. I realize that there are differences:
- memset will set all bits to zero, which is not necessarily the correct bit representation for setting each member to 0.
- memset will also set padding bits to zero.
Is either of these differences relevant or required behavior when setting these structs to zero, so that using an initializer instead is wrong? If so, why, and which standard or other source verifies this?
If both are correct, why does memset/bzero tend to appear instead of an initializer? Is it just a matter of style? If so, that’s fine, I don’t think we need a subjective answer on which is better style.
The usual practice is to use an initializer in preference to memset precisely because all bits zero is not usually desired and instead we want the correct representation of zero for the type(s). Is the opposite true for these socket related structs?
In my research I found that POSIX only seems to require sockaddr_in6 (and not sockaddr_in) to be zeroed at http://www.opengroup.org/onlinepubs/000095399/basedefs/netinet/in.h.html but makes no mention of how it should be zeroed (memset or initializer?). I realise BSD sockets predate POSIX and it is not the only standard, so are there compatibility considerations for legacy systems or modern non-POSIX systems?
Personally, I prefer from a style (and perhaps good practice) point of view to use an initializer and avoid memset entirely, but I am reluctant because:
- Other source code and semi-canonical texts like UNIX Network Programming use bzero (eg. page 101 on 2nd ed. and page 124 in 3rd ed. (I own both)).
- I am well aware that they are not identical, for reasons stated above.
One problem with the partial initializer approach (that is, { 0 }) is that GCC will warn you that the initializer is incomplete (if the warning level is high enough; I usually use -Wall and often -Wextra). With the designated initializer approach, that warning should not be given, but C99 is still not widely used, though these parts are fairly widely available, except, perhaps, in the world of Microsoft.
I used to favour an approach that declares a static constant of the structure type with no initializer, followed by an assignment of that constant to the variable being set up.
The omission of the initializer in the static constant means everything is zero – but the compiler won’t witter (shouldn’t witter). The assignment uses the compiler’s innate memory copy which won’t be slower than a function call unless the compiler is seriously deficient.
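A minimal sketch of that static-constant approach, assuming hypothetical names (zero_sockaddr and make_ipv4_any are illustrative, not taken from any library):

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>

/* No initializer: static storage duration guarantees that every member
   starts out as zero. */
static const struct sockaddr_in zero_sockaddr;

/* Hypothetical helper: build an IPv4 wildcard address for the given port,
   starting from the zero constant instead of calling memset(). */
struct sockaddr_in make_ipv4_any(uint16_t port)
{
    struct sockaddr_in foo = zero_sockaddr;  /* plain structure copy */

    foo.sin_family = AF_INET;
    foo.sin_port = htons(port);
    /* foo.sin_addr is left at zero, which is INADDR_ANY. */
    return foo;
}
```

The structure copy needs no incomplete initializer list, so it sidesteps the warning described above.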
GCC has changed over time
GCC versions 4.4.2 to 4.6.0 generate different warnings from GCC 4.7.1. Specifically, GCC 4.7.1 recognizes the = { 0 } initializer as a ‘special case’ and doesn’t complain, whereas GCC 4.6.0 etc. did complain.
Consider file init.c:
When compiled with GCC 4.4.2 (on Mac OS X), the warnings are:
When compiled with GCC 4.5.1, the warnings are:
When compiled with GCC 4.6.0, the warnings are:
When compiled with GCC 4.7.1, the warnings are:
The compilers above were compiled by me. The Apple-provided compilers are nominally GCC 4.2.1 and Clang:
As noted by SecurityMatt in a comment below, the advantage of memset() over copying a structure from memory is that the copy from memory is more expensive, requiring access to two memory locations (source and destination) instead of just one. By comparison, setting the values to zeroes doesn’t have to access the memory for source, and on modern systems, the memory is a bottleneck. So, memset() coding should be faster than copy for simple initializers (where the same value, normally all zero bytes, is being placed in the target memory). If the initializers are a complex mix of values (not all zero bytes), then the balance may be changed in favour of an initializer, for notational compactness and reliability if nothing else.
There isn’t a single cut and dried answer…there probably never was, and there isn’t now. I still tend to use initializers, but memset() is often a valid alternative.
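As a rough illustration that the styles agree where it matters for this struct, the sketch below zeroes a sockaddr_in three ways and compares the named members. The function names are hypothetical. The comparison is member by member rather than memcmp() because an initializer is not guaranteed to clear padding, and the members compared are all integers, for which all-bits-zero is a valid representation of zero.

```c
#include <netinet/in.h>
#include <string.h>

/* Returns 1 if the commonly used members compare equal, 0 otherwise. */
static int same_members(const struct sockaddr_in *a,
                        const struct sockaddr_in *b)
{
    return a->sin_family == b->sin_family
        && a->sin_port == b->sin_port
        && a->sin_addr.s_addr == b->sin_addr.s_addr;
}

/* Hypothetical check: zero the struct with memset(), with an initializer,
   and via static storage duration, then compare the results. */
int zeroing_styles_agree(void)
{
    struct sockaddr_in by_memset;
    memset(&by_memset, 0, sizeof by_memset);   /* all bytes zero, padding included */

    struct sockaddr_in by_init = { 0 };        /* every member set to zero */

    static struct sockaddr_in by_static;       /* zero-initialized by the language */

    return same_members(&by_memset, &by_init)
        && same_members(&by_init, &by_static);
}
```

For structs containing pointers or floating-point members the same comparison could, in principle, differ on unusual platforms, which is where the choice between the styles starts to matter.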