I work on a project using libpcap for capturing IP packets. libpcap returns captured data in a buffer, with an unsigned char * pointer and a buffer length. The data in the buffer is not null-terminated.
I process the buffer data with library functions, e.g. string functions from the C standard library. These functions expect plain char * pointers, which requires casting the data between unsigned char * and char *.
I like the idea of treating an unsigned char * buffer as not null-terminated (accompanied by a buffer length) and potentially containing non-printable characters, as opposed to a char * buffer, which holds a printable, null-terminated string. However, that forces me to cast the libpcap buffer for each string function call, which makes the code ugly.
What would be your coding style preference in this case?
- Keep the unsigned char * and cast when calling string functions.
- Cast the libpcap buffer to char * immediately after receiving it from libpcap, and distinguish raw data from strings via variable naming conventions in the upstream code.
If you know that you are at a protocol level where there is supposed to be text, use the second approach: keep a char * around and use it where needed. There’s no reason to repeat the cast to char * at every call site.
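As a rough illustration of casting once at the boundary, here is a minimal sketch. The names `struct text_view` and `payload_as_text` are hypothetical and not part of libpcap; the point is only that the length travels together with the pointer, since the bytes are still not nul-terminated.

```c
#include <stddef.h>

/* Hypothetical view type: a char pointer plus an explicit length.
 * The bytes it points to are NOT nul-terminated. */
struct text_view {
    const char *data;
    size_t      len;
};

/* Cast once, at the point where the protocol is known to carry text.
 * Everything after this works on the view and never casts again. */
static struct text_view payload_as_text(const unsigned char *buf, size_t len)
{
    struct text_view v = { (const char *)buf, len };
    return v;
}
```

Code that still handles raw bytes keeps the unsigned char * pointer; only the text-level code sees the view.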
However, be very, very, very careful about which string handling functions you use. You are capturing data off the wire, so you could be getting anything. That means you have to respect the total length of the pcap-supplied buffer everywhere: functions such as strlen and strcpy cannot be used unless you safely alter and nul-terminate the buffer.
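One way to respect the buffer length is to stick to the length-taking functions from `<string.h>` (`memchr`, `memcmp`, `memcpy`) instead of the nul-terminated ones. The helpers below (`line_length`, `has_prefix`, `copy_as_cstring`) are hypothetical names for a sketch, not an established API.

```c
#include <stddef.h>
#include <string.h>

/* Length of the first line, excluding '\n'; returns len if there is
 * no newline within the first len bytes. Never reads past len. */
static size_t line_length(const char *buf, size_t len)
{
    const char *nl = memchr(buf, '\n', len);
    return nl ? (size_t)(nl - buf) : len;
}

/* Bounded prefix check. prefix is a genuine C string from our own
 * code, so strlen is safe on it; buf is only touched via memcmp. */
static int has_prefix(const char *buf, size_t len, const char *prefix)
{
    size_t plen = strlen(prefix);
    return len >= plen && memcmp(buf, prefix, plen) == 0;
}

/* Copy at most dstsize - 1 bytes and nul-terminate the copy, so the
 * result can be handed to ordinary string functions. Returns the
 * number of bytes copied. An embedded '\0' in src will still make
 * the copy look shorter to strlen. */
static size_t copy_as_cstring(char *dst, size_t dstsize,
                              const char *src, size_t srclen)
{
    if (dstsize == 0)
        return 0;
    size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

Copying into a private, nul-terminated buffer is usually the simplest way to make strlen-style functions safe, at the cost of one extra copy.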
(And you really have to make sanity checks: if, e.g., you’re parsing a UDP packet and its length field says 130 bytes, that doesn’t mean there actually are 130 bytes you can safely access.)
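A sketch of such a sanity check for UDP follows. It assumes `udp` points at the first byte of the UDP header and `avail` is the number of captured bytes from that point on (in libpcap terms, derived from the header's `caplen`, not `len`, minus the preceding link and IP headers). The function name `udp_payload` is hypothetical.

```c
#include <stddef.h>

#define UDP_HEADER_LEN 8u   /* fixed UDP header size in bytes */

/* Validate the UDP length field against what was actually captured.
 * Returns 0 and fills in payload/payload_len if the claimed length
 * is consistent, -1 otherwise. */
static int udp_payload(const unsigned char *udp, size_t avail,
                       const unsigned char **payload, size_t *payload_len)
{
    if (avail < UDP_HEADER_LEN)
        return -1;                       /* header itself is truncated */

    /* Length field: bytes 4-5, big-endian, covers header + data. */
    size_t claimed = ((size_t)udp[4] << 8) | udp[5];

    if (claimed < UDP_HEADER_LEN)
        return -1;                       /* impossible value */
    if (claimed > avail)
        return -1;                       /* claims more than we hold */

    *payload = udp + UDP_HEADER_LEN;
    *payload_len = claimed - UDP_HEADER_LEN;
    return 0;
}
```

Rejecting the packet is one choice; clamping the payload length to `avail - UDP_HEADER_LEN` is another if truncated captures should still be inspected.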
You also have to verify that what you’re parsing actually is text. For example, you should not just print out a chunk of the payload on the assumption that it is text.
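A minimal sketch of such a check, assuming "text" means printable characters in the current locale plus CR, LF and tab; the helper names `is_printable_text` and `print_sanitized` are hypothetical. Note that the `<ctype.h>` functions require a value representable as unsigned char, so this is one place where converting back to unsigned char is needed even when the buffer is held as char *.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Returns 1 if every one of the len bytes is printable or common
 * whitespace (CR, LF, tab), 0 otherwise. */
static int is_printable_text(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];  /* required by isprint */
        if (!isprint(c) && c != '\r' && c != '\n' && c != '\t')
            return 0;
    }
    return 1;
}

/* Write exactly len bytes to out, replacing anything that is not
 * printable with '.', so control bytes never reach the terminal. */
static void print_sanitized(FILE *out, const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];
        fputc(isprint(c) ? c : '.', out);
    }
}
```

Which bytes count as acceptable depends on the protocol, so the whitelist above is only a starting point.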