Let me start off by saying that I am not an expert in C. I have been reviewing the code of a JSON parser.
I am trying to understand this piece of code.
/* Render the cstring provided to an escaped version that can be printed. */
static char *print_string_ptr(const char *str)
{
    const char *ptr;
    char *ptr2, *out;
    int len = 0;
    unsigned char token;

    if (!str)
        return cJSON_strdup("");
    ptr = str;
    while ((token = *ptr) && ++len) {
        if (strchr("\"\\\b\f\n\r\t", token))
            len++;
        else if (token < 32)
            len += 5;
        ptr++;
    }
    out = (char*)cJSON_malloc(len + 3);
    if (!out)
        return 0;
    ptr2 = out;
    ptr = str;
    *ptr2++ = '\"';
    while (*ptr) {
        if ((unsigned char)*ptr > 31 && *ptr != '\"' && *ptr != '\\')
            *ptr2++ = *ptr++;
        else {
            *ptr2++ = '\\';
            switch (token = *ptr++) {
            case '\\': *ptr2++ = '\\'; break;
            case '\"': *ptr2++ = '\"'; break;
            case '\b': *ptr2++ = 'b'; break;
            case '\f': *ptr2++ = 'f'; break;
            case '\n': *ptr2++ = 'n'; break;
            case '\r': *ptr2++ = 'r'; break;
            case '\t': *ptr2++ = 't'; break;
            default:
                /* escape and print */
                sprintf(ptr2, "u%04x", token);
                ptr2 += 5;
                break;
            }
        }
    }
    *ptr2++ = '\"';
    *ptr2++ = 0;
    return out;
}
A general summary of how this code works would be great. My impression is that it is “beautifying” the JSON string; is that correct?
At first glance it appears to be replacing \r with r, but what would be the point of that?
I have looked into sprintf, but only for simple things such as printing currency values and other formatting. I haven’t got a clue what the sprintf call is doing here:
sprintf(ptr2,"u%04x",token);ptr2+=5;
And what is the purpose of the ptr2+=5?
Any insight into this would really be helpful.
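What it’s doing is turning control characters, along with the double quote and the backslash, into the escape sequences you’d normally use in C source code. These are also the escapes JSON uses inside strings.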
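The first branch of the if inside the second loop is basically saying “if we have a normal character like a letter, digit, etc. (anything above 31 that isn’t a double quote or a backslash), just copy it directly from input to output.”
Otherwise, in the else branch, we’re going to produce an escape sequence in the output, which will start with a backslash.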
Then, depending on which character it finds, the switch generates the second character of the escape sequence, so an actual backspace character in the input (which will compare equal to '\b') will produce the two characters '\' and 'b' in the output, and the same goes for form-feed, new-line, carriage return and tab. That is why it looks as if \r is being replaced with r: the backslash has already been written just before the switch.
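As a rough illustration of that two-character mapping, here is a minimal sketch. The helpers escape_letter and write_short_escape are hypothetical names made up for this example; they are not part of the parser, which does the same work inline in its switch.

```c
/* Hypothetical helper, not part of the parser: returns the character that
   follows the backslash for inputs that have a short escape, or 0 if the
   input has none. */
static char escape_letter(unsigned char c)
{
    switch (c) {
    case '\\': return '\\';
    case '\"': return '\"';
    case '\b': return 'b';
    case '\f': return 'f';
    case '\n': return 'n';
    case '\r': return 'r';
    case '\t': return 't';
    default:   return 0;
    }
}

/* Writes the two-character escape for c at dst and returns the position
   just past it. Returns dst unchanged if c has no short escape. */
static char *write_short_escape(char *dst, unsigned char c)
{
    char letter = escape_letter(c);

    if (letter == 0)
        return dst;
    *dst++ = '\\';   /* first output character: the backslash */
    *dst++ = letter; /* second output character: e.g. 'r' for '\r' */
    return dst;
}
```

Calling write_short_escape(buf, '\r') stores a backslash followed by the letter r, so one carriage-return byte in the input becomes two printable characters in the output.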
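Any other control character is rendered in hexadecimal: sprintf writes the letter u followed by the character’s code as four zero-padded hex digits, so together with the backslash already written it becomes something like \u001b. That is five characters (u plus four digits), which is why ptr2 is then advanced by 5; the terminating NUL that sprintf also writes is simply overwritten by whatever gets written next.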
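To see the sprintf call and the pointer adjustment in isolation, here is a small sketch. The function write_unicode_escape is a hypothetical name used only for this example; the format string and the advance by 5 are taken from the code in the question.

```c
#include <stdio.h>

/* Hypothetical helper: writes the six characters \uXXXX for c at dst and
   returns the position just past them. dst needs room for 7 bytes because
   sprintf also stores a terminating '\0' after the last hex digit. */
static char *write_unicode_escape(char *dst, unsigned char c)
{
    *dst++ = '\\';
    /* "%04x" prints the value in lowercase hex, zero-padded to 4 digits,
       so the call writes 'u' plus 4 digits: 5 characters in total. */
    sprintf(dst, "u%04x", (unsigned int)c);
    return dst + 5; /* step over the 5 characters sprintf just wrote */
}
```

For the escape character (code 27, hex 1b) this leaves \u001b in the buffer, and the returned pointer sits on the '\0' that sprintf wrote, ready for the next character to overwrite it.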