A while ago I saw a function on this site, which I took and adapted a bit for my own use.
It uses getc and stdin to read a string, allocating exactly as much memory as the string needs. It then returns a pointer to the allocated memory, which holds the string.
My question: are there any downsides to this function (besides having to free the allocated memory manually later)? What would you do to improve it?
char *getstr(void)
{
    char *str = NULL, *tmp = NULL;
    int ch = -1, sz = 0, pt = 0;
    while(ch)
    {
        ch = getc(stdin);
        if (ch == EOF || ch == 0x0A || ch == 0x0D) ch = 0;
        if (sz <= pt)
        {
            sz++;
            tmp = realloc(str, sz * sizeof(char));
            if(!tmp) return NULL;
            str = tmp;
        }
        str[pt++] = ch;
    }
    return str;
}
After using your suggestions, here is my updated code. I decided to just use 256 bytes for the buffer, since this function is being used for user input.
char *getstr(void)
{
    char *str, *tmp = NULL;
    int ch = -1, bff = 256, pt = 0;
    str = malloc(bff);
    if(!str)
    {
        printf("\nError! Memory allocation failed!");
        return 0x00;
    }
    while(ch)
    {
        ch = getc(stdin);
        if (ch == EOF || ch == '\n' || ch == '\r') ch = 0;
        if (bff <= pt)
        {
            bff += 256;
            tmp = realloc(str, bff);
            if(!tmp)
            {
                free(str);
                printf("\nError! Memory allocation failed!");
                return 0x00;
            }
            str = tmp;
        }
        str[pt++] = ch;
    }
    tmp = realloc(str, pt);
    if(!tmp)
    {
        free(str);
        printf("\nError! Memory allocation failed!");
        return 0x00;
    }
    str = tmp;
    return str;
}
It’s excessively frugal IMO, and makes the mistake of sacrificing performance in order to save infinitesimal amounts of memory, which is pointless in most settings, I think. Allocation calls like realloc are potentially laborious for the system, and here one is made for every byte.
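To put rough numbers on that, here is a small sketch that counts how many allocation calls a "grow when full" loop makes for a given growth step. The helper name alloc_calls is made up for illustration, and it only models the growth steps, not a final shrinking realloc.

```c
#include <stddef.h>

/* Hypothetical helper: number of allocation calls needed to store n bytes
 * when the buffer starts empty and grows by `step` bytes each time it is
 * full. `step` must be greater than zero. */
size_t alloc_calls(size_t n, size_t step)
{
    size_t cap = 0;    /* current capacity in bytes */
    size_t calls = 0;  /* allocation calls made so far */

    while (cap < n) {
        cap += step;
        calls++;
    }
    return calls;
}
```

For a 1000-character line plus its terminator (n = 1001), this gives 1001 calls with a step of 1, 4 with a step of 256, and 1 with a step of 4096.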
It would be better to just have a local buffer, say 4 KB, to read into, then allocate the return string based on the length of what is actually read into it. Keep in mind that the stack* on a normal system is 4-8 MB anyway, whether you use it all or not. If the string turns out to be longer than 4 KB, you could write a similar loop that allocates and copies into the return string. It is a similar idea, but heap allocation would occur every 4096 bytes rather than every byte. For example: you start with the 4096-byte buffer; when that is exhausted, you malloc 4096 bytes for the return string and copy the buffer in; you continue reading into the buffer (from the beginning); and if another 1000 bytes are read, you realloc to 5097 (4096 + 1000 + 1 for the terminating null byte) and return that.
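The scheme described above could be sketched roughly as follows. This is only one way to do it, not a definitive implementation: the name getstr_chunked, the CHUNK constant, and the FILE * parameter (so it can read from streams other than stdin) are assumptions of this sketch, and it always reserves one extra byte for the terminator.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK 4096  /* stack buffer size; the 4 KB figure is an assumption */

/* Hypothetical variant of getstr: reads one line from `in` through a fixed
 * stack buffer and touches the heap only once per CHUNK bytes read.
 * Returns a malloc'd, null-terminated string (the caller frees it),
 * or NULL if an allocation fails. */
char *getstr_chunked(FILE *in)
{
    char buf[CHUNK];
    char *out = NULL;
    size_t len = 0;   /* bytes already copied into out */
    size_t used = 0;  /* bytes currently held in buf */

    assert(in != NULL);

    for (;;) {
        int ch = getc(in);
        int done = (ch == EOF || ch == '\n' || ch == '\r');

        if (!done)
            buf[used++] = (char)ch;

        if (done || used == CHUNK) {
            /* Flush the buffer: grow out to fit it, plus the terminator. */
            char *tmp = realloc(out, len + used + 1);
            if (!tmp) {
                free(out);
                return NULL;
            }
            out = tmp;
            memcpy(out + len, buf, used);
            len += used;
            used = 0;
            out[len] = '\0';
            if (done)
                return out;
        }
    }
}
```

Calling getstr_chunked(stdin) would be used the same way as the getstr in the question; the caller still has to free the result.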
I think it is a common beginner's mistake to get obsessed with minimizing heap allocation by approaching it byte by byte. Even KB by KB is a little small; the system allocates in pages (typically 4 KB), and you might as well align yourself with that.
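If you want to size requests in whole pages, a rounding helper is enough. This is a minimal sketch assuming a 4096-byte page; the name round_up_to_page and the constant are made up for illustration. On POSIX systems the real page size can be queried with sysconf(_SC_PAGESIZE). The allocator keeps its own bookkeeping, so treat this as a rough alignment rather than an exact one.

```c
#include <stddef.h>

#define ASSUMED_PAGE_SIZE 4096u  /* assumption; the real value is system-specific */

/* Hypothetical helper: round n up to the next multiple of the page size. */
size_t round_up_to_page(size_t n)
{
    return (n + ASSUMED_PAGE_SIZE - 1) / ASSUMED_PAGE_SIZE * ASSUMED_PAGE_SIZE;
}
```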
* The stack: the memory provided for local storage inside a function.