I am having an issue with recv. I wrote a function that fills a structure with the data read from a socket and the length (in bytes) of that data.
For testing, I am just printing the data to stdout byte by byte, based on the total number of bytes read by recv. For some reason, the byte count seems to be correct for some sites and incorrect for others. For example, the following code works as intended on some sites:
    data->data_sz = 0;
    while((i = recv(sock, data->data + data->data_sz, CHUNKSIZE, 0)) > 0)
    {
        data->data_sz += i;
        if(databff - data->data_sz < CHUNKSIZE)
        {
            databff *= 2;
            if(!(tmp = realloc(data->data, databff)))
            {
                free(data->data);
                (void) WSACleanup();
                return 0;
            }
            data->data = tmp;
        }
    }
    i = strsbstr(data->data, "\r\n\r\n") + 4; //i = the position of the first char after header info
    if(i >= 0)
    {
        data->data_sz = data->data_sz - i; //data->data_sz = number of bytes without header info
        memmove(data->data, data->data + i, data->data_sz);
        if(!(tmp = realloc(data->data, data->data_sz)))
        {
            free(data->data);
            (void)WSACleanup();
            return 0;
        }
        data->data = tmp;
    }
    else
    {
        free(data->data);
        (void) WSACleanup();
        return 0;
    }
    return 1;
}
To print the data to stdout I just use a for loop:
//t_html->data_sz points to my data->data_sz structure
//t_html->data points to my data->data structure
for(i = 0; i <= t_html->data_sz; i++) (void)fputc((int)t_html->data[i], stdout);
The above code works for some sites but fails on others. For example, when querying http://www.google.com I expect the final characters to be </html>, but I get </html>l.
Basically, my problem is that data->data_sz (the number of bytes received) is not being calculated correctly, which makes it impossible to use the gathered data correctly. I am really at a loss as to what to do right now.
EDIT:
Here is the strsbstr function, which is called in the above code:
int strsbstr(const char *str, const char *sbstr)
{
    char *sbstrlc;
    if(!(strcmp(str, sbstr))) return 0;
    if(!(sbstrlc = strstr(str, sbstr))) return -1;
    return (int) (sbstrlc - str);
}
recv(sock, data->data + data->data_sz, CHUNKSIZE, 0) is potentially a problem. Why? Because you may not have CHUNKSIZE bytes of room left in your buffer. You actually have databff - data->data_sz left (assuming data->data is allocated to a size of databff). It all depends on the initial values of databff and CHUNKSIZE, which I can't see, so I figured I'd point this out just in case.
Your print loop uses i <= t_html->data_sz; as its condition, which is wrong. It should be i < t_html->data_sz;. If you use <=, you read one byte past the end of your data, which is likely why you get a weird character sometimes and sometimes not.
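To make both points concrete, here is a minimal sketch rather than a drop-in replacement for your function. It assumes a CHUNKSIZE of 512 (your value is not shown) and uses a hypothetical read_fn callback in place of recv so it can run without a socket. The names struct buffer, struct mem_src, mem_read, read_all and print_bytes are all made up for illustration. The buffer is grown before each read, so CHUNKSIZE bytes are always free, and the print loop stops at i < data_sz.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNKSIZE 512 /* assumed value; the question does not show it */

/* Hypothetical stand-in for recv(): stores up to len bytes in buf and
   returns the count, 0 at end of stream, or a negative value on error. */
typedef int (*read_fn)(void *ctx, char *buf, int len);

struct buffer {
    char  *data;
    size_t data_sz; /* bytes actually stored */
    size_t cap;     /* bytes allocated (databff in the question) */
};

/* Hypothetical in-memory source so the sketch runs without a socket. */
struct mem_src { const char *bytes; size_t len; size_t pos; };

static int mem_read(void *ctx, char *buf, int len)
{
    struct mem_src *s = ctx;
    size_t left = s->len - s->pos;
    size_t n = left < (size_t)len ? left : (size_t)len;
    memcpy(buf, s->bytes + s->pos, n);
    s->pos += n;
    return (int)n;
}

/* Reads until end of stream. The buffer is grown BEFORE each read, so at
   least CHUNKSIZE bytes are free whenever rd() is called.
   Returns 1 on success, 0 on failure (the buffer is freed on failure). */
static int read_all(struct buffer *b, read_fn rd, void *ctx)
{
    b->data = NULL;
    b->data_sz = 0;
    b->cap = 0;

    for (;;) {
        int n;
        if (b->cap - b->data_sz < CHUNKSIZE) {
            size_t new_cap = b->cap ? b->cap * 2 : CHUNKSIZE;
            char *tmp = realloc(b->data, new_cap);
            if (!tmp) {
                free(b->data);
                b->data = NULL;
                return 0;
            }
            b->data = tmp;
            b->cap = new_cap;
        }
        assert(b->cap - b->data_sz >= CHUNKSIZE); /* room is guaranteed */
        n = rd(ctx, b->data + b->data_sz, CHUNKSIZE);
        if (n < 0) {
            free(b->data);
            b->data = NULL;
            return 0;
        }
        if (n == 0)
            break;
        b->data_sz += (size_t)n;
    }
    return 1;
}

/* Writes exactly data_sz bytes: note '<', not '<='. Returns the count. */
static size_t print_bytes(const struct buffer *b, FILE *out)
{
    size_t i;
    for (i = 0; i < b->data_sz; i++)
        fputc((unsigned char)b->data[i], out);
    return i;
}
```

In your code, the callback would be a thin wrapper around recv(sock, buf, len, 0). The stored bytes are not NUL-terminated, so anything that consumes them should be bounded by data_sz, as print_bytes is.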