For educational purposes, I’m writing an HTTP server in C++.
When receiving a request, how do I know when the client has finished sending headers? Is there a requirement that all headers be sent in one shot? What if a client sends G, then after 5 seconds E, then T, and so on? Should I wait for a timeout and just close the connection if it takes too long? Should I start parsing as soon as I get the first bytes, to find out whether the request is invalid?
I know there are a lot of libraries for this, I’m just reinventing the wheel to better understand how the Web works at different layers. And I can’t find how they deal with exactly my question.
There are two parts to this answer.
Firstly, the issue of delay and time-out: you should indeed handle timeouts, as it’s generally not possible to detect whether a TCP connection is broken. There is more on this topic in this question: TCP socket in Unix – notify server I am done sending
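One way to handle the slow client from the question (G, then E five seconds later, then T) is to give the whole request head a single deadline rather than restarting a timer on every byte. The following is only a minimal sketch, assuming a POSIX system and a connected socket descriptor; the names read_before_deadline and ReadResult are made up for illustration, and a real server would more likely use an event loop.

```cpp
#include <cerrno>
#include <chrono>
#include <cstddef>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

using Clock = std::chrono::steady_clock;

// Hypothetical result type for this sketch.
enum class ReadResult { Data, Closed, TimedOut, Error };

// Waits until fd is readable or the deadline passes, then reads once.
// The deadline is absolute, so a client that drips one byte at a time
// cannot keep the connection open forever. Assumes the deadline is at
// most a few minutes away, so the millisecond count fits in an int.
ReadResult read_before_deadline(int fd, Clock::time_point deadline,
                                char* buf, std::size_t cap, std::size_t& got) {
    got = 0;
    for (;;) {
        auto remaining = std::chrono::duration_cast<std::chrono::milliseconds>(
            deadline - Clock::now());
        if (remaining.count() <= 0) return ReadResult::TimedOut;

        pollfd pfd{};
        pfd.fd = fd;
        pfd.events = POLLIN;
        int ready = poll(&pfd, 1, static_cast<int>(remaining.count()));
        if (ready == 0) return ReadResult::TimedOut;
        if (ready < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return ReadResult::Error;
        }

        ssize_t r = recv(fd, buf, cap, 0);
        if (r > 0) {
            got = static_cast<std::size_t>(r);
            return ReadResult::Data;
        }
        if (r == 0) return ReadResult::Closed;  // peer closed the connection
        if (errno == EINTR || errno == EAGAIN || errno == EWOULDBLOCK) continue;
        return ReadResult::Error;
    }
}
```

The caller would compute the deadline once when the connection is accepted (for example, Clock::now() plus some number of seconds of its choosing) and close the socket on TimedOut, Closed, or Error.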
Secondly, the format of an HTTP request is defined in RFC 2616, section 5.
Essentially, you get the request line (for example, GET /index.html HTTP/1.1), followed by multiple header lines (without empty lines). Then, the list of headers ends with an empty line. All ends of lines are represented with CRLF ("\r\n").
In addition to this, some requests also have a body (typically those using POST or PUT). If the request has a message body, its length will be given either by the Content-Length header or using delimiters via chunked transfer encoding.
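Since TCP delivers a byte stream, the request line and headers may arrive in arbitrarily small pieces, so the server has to buffer what it receives and look for the empty line itself. Here is a rough sketch of that idea, not a complete parser: RequestHeadBuffer, its methods, and the 8192-byte limit are all invented for this example.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: accumulates bytes as they arrive from the socket
// and reports when the empty line that ends the headers has been seen.
class RequestHeadBuffer {
public:
    // Arbitrary cap chosen for this sketch, so that a client cannot make
    // the server buffer an endless header block.
    static constexpr std::size_t kMaxHeadBytes = 8192;

    enum class Status { NeedMore, Complete, TooLarge };

    // Appends one received chunk (it may be a single byte).
    Status feed(std::string_view chunk) {
        // Start the search 3 bytes before the old end, because the
        // "\r\n\r\n" terminator may straddle two chunks.
        std::size_t from = buffer_.size() >= 3 ? buffer_.size() - 3 : 0;
        buffer_.append(chunk.data(), chunk.size());
        if (head_end_ == std::string::npos) {
            std::size_t pos = buffer_.find("\r\n\r\n", from);
            if (pos != std::string::npos) head_end_ = pos + 4;
        }
        if (head_end_ != std::string::npos) return Status::Complete;
        return buffer_.size() > kMaxHeadBytes ? Status::TooLarge
                                              : Status::NeedMore;
    }

    // Request line and headers, including the terminating empty line.
    // Empty until feed() has returned Complete.
    std::string_view head() const {
        if (head_end_ == std::string::npos) return {};
        return std::string_view(buffer_).substr(0, head_end_);
    }

    // Bytes already received after the headers (the start of the body,
    // if the request has one). Empty until feed() has returned Complete.
    std::string_view leftover() const {
        if (head_end_ == std::string::npos) return {};
        return std::string_view(buffer_).substr(head_end_);
    }

private:
    std::string buffer_;
    std::size_t head_end_ = std::string::npos;
};
```

A read loop would call feed() with each chunk it receives, keep reading on NeedMore, reject the request on TooLarge, and on Complete parse head() and then use Content-Length or the chunked encoding to decide how many more body bytes to read beyond leftover().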