I wrote a tokenizer for HTTP messages in Java. It has a method nextToken()

Asked: May 28, 2026

I wrote a tokenizer for HTTP messages in Java. It has a method nextToken() which is supposed to return a string containing the whole HTTP message that was received. The problem is that the input runs out before the expected body size has been read.

I read the input stream all the way to the beginning of the body. Then I try to read n bytes from the stream where n is the size in bytes of the body which is stated in the Content-Length header. The problem is that inside the while loop, the line charsRead = in.read(buffer) blocks because there is no more input in the input stream. But it happens before n bytes were read.

Example: for a body of 12,493 bytes, it blocks when 675 more bytes are still expected.

The input stream works with UTF-8 so every byte is encoded to one char.

/* Somewhere else in the code: 
InputStreamReader _isr =
     new InputStreamReader(clientSocket.getInputStream(), "UTF-8")
*/
BufferedReader in = new BufferedReader(_isr);
StringBuilder tmp = new StringBuilder();
String line = "";
boolean body = false;
int bodylen = -1;

for (;;) {
   line = in.readLine();

   if (line == null)
       break;
   if (line.equals("")) { /* We've reached the body */
       body = true;
       break;
   }

   tmp.append(line + "\r\n");

   if ((bodylen == -1) && (line.contains("Content-Length:"))) {
       /* Make `bodylen` hold the length of the body */
       String[] splitted = line.split("Content-Length:");
       bodylen = Integer.parseInt(splitted[1].trim());
   }
}

if (body == true) { 
    int charsRead;
    char[] buffer = new char[1024];

    while (bodylen > 0) {
        charsRead = in.read(buffer);
        if (charsRead == -1)
            break;
        bodylen -= charsRead;
        tmp.append(buffer, 0, charsRead); /* append only the chars actually read */
    }
}

Why does this happen, and how can I solve it?

1 Answer

Editorial Team · Answer 1 · 2026-05-28T05:53:03+00:00

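It seems you are confusing characters with bytes. Content-Length is a count of bytes, but you are counting characters. In UTF-8 every non-ASCII character takes two to four bytes, so a body that contains such characters decodes to fewer chars than Content-Length states. The loop then keeps waiting for chars that will never arrive, and in.read(buffer) blocks. In your example the 12,493-byte body apparently decoded to 675 fewer chars than bytes.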
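One way to avoid the mismatch is to count bytes instead of chars: read the header lines and the body directly from the socket's InputStream, and decode the body only after exactly Content-Length bytes have arrived. The following is a minimal sketch of that idea, not a complete HTTP parser. The class name HttpMessageReader and its methods readLine and readMessage are hypothetical names chosen for illustration, and the sketch assumes the body is UTF-8 and that there is no chunked transfer encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads one HTTP message while counting bytes, not chars.
class HttpMessageReader {

    /** Reads one line as raw bytes up to LF, dropping CR. Returns null at end of stream. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
            }
            if (b != '\r') {
                line.write(b);
            }
        }
        // End of stream: return whatever was collected, or null if nothing was.
        return line.size() == 0 ? null : new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Returns the header lines, a blank line, and the body decoded as UTF-8 (assumed charset). */
    static String readMessage(InputStream in) throws IOException {
        StringBuilder message = new StringBuilder();
        int bodyLength = -1;
        boolean sawBlankLine = false;
        String line;
        while ((line = readLine(in)) != null) {
            if (line.isEmpty()) {          // blank line: the body starts here
                sawBlankLine = true;
                break;
            }
            message.append(line).append("\r\n");
            int colon = line.indexOf(':');
            if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                bodyLength = Integer.parseInt(line.substring(colon + 1).trim());
            }
        }
        if (sawBlankLine) {
            message.append("\r\n");
            if (bodyLength > 0) {
                byte[] body = new byte[bodyLength];
                int offset = 0;
                // Keep reading until exactly bodyLength BYTES have arrived.
                while (offset < bodyLength) {
                    int n = in.read(body, offset, bodyLength - offset);
                    if (n == -1) {
                        break;             // peer closed the connection early
                    }
                    offset += n;
                }
                // Decode only after all bytes are in, so multi-byte characters stay intact.
                message.append(new String(body, 0, offset, StandardCharsets.UTF_8));
            }
        }
        return message.toString();
    }

    public static void main(String[] args) throws IOException {
        String body = "h\u00e9llo";        // 5 chars, but 6 bytes in UTF-8
        byte[] bodyBytes = body.getBytes(StandardCharsets.UTF_8);
        String head = "POST / HTTP/1.1\r\nContent-Length: " + bodyBytes.length + "\r\n\r\n";
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        raw.write(head.getBytes(StandardCharsets.ISO_8859_1));
        raw.write(bodyBytes);
        String message = readMessage(new ByteArrayInputStream(raw.toByteArray()));
        System.out.println(message.endsWith(body));
    }
}
```

With a real socket you would pass clientSocket.getInputStream() to readMessage, ideally wrapped in a BufferedInputStream so the byte-at-a-time header reads stay cheap. The important design choice is that no Reader sits between the socket and the Content-Length accounting.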
