OK guys, could you tell me whether there is a bug in the following code:
// This is only an example, so DO NOT use this terrible code.
private void readIncomingData(SocketChannel channel) {
    try {
        // Capacity of 10, for simplicity's sake
        ByteBuffer buffer = ByteBuffer.allocate(10);
        buffer.clear();
        channel.read(buffer);
        StringBuilder response = new StringBuilder();
        buffer.flip();
        Charset charset = Charset.forName("UTF-8");
        // HERE IS THE DILEMMA !!!
        response.append(charset.decode(buffer));
        // Output the response
        System.out.println("Data read from client " + response);
    } catch (IOException e) {
        e.printStackTrace();
    }
}
For example, suppose the incoming text is UTF-8 encoded and consists of 9 characters in the ASCII range (up to U+007F),
and the 10th byte is the first byte of a multi-byte UTF-8 character (U+7FFFFFFF), so the remaining 5 bytes of
that character will only arrive in the next buffer. The last character will then be decoded incorrectly or lost.
Am I right, and how do I fix this?
By fixing I mean decoding each NIO buffer separately as it arrives, rather than decoding the whole sequence of bytes after all the buffers have been received.
You have a bunch of issues here. One of them is that you might not have read an entire encoded character. Usually you need to have some means of determining you have reached the end of a message such as a message length or newline before attempting to decode it.
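To illustrate the "end of message" idea, here is a minimal sketch that assumes messages are terminated by a newline. The class name `LineFramer` and its `feed` method are made up for this example and are not part of NIO. It keeps the raw bytes until a `\n` arrives and only then decodes them, so a character split across two reads is never decoded in halves. This relies on the fact that the byte 0x0A never occurs inside a multi-byte UTF-8 sequence.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: buffers raw bytes and decodes only complete,
// newline-terminated messages.
final class LineFramer {
    // Bytes of the message that has not been terminated yet
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    // Consumes the readable bytes of an already flipped buffer and
    // returns every message that those bytes completed (possibly none).
    List<String> feed(ByteBuffer buffer) {
        List<String> messages = new ArrayList<>();
        while (buffer.hasRemaining()) {
            byte b = buffer.get();
            if (b == '\n') {
                // The whole message is available, so decoding is safe now
                messages.add(new String(pending.toByteArray(), StandardCharsets.UTF_8));
                pending.reset();
            } else {
                pending.write(b);
            }
        }
        return messages;
    }
}
```

In the read method you would call `feed(buffer)` after `buffer.flip()` and then `buffer.clear()` before the next `channel.read(buffer)`. A length prefix would work the same way, except that you count bytes instead of looking for a delimiter.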
BTW: The largest possible code point is U+10FFFF. The largest
char value is U+FFFF.
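If you really want to decode each buffer as it arrives, as the question asks, one possible approach is a stateful `CharsetDecoder` combined with `ByteBuffer.compact()`. The sketch below is only an illustration: the class `ChunkDecoder` and its methods are hypothetical names, and replacing malformed input instead of reporting it is an assumption made to keep the example short. The decoder is called with `endOfInput = false`, so the bytes of an incomplete trailing character are left in the buffer, and `compact()` moves them to the front so the next read appends the rest.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes UTF-8 chunk by chunk, carrying the bytes of
// a partially received character over to the next chunk.
final class ChunkDecoder {
    // Assumption: malformed bytes are replaced with U+FFFD rather than reported
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);
    private final CharBuffer chars = CharBuffer.allocate(16);
    private final StringBuilder text = new StringBuilder();

    // Expects a flipped buffer. Decodes every complete character, then
    // compacts the buffer so it is ready for the next channel.read().
    void decodeChunk(ByteBuffer buffer) {
        CoderResult result;
        do {
            // endOfInput = false: an incomplete trailing sequence is left unread
            result = decoder.decode(buffer, chars, false);
            chars.flip();
            text.append(chars);
            chars.clear();
        } while (result.isOverflow()); // output buffer was full, keep going
        buffer.compact(); // leftover bytes move to the front, write mode again
    }

    String text() {
        return text.toString();
    }
}
```

With this, the loop in the read method becomes `channel.read(buffer); buffer.flip(); decoder.decodeChunk(buffer);`, and you must not call `buffer.clear()` between reads because that would throw away the leftover bytes. You still need some way to know when a message is finished.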