I am trying to read data from a binary stream, portions of which should be parsed as UTF-8.
Using the InputStream directly for the binary data and an InputStreamReader on top of it for the UTF-8 text does not work. The reader reads ahead and consumes bytes belonging to the subsequent binary data, even if it is told to read a maximum of n characters.
I recognize this question is very similar to "Read from InputStream in multiple formats", but the solution proposed there is specific to HTTP streams, which does not help me.
I thought of just reading everything as binary data and converting the relevant pieces to text afterwards. But I only have the length information of the character data in characters, not in bytes. Thus, I need the thing which reads characters from the stream to be aware of the encoding.
Is there a way to tell InputStreamReader not to read ahead further than is needed for reading the given number of characters? Or is there a reader that supports both binary data and text with an encoding and can be switched between these modes on the fly?
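Read everything from the InputStream as bytes and do not wrap it in a reader. Where you recognise a run of bytes that needs UTF-8 decoding, extract exactly those bytes and decode them yourself. Since you only know the length in characters, you have to work out the byte count while reading: in UTF-8, the first byte of each encoded character tells you how many bytes that character occupies.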
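A minimal sketch of that approach, under some assumptions: `readUtf8` and `Utf8Chunks` are hypothetical names (not part of the JDK), "characters" is taken to mean Unicode code points rather than Java `char` values, and the input is assumed to be well-formed UTF-8. The helper reads one byte at a time from the raw stream, so it never consumes anything beyond the text. Wrap the source in a `BufferedInputStream` and use that same stream for the binary reads if single-byte reads are too slow.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class Utf8Chunks {

    // Hypothetical helper: reads exactly `codePoints` UTF-8 encoded code points
    // from `in` and leaves the stream positioned on the byte right after them.
    static String readUtf8(InputStream in, int codePoints) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (int i = 0; i < codePoints; i++) {
            int lead = in.read();
            if (lead < 0) {
                throw new EOFException("stream ended inside text");
            }
            buf.write(lead);

            // The lead byte encodes how many continuation bytes follow.
            int extra;
            if ((lead & 0x80) == 0x00) {
                extra = 0;              // 0xxxxxxx: 1-byte sequence (ASCII)
            } else if ((lead & 0xE0) == 0xC0) {
                extra = 1;              // 110xxxxx: 2-byte sequence
            } else if ((lead & 0xF0) == 0xE0) {
                extra = 2;              // 1110xxxx: 3-byte sequence
            } else if ((lead & 0xF8) == 0xF0) {
                extra = 3;              // 11110xxx: 4-byte sequence
            } else {
                throw new IOException("invalid UTF-8 lead byte: " + lead);
            }

            for (int j = 0; j < extra; j++) {
                int b = in.read();
                if (b < 0) {
                    throw new EOFException("stream ended inside a UTF-8 sequence");
                }
                buf.write(b);
            }
        }
        // Decode only the bytes that belong to the text.
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // One binary byte, three code points ("h", "é", "€"), one binary byte.
        byte[] data = {
            0x01,
            'h', (byte) 0xC3, (byte) 0xA9, (byte) 0xE2, (byte) 0x82, (byte) 0xAC,
            0x7F
        };
        InputStream in = new ByteArrayInputStream(data);

        int header = in.read();          // binary part
        String text = readUtf8(in, 3);   // text part, length given in characters
        int trailer = in.read();         // binary part continues undisturbed

        System.out.println(header + " " + text + " " + trailer);
    }
}
```

If your length field actually counts UTF-16 code units (Java `char` values), a 4-byte sequence would count as two characters, and the loop counter would need to advance by two for those sequences.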