I am seeing odd behavior with Scanner. With a particular set of files, it works when I use the Scanner(FileInputStream) constructor, but not when I use the Scanner(File) constructor.
Case 1: Scanner(File)
Scanner s = new Scanner(new File("file"));
while (s.hasNextLine()) {
    System.out.println(s.nextLine());
}
Result: no output
Case 2: Scanner(FileInputStream)
Scanner s = new Scanner(new FileInputStream(new File("file")));
while (s.hasNextLine()) {
    System.out.println(s.nextLine());
}
Result: the file content outputs to the console.
The input file is a java file containing a single class.
I double checked programmatically (in Java) that:
- the file exists,
- is readable,
- and has a non-zero filesize.
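For reference, checks of this kind could look like the sketch below. The class name FileSanityCheck and the describe helper are made up for illustration; only File.exists(), File.canRead() and File.length() come from the standard library.

```java
import java.io.File;

class FileSanityCheck {

    // Summarizes the three properties listed above for the given file.
    // (Hypothetical helper, not part of the original test code.)
    static String describe(File f) {
        return "exists=" + f.exists()
                + ", readable=" + f.canRead()
                + ", size=" + f.length();
    }

    public static void main(String[] args) {
        // Falls back to the file name used in the question if no argument is given.
        File f = new File(args.length > 0 ? args[0] : "file");
        System.out.println(describe(f));
    }
}
```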
Scanner(File) normally works for me in this situation, so I am not sure why it doesn’t now.
hasNextLine() calls findWithinHorizon(), which in turn calls findPatternInBuffer(), searching for a match of the line terminator pattern defined as
.*(\r\n|[\n\r\u2028\u2029\u0085])|.+$
The strange thing is that, with both ways of constructing a Scanner (from a FileInputStream or from a File), findPatternInBuffer() returns a positive match if the file contains, for instance, the 0x0A line terminator, regardless of file size. But if the file contains a non-ASCII character (i.e. a byte above 0x7F), the FileInputStream variant returns true while the File variant returns false.
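One way to see what is going on is to ask the Scanner for the IOException it swallowed, via Scanner.ioException(). The sketch below is an illustration rather than the original test: the class ScannerDiagnosis and its helpers are hypothetical, and it passes "UTF-8" explicitly (an assumption) so that the result does not depend on the platform default charset.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.Scanner;

class ScannerDiagnosis {

    // Writes the given bytes to a temporary file (demo helper only).
    static File writeTemp(byte[] bytes) {
        try {
            File f = File.createTempFile("scanner-demo", ".txt");
            f.deleteOnExit();
            Files.write(f.toPath(), bytes);
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads every line, then reports the line count and the exception
    // (if any) that the Scanner swallowed while reading its source.
    static String drain(Scanner s) {
        int lines = 0;
        while (s.hasNextLine()) {
            s.nextLine();
            lines++;
        }
        IOException swallowed = s.ioException();
        s.close();
        String name = (swallowed == null) ? "none" : swallowed.getClass().getSimpleName();
        return lines + " line(s), ioException=" + name;
    }

    // Case 1 from the question, with an explicit charset name.
    static String viaFile(File f, String charsetName) {
        try {
            return drain(new Scanner(f, charsetName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Case 2 from the question, with an explicit charset name.
    static String viaStream(File f, String charsetName) {
        try {
            return drain(new Scanner(new FileInputStream(f), charsetName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // 'a', then 0xE9 (not a valid UTF-8 sequence on its own), then a newline.
        File f = writeTemp(new byte[] {'a', (byte) 0xE9, '\n'});
        System.out.println("Scanner(File):            " + viaFile(f, "UTF-8"));
        System.out.println("Scanner(FileInputStream): " + viaStream(f, "UTF-8"));
    }
}
```

If the File variant reports zero lines together with a non-null ioException(), a decoding error in the input is a likely culprit.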
Very simple test case:
Create a file that contains just the character “a”.
Now edit the file with hexedit to:
The Java test code contains nothing beyond what is already in the question:
So it turns out this is a charset issue. In fact, changing the test to:
we get:
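As a hedged illustration of the charset point (this is not the original test code), the sketch below passes an explicit charset name to the Scanner(File, String) constructor. The class ExplicitCharsetScanner, its helpers, and the choice of "ISO-8859-1" are assumptions; the right charset is whatever the file was actually written in.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ExplicitCharsetScanner {

    // Writes the given bytes to a temporary file (demo helper only).
    static File writeTemp(byte[] bytes) {
        try {
            File f = File.createTempFile("charset-demo", ".txt");
            f.deleteOnExit();
            Files.write(f.toPath(), bytes);
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads all lines using the named charset. Instead of silently
    // returning nothing, it rethrows any IOException the Scanner swallowed.
    static List<String> readLines(File file, String charsetName) {
        List<String> lines = new ArrayList<>();
        try (Scanner s = new Scanner(file, charsetName)) {
            while (s.hasNextLine()) {
                lines.add(s.nextLine());
            }
            IOException swallowed = s.ioException();
            if (swallowed != null) {
                throw new UncheckedIOException(swallowed);
            }
        } catch (IOException e) {
            // Thrown by the constructor if the file cannot be opened.
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // 'a' followed by 0xE9, which is 'é' in ISO-8859-1.
        File f = writeTemp(new byte[] {'a', (byte) 0xE9, '\n'});
        System.out.println(readLines(f, "ISO-8859-1"));
    }
}
```

Surfacing ioException() is a design choice in this sketch: it turns the silent "no output" case into a visible error when the charset does not match the file.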