This is an Erlang question.
I have run into some unexpected behavior by io:fread.
I was wondering if someone could check whether there is something wrong with the way I use io:fread or whether there is a bug in io:fread.
I have a text file which contains a ‘triangle of numbers’ as follows:

    59
    73 41
    52 40 09
    26 53 06 34
    10 51 87 86 81
    61 95 66 57 25 68
    90 81 80 38 92 67 73
    30 28 51 76 81 18 75 44
    ...
There is a single space between each pair of numbers and each line ends with a carriage-return new-line pair.
I use the following Erlang program to read this file into a list.

    -module(euler67).
    -author('Cayle Spandon').
    -export([solve/0]).

    solve() ->
        {ok, File} = file:open('triangle.txt', [read]),
        Data = read_file(File),
        ok = file:close(File),
        Data.

    read_file(File) ->
        read_file(File, []).

    read_file(File, Data) ->
        case io:fread(File, '', '~d') of
            {ok, [N]} -> read_file(File, [N | Data]);
            eof -> lists:reverse(Data)
        end.

The output of this program is:

    (erlide@cayle-spandons-computer.local)30> euler67:solve().
    [59,73,41,52,40,9,26,53,6,3410,51,87,86,8161,95,66,57,25,
     6890,81,80,38,92,67,7330,28,51,76,81|...]

Note how the last number of the fourth line (34) and the first number of the fifth line (10) have been merged into a single number 3410.
When I dump the text file using ‘od’ there is nothing special about those lines; they end with cr-nl just like any other line:

    > od -t a triangle.txt
    0000000 5 9 cr nl 7 3 sp 4 1 cr nl 5 2 sp 4 0
    0000020 sp 0 9 cr nl 2 6 sp 5 3 sp 0 6 sp 3 4
    0000040 cr nl 1 0 sp 5 1 sp 8 7 sp 8 6 sp 8 1
    0000060 cr nl 6 1 sp 9 5 sp 6 6 sp 5 7 sp 2 5
    0000100 sp 6 8 cr nl 9 0 sp 8 1 sp 8 0 sp 3 8
    0000120 sp 9 2 sp 6 7 sp 7 3 cr nl 3 0 sp 2 8
    0000140 sp 5 1 sp 7 6 sp 8 1 sp 1 8 sp 7 5 sp
    0000160 4 4 cr nl 8 4 sp 1 4 sp 9 5 sp 8 7 sp

One interesting observation is that some of the numbers for which the problem occurs happen to sit on a 16-byte boundary in the text file (but not all of them; 6890, for example, does not).
I’m going to go with it being a bug in Erlang, too, and a weird one. Changing the format string to ‘~2s’ gives equally weird results:
So it appears that it’s counting a newline character as a regular character for the purposes of counting, but not when it comes to producing the output. Loopy as all hell.
A week of Erlang programming, and I’m already delving into the source. That might be a new record for me…
EDIT
A bit more investigation has confirmed for me that this is a bug. Calling one of the internal functions that’s used in fread:
Basically, if there are multiple values to be read, followed by a newline, the first newline gets eaten in the ‘still to be read’ part of the string. Other testing suggests that if you prepend a space it’s OK, and if you lead the string with a newline it asks for more.
I’m going to get to the bottom of this, gosh-darn-it… (grin) There’s not that much code to go through, and not much of it deals specifically with newlines, so it shouldn’t take too long to narrow it down and fix it.
EDIT^2
HA HA! Got the little blighter.
Here’s the patch to the stdlib that you want (remember to recompile and drop the new beam file over the top of the old one):
Now to submit my patch to erlang-patches, and reap the resulting fame and glory…
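If patching stdlib is not an option, one possible workaround is to avoid io:fread altogether: read the file in one go and split it on whitespace yourself. The following is only a minimal sketch under that assumption, not the patch mentioned above. The module name triangle_reader and both function names are made up for illustration, and it assumes the file is small enough to hold in memory and contains nothing but decimal integers separated by spaces and cr-nl pairs.

```erlang
-module(triangle_reader).   % hypothetical module name, not part of the original program
-export([read_numbers/1, parse_numbers/1]).

%% Read the whole file into memory and return every integer in it, in order.
%% Assumes the file exists and is readable; crashes with a badmatch otherwise.
read_numbers(Path) ->
    {ok, Bin} = file:read_file(Path),
    parse_numbers(binary_to_list(Bin)).

%% Split the text on spaces, carriage returns and newlines, then convert
%% each token. string:tokens/2 treats every character in the separator
%% string individually and never returns empty tokens, so a cr-nl pair
%% or a trailing space is handled the same way as a single space.
parse_numbers(Text) ->
    [list_to_integer(Token) || Token <- string:tokens(Text, " \r\n")].
```

With this, triangle_reader:read_numbers("triangle.txt") should return the flat list the original solve/0 was meant to produce; leading zeros such as 09 are accepted by list_to_integer/1 and come out as 9.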