I’m writing a function in Clojure that will process a file character by character. I know that Java’s BufferedReader class has a read() method that reads one character, but I’m new to Clojure and not sure how to use it. For now, I’m just trying to read the file line by line and then print each character.
(defn process_file [file_path]
  (with-open [reader (BufferedReader. (FileReader. file_path))]
    (let [seq (line-seq reader)]
      (doseq [item seq]
        (let [words (split item #"\s")]
          (println words))))))
Given a file with this text input:
International donations are gratefully accepted, but we cannot make
any statements concerning tax treatment of donations received from
outside the United States. U.S. laws alone swamp our small staff.
My output looks like this:
[International donations are gratefully accepted, but we cannot make]
[any statements concerning tax treatment of donations received from]
[outside the United States. U.S. laws alone swamp our small staff.]
Though I would expect it to look like:
["international" "donations" "are" .... ]
So my question is, how can I convert the function above to read character by character? Or even, how to make it work as I expect it to? Also, any tips for making my Clojure code better would be greatly appreciated.
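On the "make it work as I expect" part: assuming split in the question is clojure.string/split, the lines are probably already being split into words. println prints strings without their quotes, so a vector of words looks like one unquoted line, while prn prints the readable form. A minimal sketch of that difference (line->words is a hypothetical helper name, and the lower-casing is only there to match the expected output shown above):

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: lower-case a line and split it on runs of whitespace.
(defn line->words
  [line]
  (str/split (str/lower-case line) #"\s+"))

;; println drops the quotes around strings:
(println (line->words "International donations are"))
;; prints [international donations are]

;; prn keeps them, which matches the expected output:
(prn (line->words "International donations are"))
;; prints ["international" "donations" "are"]
```

So swapping println for prn in the question's function may be all that is needed for the word output.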
I prefer this way to get a reader in Clojure. And by "character by character", do you mean at the file-access level, like read, which lets you control how many bytes to read?

Edit
As @deterb pointed out, let’s check the source code of line-seq. Based on it, I faked a char-seq. I know this char-seq reads all chars into memory [1], but I think it shows that you can call read on BufferedReader directly. So you can write your code the same way. What do you think?
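A minimal sketch of the char-seq idea, not necessarily the exact code meant above. line-seq in clojure.core calls .readLine and conses the result onto a lazy recursive call; this version has the same shape but calls .read, which returns an int, or -1 at the end of the stream. Getting the reader through clojure.java.io/reader is an assumption here, and process-file with its callback f is a hypothetical helper for illustration.

```clojure
(require '[clojure.java.io :as io])

(defn char-seq
  "Returns the characters readable from rdr as a lazy sequence."
  [^java.io.Reader rdr]
  (lazy-seq
    (let [c (.read rdr)]            ; int code of the next char, or -1 at EOF
      (when-not (neg? c)
        (cons (char c) (char-seq rdr))))))

;; Hypothetical helper: call f on every character of the file.
(defn process-file
  [file-path f]
  (with-open [rdr (io/reader file-path)]   ; io/reader returns a BufferedReader
    ;; doseq consumes the whole sequence before with-open closes the reader
    (doseq [c (char-seq rdr)]
      (f c))))

;; Usage, assuming a file named input.txt exists:
;; (process-file "input.txt" println)
```

The doseq sits inside with-open on purpose: a lazy sequence that escapes the with-open body would try to read from a reader that has already been closed.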
[1] According to @dimagog’s comment, char-seq does not read all chars into memory, thanks to lazy-seq.
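A small check of the laziness point in [1], as a sketch under the assumption that char-seq is written with lazy-seq roughly as below (the definition is repeated so the snippet runs by itself, and a StringReader stands in for a file). Taking five characters should perform only five reads, so the next direct read still sees the sixth character.

```clojure
(defn char-seq
  "Returns the characters readable from rdr as a lazy sequence."
  [^java.io.Reader rdr]
  (lazy-seq
    (let [c (.read rdr)]
      (when-not (neg? c)
        (cons (char c) (char-seq rdr))))))

(def rdr (java.io.StringReader. "hello world"))

;; Realize only the first five characters.
(def first-five (doall (take 5 (char-seq rdr))))

;; The reader has not been drained: the next read is the space after "hello".
(def next-char (char (.read rdr)))
```

Here first-five holds the five characters of "hello" and next-char is the space, which suggests the rest of the input was never pulled into memory.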