I wrote a binary search function as part of a larger program, but it seems to be slower than it should be and profiling shows a lot of calls to methods in clojure.lang.Numbers.
My understanding is that Clojure can use primitives when it can determine that it is safe to do so. The calls to the methods in clojure.lang.Numbers seem to indicate that it’s not using primitives here.
If I coerce the loop variables to ints, it properly complains that the recur arguments are not primitive. If I coerce those too, the code works again, but it’s still slow. My only guess is that (quot (+ low-idx high-idx) 2) is not producing a primitive, but I’m not sure where to go from here.
This is my first program in Clojure, so feel free to let me know if there are cleaner, more functional, or more idiomatic Clojure ways to do something.
(defn binary-search
  [coll coll-size target]
  (let [cnt (dec coll-size)]
    (loop [low-idx 0 high-idx cnt]
      (if (> low-idx high-idx)
        nil
        (let [mid-idx (quot (+ low-idx high-idx) 2)
              mid-val (coll mid-idx)]
          (cond
            (= mid-val target) mid-idx
            (< mid-val target) (recur (inc mid-idx) high-idx)
            (> mid-val target) (recur low-idx (dec mid-idx))))))))
(defn binary-search-perf-test
  [test-size]
  (do
    (let [test-set (vec (range 1 (inc test-size)))
          test-set-size (count test-set)]
      (time (count (map #(binary-search test-set test-set-size %) test-set))))))
First of all, you can use the binary search implementation provided by java.util.Collections, passing compare as the comparator. If you skip the compare, the search will be faster still, unless the collection includes bigints, in which case it’ll break.

As for your pure Clojure implementation, you can hint coll-size with ^long in the parameter vector, or maybe just ask for the vector’s size at the beginning of the function’s body (that’s a very fast, constant-time operation), replace the (quot ... 2) call with (bit-shift-right ... 1), and use unchecked math for the index calculations.

Even with some additional tweaks, a binary search written this way is still noticeably slower than the Java variant: binary-search seems to take about 25% more time than java-binsearch, the version that calls java.util.Collections.
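
The suggestions above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the exact code that was timed: the two-argument signature of binary-search, the local names low, high and mid, and the body of java-binsearch are all guesses at one reasonable way to apply the advice.

```clojure
;; Sketch only: names and structure are illustrative assumptions.

(defn java-binsearch
  "Delegates to java.util.Collections/binarySearch, using compare as the
  comparator. Returns a negative number when x is not present."
  [xs x]
  (java.util.Collections/binarySearch xs x compare))

(defn binary-search
  "Returns the index of target in the sorted vector coll, or nil if absent."
  [coll target]
  ;; count is a constant-time operation on vectors, so the size is read here
  ;; instead of being passed in as a separate argument.
  (loop [low 0
         high (unchecked-dec (count coll))]
    (if (> low high)
      nil
      ;; low + ((high - low) >> 1): a shift instead of quot, unchecked math
      ;; for the index arithmetic.
      (let [mid (unchecked-add low (bit-shift-right (unchecked-subtract high low) 1))
            mid-val (coll mid)]
        (cond
          (< mid-val target) (recur (unchecked-inc mid) high)
          (> mid-val target) (recur low (unchecked-dec mid))
          :else mid)))))
```

For example, (binary-search [0 1 2 3] 2) and (java-binsearch [0 1 2 3] 2) both return 2. For a missing element the first returns nil and the second a negative number, so the two are not drop-in replacements for each other.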