nREPL with laziness in Clojure

26 October 2013

I was using Clojure to process github repos for use in libsearch.nicksellen.co.uk and ran into a problem. I had a function that was going through each repository and pulling out the useful information. This is quite time consuming as I use multiple API calls per repository and so I wanted to print some output to show progress.

The problem was that I was only seeing output printed after the entire operation had finished.

I'm quite new to Clojure and there are multiple things that might have been involved:

Clojure uses lazy sequences in a lot of places - values are only calculated when needed
in functional languages side effects are kind of special, and writing to stdout is a side effect
I was running the code inside Emacs Live in a Leiningen REPL - that's a lot of layers in which to add behaviour

Show me the code!

I've removed all the specific code and reduced it to this single function which exhibits identical behaviour:

(defn run []
  (let [start (System/currentTimeMillis)
        ms #(- (System/currentTimeMillis) start)]
    (println (ms) "--- begin ---")
    (map
     #(do
        (println (ms) ">" %)
        (Thread/sleep 1000)
        (str "#" %))
     [1 2 3 4 5])))

Code explanation:

iterate over each number from 1 to 5 inclusive
for each number
- sleep for 1 second
- print out {timestamp} > {number}
- map value to "#{number}"
return mapped values

Executing this in the REPL gives me (I added the ; comments in after):

> (run)
0 --- begin ---
; 5 seconds pass
(1 > 1
1003 > 2
2003 > 3
3004 > 4
4006 > 5
"#1" "#2" "#3" "#4" "#5")

[all the above printed out in one go]

Why is it not printing out as we iterate?

A couple of things to notice:

the timestamps are as we would expect - they increment in 1 second steps
the parenthesis contain both the output from println as well as a result from the map function
the println must be running - if the sleep is running then surely the println before it runs right?

As I'm writing this code at Hacker Retreat there are people around to bounce thoughts off. After a chat with a couple of people I run it within the doall function:

> (doall (run))
0 --- begin ---
; one second passes
1 > 1
; one second passes
1002 > 2
; one second passes
2003 > 3
; one second passes
3004 > 4
; one second passes
4006 > 5
("#1" "#2" "#3" "#4" "#5")

It works! And the opening parenthesis no longer contain the println output.

But why?

Lets look what doall does (as Clojure is mostly written in Clojure we can just read the source - I've been using clojuredocs.org):

(defn doall
  "When lazy sequences are produced via functions that have side
  effects, any effects other than those needed to produce the first
  element in the seq do not occur until the seq is consumed. doall can
  be used to force any effects. Walks through the successive nexts of
  the seq, retains the head and returns it, thus causing the entire
  seq to reside in memory at one time."
  {:added "1.0"
   :static true}
  ([coll]
   (dorun coll)
   coll)
  ([n coll]
   (dorun n coll)
   coll))

this doesn't do very much, most of the work is done in dorun:

(defn dorun
  "When lazy sequences are produced via functions that have side
  effects, any effects other than those needed to produce the first
  element in the seq do not occur until the seq is consumed. dorun can
  be used to force any effects. Walks through the successive nexts of
  the seq, does not retain the head and returns nil."
  {:added "1.0"
   :static true}
  ([coll]
   (when (seq coll)
     (recur (next coll))))
  ([n coll]
   (when (and (seq coll) (pos? n))
     (recur (dec n) (next coll)))))

This just repeatedly calls next on our collection to realize the values from the lazy sequence. But why did that fix the problem?

The source doc explains this is a way to prevent execution being delayed until the seq is consumed, especially when side effects are involved. This sounds sort of related to my problem.

But hang on, my println calls are not being delayed (the timestamp, and the sleep call executing proved this earlier). But adding the doall does fix the issue so it must be related somehow, I'm not comfortable just throwing a dorun in there without really understand what is happening. We must go deeper.

Aha!

Then it came to me, I'm running in a REPL right? Read Evaluate Print Loop.

Here's a quick reminder of the code:

(defn run []
  (let [start (System/currentTimeMillis)
        ms #(- (System/currentTimeMillis) start)]
    (println (ms) "--- begin ---")
    (map
     #(do
        (println (ms) ">" %)
        (Thread/sleep 1000)
        (str "#" %))
     [1 2 3 4 5])))

Lets think what happens at each stage when we run (run) in the REPL.

Step 1: Read

The text is read and evaluated as a clojure data structure. Something like:

> (def code (read-string "(run)"))
#'user/a
> (type code)
clojure.lang.PersistentList
> (type (first code))
clojure.lang.Symbol

It's given us a list of length 1 containing an unevaluated symbol - run.

Step 2: Evaluate

When we evaluate the code we see the output from the first call to println:

> (def result (eval code))
0 --- begin ---
#'user/result

Our returned value is a LazySeq and it has not been realized:

> (type result)
clojure.lang.LazySeq
> (realized? result)
false

Step 3: Print

We have a sequence so it starts by printing (. Our sequence is lazy though so it needs to realize some values from it. It reads up to *print-length* (which defaults to everything). This is when Thread/sleep and println get called for each value.

> (prn result)
(2844 > 1
3845 > 2
4847 > 3
5848 > 4
6850 > 5
"#1" "#2" "#3" "#4" "#5")
nil

The output from println is intermingled with the values inside the sequence. BUT - this still hasn't recreated the problem, this printed out a line each second.

Step 4: Loop

Goes back to waiting for more input (not particularly interesting).

What have we learnt?

(added as comments in the code)

(defn run []
  (let [start (System/currentTimeMillis)
        ms #(- (System/currentTimeMillis) start)]
    (println (ms) "--- begin ---") ; this got run in the Evaluate phase
    (map
     #(do ; everything in here got run in the Print phase (but output was suppressed)
        (println (ms) ">" %) 
        (Thread/sleep 1000) 
        (str "#" %))
     [1 2 3 4 5])))

n for network

I'm using nREPL for my REPL - the n is for network. Code is typed, sent over the network to an nrepl server, evaluated, then returned. Understanding this is the key to solving the problem.

The Evaluate phase is performed in the nREPL server - and anything written to stdout during this phase is sent back over the network for display as it occurs.

The Print phase is also performed remotely - but output is collected and sent in one go.

Using doall causes the evaluation of the lazy-seq to happen in the Evaluate phase and so send back println output as it occurs.

Conclusion

It requires a surprisingly large amount of thought process and knowledge to investigate problems like this, and on the way I end up learning a lot → learning is you updating your mental models when they fail to predict reality.

I don't have comments enabled here, but you are welcome to contact me directly. View the homepage for different ways to do that.