clj.orcery

Language, Expression and Design

Tuesday

12

November 2013

You took 3 months to write a mutable array for clojure?

by Chris Zheng, on random, ova, arrays

Yes I did. The library is hosted here. It now also includes very comprehensive documentation. There are in total, 350 lines of code, which makes my average output for that period of time a whopping 3 lines per day.

The follow up question would then be: What took so long?

Answer: Simplicity


From the Documentation

An ova represents a mutable array of elements. It has been designed especially for dealing with shared mutable state in multi-threaded applications. Clojure uses refs and atoms off the shelf to resolve this issue but left out methods to deal with arrays of shared elements. ova has been specifically designed for the following use case:

  • Elements (usually clojure maps) can be added or removed from an array
  • Element data are accessible and mutated from several threads.
  • Array itself can also be mutated from several threads.

These type of situations are usally co-ordinated using a external cache store like redis. Ova is no where near as fully featured as these libraries. The actual ova datastructure is a ref containing a vector containing ref. The library comes with a whole bundle of goodies to deal with mutation:

  • Clean element selection and array manipulation syntax
  • Watches for both the array and array elements
  • Designed to play nicely with dosync and refs
  • Pure clojure

The library has been abstracted out of cronj, a task scheduling library where it is used to track and manipulate shared state. The ova syntax abstracts away alot of clutter. An example of tracking state in a multi-threaded environment can be seen in a scoreboard example:


Memoirs of an Clojure Newbie

I wasn't working on the library full time and I did way more reading/experimentation than actual productive coding. Having said that, ova was the first clojure library that I started writing. It used to be part of hara - a yet to be documented general purpose utilities library. I rewrote hara.ova about 5 or 6 times over a 3 month period but even after I had finished, it took me almost a year before I felt like I had garnered enough experience to be able come back to the library and rework the rough edges.

In revisiting the code, I extracted out hara.ova into a seperate ova.core project and it was only two weeks ago that I finished the documentation. The completion of ova marked the end of my phase as a newbie clojure developer. I am writing this it for a couple of reasons:

  • I am actively promoting ova on my blog because I believe that the library quite a very polished piece of work that provides an elegant datastructure for multithreaded applications. It solves a similar problem to megaref.
  • I want to go through some reoccuring themes that I have come across that are quite specific for clojure. Many the ideas implemented in this library are also implemented in my other libraries.
  • I also wish to write about the work I threw away that still has relevence to the future work. ova may be only 350 lines of code, but I spent months thinking about and experimenting with features that are in there. Much much more code was cut out than left in.
  • The sections are a little disjointed but should be taken as a whole. Combined, these are the main ideas that I thought about working on ova
    • Predicates as Datastructure
    • Clojure Protocols
    • CRUD Operations
    • Mixing with Java (Threw Away)
    • State Watching and Multithreading (Threw Away)
    • Macros

Predicates as Datastructure

I'm quite proud of the expressiveness of the ova syntax. We can see below the number of possible ways that we can select elements out of the array. Here is just a sample of what can be done using the select function.

(def ov (ova [{:val 1} {:val 2} {:val 3}
              {:val 4} {:val 5} {:val 6}
              {:val 7} {:val 8}]))

(select ov 0)                      ;; By Index
=> #{{:val 1}}

(select ov #{0 1})                 ;; By Set of Index
=> #{{:val 1} {:val 2}}

(select ov {:val 3})               ;; By Item
=> #{{:val 3}}

(select ov #{{:val 3} {:val 4}})   ;; By Set of Items
=> #{{:val 3} {:val 4}}

(select ov #(-> % :val even?))     ;; By Predicate
=> #{{:val 2} {:val 4}
     {:val 6} {:val 8}}

(select ov '(:val even?))          ;; By List
=> #{{:val 2} {:val 4}
     {:val 6} {:val 8}}

(select ov [:val 3])               ;; By Vector/Value
=> #{{:val 3}}

(select ov [:val #{1 2 3}])       ;; By Vector/Set
=> #{{:val 1} {:val 2} {:val 3}}

(select ov [:val '(< 4)])         ;; By Vector/List
=> #{{:val 1} {:val 2} {:val 3}}

(select ov [:val even?            ;; By Vector/Predicate/List
            :val '(> 4)])
=> #{{:val 6} {:val 8}}

All the read, update and delete functions in ova.core can use the above conventions. More conventions can be seen in the indices selection section. The concept of using sets to represent the or operator and vectors to represent the and operator has also been incorporated in ribol. Another novelty that I introduced was for the use of lists as predicates. In this way, simple functions can be represented as lists. This has the advantage of being able to serialize as edn. This type of notation has also been incorporated in adi.

Protocols

Having come from an object orientated background, Protocols were quite a hard concept to understand. Extending date protocols provided the best example of what they were and how to use them. After staring at the implementation of clj-time for the longest time, the concept finally clicked for me.

Some examples of using clojure methods on the ova can be seen here. In all, ova implements the following protocols:

  • java.lang.Object (toString)
  • clojure.lang.IFn (invoke)
  • clojure.lang.Seqable (seq)
  • clojure.lang.Counted (count)
  • clojure.lang.Indexed (nth)
  • clojure.lang.ILookup (valAt)
  • clojure.lang.ITransientVector (assocN)
  • clojure.lang.ITransientAssociative (assoc)
  • clojure.lang.ITransientCollection (conj)
  • clojure.lang.IRef (setValidator, getValidator, getWatches, addWatch, removeWatch)
  • clojure.lang.IDeref (deref)
  • ova.core.OvaProtocol (empty!, get-ref, clear-watches, add-elem-watch, remove-elem-watch, clear-elem-watches, get-filtered)

My experience with extending the built-in clojure protocols have not been that pleasant. Firstly, the protocols were not very well documented and it was quite a steep learning curve from the nice clojure syntax into java-interop code. I had to really dig through the source code to get anywhere. I relied on grep code to figure out all the interfaces that ova needed to extend. Another problem I had with using protocols was that they were much harder to debug that normal clojure functions. The errors were not as clear as they could have been.

All in all, better support for protocols would be most welcomed. Having said that, after having gone through extending clojure protocols once, it made life a lot easier when I was writing the purnam.type module. All the techniques I had learnt building ova were directly transferable to clojurescript and it turned out to be quite a fun experience.

CRUD

I learnt alot about CRUD and all its intricies whilst writing ova. It never occured to me that the implementation of all the functions (insert, select, update, delete) are so different to each other. Furthermore, CRUD on array elements is a very different beast to CRUD on map elements. ova implements an array of objects and so it was very interesting trying to figure out how best to implement a range of methods for CRUD using this datastructure. I implemented multiple CRUD interfaces:

  • CREATE (append! insert! concat!)
  • READ (select selectv <<)
  • UPDATE (map! smap! !! !>)
  • DELETE (empty! remove! filter!)

As I was developing ova, datomic came out. The experience I gained from implementing CRUD (especially thinking about how to manipulate nested maps) really help me with the developement of the semantics of adi.

Java

In it's earliest incarnation, ova was named dyna and the actual datastructure was implemented using gen-class. I wanted to try out the :gen-class and :aot features of a clojure project:

(ns hara.data.dyna-rec
  (:use [hara.fn :only [deref*]])
  (:require [clojure.string :as s])
  (:gen-class
   :name hara.data.DynaRec
   :prefix "-"
   :init init
   :constructors {[] []
                  [clojure.lang.Sequential] []}
   :state state
   :extends clojure.lang.AFn
   :implements [clojure.lang.IDeref
                clojure.lang.Seqable
                clojure.lang.ILookup
                clojure.lang.ITransientMap]
   :methods [[getRequired [] clojure.lang.Seqable]
             [setRequired [clojure.lang.Seqable] void]
             ]))

The next incarnation of was called eva which experimented with reimplementing clojure's atoms to use watch functions that take additional parameters. This was the first instance where I combined java and clojure in the same project.

  :java-source-paths ["src/java"]
  :source-paths ["src/clojure"]

Development wasn't as smooth as I would have liked as I had problems reloading classes that had been previously aot-compiled. The atom-like datastructure was called an evom and it usage looked like this:

(def ev (evom 1))

(add-watch ev :print 
  (fn [_ _ p n f & args]
    (println "Hello, " (apply f p args))))

(swap! ev + 1 2 3)
;; outputs "Hello 7"

I imagined that instead of just giving the watch function the previous and next values of the atom, the entire function with along with arguments are also passed when swap! was called. I wanted to experiment with having a whole network of these evoms that could potentially make decisions based on the function that was passed in. Later on, I watched Sussman's lecture on propagators and realised that this type of architecture required more thinking. kiran was my own attempt to implement the propagators architecture but the ideas still have to cook for a while before anything will come out of it.

The reason that I didn't continue with this paradigm was because I realised that I needed to use refs instead of atoms to allow synchronous transactions. There was a whole lot more code for writing a customised ref than for writing a customised atom and so I gave up and went back to just using good old clojure refs instead. The lastest incarnation does not use java aot compiled classes any more. The ova.core.Ova datastructure is just a deftype implementing a bunch of clojure protocols. Sometimes its the simple things in life that are often the best.

State Watching and Multithreading

Many of my ideas of how to test multithreaded applications have come about whilst working with ova. I also got help from stackoverflow.

Alot of the ideas are currently in the hara.state namespace. Some of them include:

add-change-watch

Adds a watch function that only triggers when there is change in the predicate.

(def subject (atom {:a 1 :b 2}))
(def observer (atom nil)
(add-change-watch subject :clone
    :b (fn [& _] (reset! observer @a)))

(swap! subject assoc :a 0)
@observer => nil

(swap! subject assoc :b 1)
@observer => {:a 0 :b 1}
latch

Latches two irefs together so that when master changes, the slave will also be updated

(def master (atom 1))
(def slave (atom nil))

(latch master slave #(* 10 %))
(swap! master inc)
@master ;=> 2
@slave ;=> 20
dispatch!

Updates the value contained within a ref or atom using another thread.

(dispatch! (atom 0)
            (fn [x] (Thread/sleep 1000)
                    (inc x)))
;=> <future_call>
wait-for

Waits for a long running multithreaded function to update the ref. Used for testing purposes

(defn slow-inc [n] (Thread/sleep 100000) (inc n))
(def atm (atom 1))
;; concurrent call
(def f #(dispatch! % slow-inc))
(def ret (wait-for f atm))

@atm ;=> 2
@ret ;=> 2

I haven't properly sat down to think about how to best package these functions yet but I now have the tools and semantics to test multi-threaded applications with clojure.

Macros

Two macros have been defined in ova. They are

(defmacro << [& forms]
  `(let [out# (dosync ~@forms)]
     (persistent! out#)))

(defmacro !>
  [ova pchk & forms]
  `(smap! ~ova ~pchk
          #(-> % ~@forms)))

Very simple I know but they were the first proper macros I wrote. I have since written alot more but I have to attribute everything back to these one here.

Conclusion

So that's it. Apologies for this post being quite random. I also felt this way when I was developing ova. Only when everything came together did I realise that it turned out to be something pretty decent. Please have a look at the docs as well as a play around with the walkthrough. I would love some feedback on what people think!

comments powered by Disqus