clj.orcery

Language, Expression and Design

Sunday, 10 November 2013

Immutability, time and testable task schedulers

by Chris Zheng, on testing, immutability

Immutability is great. Code written with such a principle is generally much easier to test and much more modular than code that is not. In The Joy of Clojure, immutability over time was compared to an animated flip book. Every page represented an instant of time and it was only by flipping through the pages that time was perceived.

Datomic showed that immutability was possible even for the database (something that we all once thought was impossible to make immutable). What Datomic gave us was the ability to 'time travel' through our data. The ability to go back and forth in time without affecting the present is a superpower that I never had before, so my brain is still getting used to writing programs that accommodate this type of behaviour. Even though more than a year has passed since Datomic was released, I am still learning new and amazing things as I work with this wonderful database.

What still amazes me is that the tests I could not write before for 'place-based' databases are now so easy on Datomic, and I can reason about my data in a much more concise and clear way. I took the same principle of immutability of data and reworked my task-scheduling library so that I could reason about tasks over time the same way that I can reason about data over time.

cronj

cronj was built for a project of mine back in 2012. The system needed to record video footage from multiple IP cameras in fifteen-minute blocks, as well as to save pictures from each camera (one picture every second). All saved files needed a timestamp to allow for easy file management and retrieval.

At that time, quartzite, at-at and monotony were the most popular options. After coming up with a list of design features and weighing up all the options, I decided to write my own instead. As a core component of the original project, cronj has been operational since October 2012. A couple of major rewrites and some API rejuggling were done, but the API has been very stable from version 0.6 onwards.

There are now many more scheduling libraries in the Clojure world.

With so many options, and so many different ways to define task schedules, why am I still glad that I wrote cronj?

Design

cronj was built around a concept of a task. A task has two components:

  • A handler (what is to be done)
  • A schedule (when it should be done)

Tasks are triggered by a scheduler, which in turn is notified of the current time by a timer. If a task is scheduled to run at that time, its handler is run in a separate thread.

Handler

A task handler is just a function taking two arguments t and opts:

   (fn [t opts]
      (... perform a task ...))

time

t represents the time at which the handler was called. This solves the problem of time synchronisation. For example, I may have three tasks scheduled to run at the same time:

  • perform a calculation and write the result to the database
  • perform an HTTP call and write the result to the database
  • load some files, write to single output then store file location to the database.

All these tasks will end at different times. To reason retrospectively about how all three tasks were synced, each handler is required to accept the triggered time t as an argument.

opts

opts is a hashmap, for example {:path "/app/videos"}. User customisations such as server addresses and filenames, along with job schedules, are usually specified at the top tier of the application, whilst handler logic usually sits in the middle tier. Having an extra opts argument allows for better separation of concerns and more readable code.
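As a rough illustration of the t and opts arguments, here is a minimal sketch of a handler-style function that derives a timestamped file path. The name snapshot-path, the filename pattern and the use of java.time.LocalDateTime for t are all assumptions made for this example; cronj passes whatever time object its timer produces.

```clojure
(import '(java.time LocalDateTime)
        '(java.time.format DateTimeFormatter))

;; Hypothetical handler body: builds the path for one camera snapshot.
;; Assumption: t is a java.time.LocalDateTime, chosen only to keep the
;; sketch free of external dependencies.
(defn snapshot-path
  [t opts]
  (let [fmt (DateTimeFormatter/ofPattern "yyyyMMdd-HHmmss")]
    ;; :path comes from the top tier of the application via opts,
    ;; the timestamp comes from the triggered time t.
    (str (:path opts) "/" (.format fmt t) ".jpg")))

(snapshot-path (LocalDateTime/of 2013 11 10 9 30 0)
               {:path "/app/videos"})
```

Because the result depends only on t and opts, the function can be called with any time and any configuration in a test, without a timer running.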

Simulations

The idea of simulation is not new but, surprisingly, task schedulers don't really have this option available. I got increasingly frustrated at how un-testable the video system was with the first incarnation of cronj, so I wrote a test harness for it. What might have taken 15 minutes of setup and debugging became a couple of lines of simulation code, usually taking less than 10 seconds to run through a day's worth of scheduled tasks.

A simple example is given below. We first define a task scheduler:

(def cj
  (cronj :entries
         [{:id "print-task"
           :handler (fn [t opts] (println (:output opts) ": " t))
           :schedule "/2 * * * * * *"
           :opts {:output "Hello There"}}]))

Calling start! on cj will start the timer, and the handler of print-task will be triggered every two seconds. Calling stop! on cj will stop all outputs.

(start! cj)

;; > Hello There :  #<DateTime 2013-09-29T14:42:54.000+10:00>

       .... wait 2 secs ...

;; > Hello There :  #<DateTime 2013-09-29T14:42:56.000+10:00>

       .... wait 2 secs ...

;; > Hello There :  #<DateTime 2013-09-29T14:42:58.000+10:00>

       .... wait 2 secs ...

;; > Hello There :  #<DateTime 2013-09-29T14:43:00.000+10:00>

(stop! cj)

Controlling Time

With simulations, we can write tests that run our scheduler at any point in time we choose. For instance, suppose we wish to test that our print-task handler is not affected by the Y2K bug. T1 and T2 are defined as start and end times:

(def T1 (local-time 1999 12 31 23 59 58))

(def T2 (local-time 2000 1  1  0  0 2))

We can simulate events by calling simulate on cj with a start and end time. The function triggers registered tasks beginning at T1, incrementing by 1 second each time until T2. Note that in this example, three threads are created for the print-task handler.

(simulate cj T1 T2)

;; > Hello There :  #<DateTime 1999-12-31T23:59:58.000+11:00>
;; > Hello There :  #<DateTime 2000-01-01T00:00:00.000+11:00>
;; > Hello There :  #<DateTime 2000-01-01T00:00:02.000+11:00>
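The stepping idea behind a simulation can be sketched in a few lines. This is not cronj's implementation: simulate-sketch, the :due? predicate (standing in for the schedule string) and the use of java.time.LocalDateTime are assumptions made for illustration, and the sketch runs handlers on the calling thread and collects their return values instead of spawning threads.

```clojure
(import '(java.time LocalDateTime))

;; Hypothetical stand-in for a scheduler entry: :due? replaces the
;; schedule string with a plain predicate on the time t.
(def print-task
  {:id      "print-task"
   :due?    (fn [t] (even? (.getSecond t)))
   :handler (fn [t opts] (str (:output opts) " : " t))
   :opts    {:output "Hello There"}})

(defn simulate-sketch
  "Steps from start to end (inclusive) in 1 second increments and
  returns the results of every handler that was due, in time order."
  [tasks start end]
  (for [t (take-while #(not (.isAfter % end))
                      (iterate #(.plusSeconds % 1) start))
        {:keys [due? handler opts]} tasks
        :when (due? t)]
    (handler t opts)))

;; Five time-points are visited; the task is due at three of them.
(simulate-sketch [print-task]
                 (LocalDateTime/of 1999 12 31 23 59 58)
                 (LocalDateTime/of 2000 1 1 0 0 2))
```

Since time is just a value handed to the handler, nothing in the sketch needs a real clock, which is what makes this kind of test fast and repeatable.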

Interval and Pause

Two other arguments for simulate and simulate-st are:

  • the time interval (in secs) between the current time-point and the next time-point (the default is 1)
  • the pause (in ms) to take in triggering the next time-point (the default is 0)

We can simulate the actual speed of outputs by keeping the interval at 1 and increasing the pause time to 1000ms:

(simulate cj T1 T2 1 1000)

;; > Hello There :  #<DateTime 1999-12-31T23:59:58.000+11:00>

       .... wait 2 secs ...

;; > Hello There :  #<DateTime 2000-01-01T00:00:00.000+11:00>

       .... wait 2 secs ...

;; > Hello There :  #<DateTime 2000-01-01T00:00:02.000+11:00>

Speeding Up

In the following example, the interval has been increased to 2 seconds whilst the pause time has decreased to 100ms. This results in a 20x increase in the speed of outputs.

(simulate cj T1 T2 2 100)

;; > Hello There :  #<DateTime 1999-12-31T23:59:58.000+11:00>

       .... wait 100 msecs ...

;; > Hello There :  #<DateTime 2000-01-01T00:00:00.000+11:00>

       .... wait 100 msecs ...

;; > Hello There :  #<DateTime 2000-01-01T00:00:02.000+11:00>

Being able to adjust these simulation parameters is a really powerful testing tool and saves an incredible amount of time in development. For example, we can quickly test the year-long output of a task that is scheduled to run once an hour by making the interval 3600 seconds and setting the pause to the length of time that the task takes to finish.
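The arithmetic behind these settings is simple enough to check directly. The helper names below are made up for this example, and the calculation ignores the time the handlers themselves take to run.

```clojure
(defn speedup
  "Simulated seconds covered per second of wall-clock time, given the
  interval in seconds and a non-zero pause in milliseconds."
  [interval-secs pause-ms]
  (/ (* interval-secs 1000) pause-ms))

(defn steps
  "Number of intervals needed to cover span-secs of simulated time."
  [span-secs interval-secs]
  (quot span-secs interval-secs))

(speedup 1 1000)              ;; => 1    (real-time playback)
(speedup 2 100)               ;; => 20   (the 20x case above)
(steps (* 365 24 3600) 3600)  ;; => 8760 (hourly steps in a 365-day year)
```

So a year of an hourly task is 8760 steps; with a pause of, say, 10ms per step, the pauses add up to under 90 seconds.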

Through simulations, task scheduling can now be tested, and entire systems become easier to manage and reason about.

More Features

There are many more features of cronj that can be found in its documentation, the most prominent being its fully featured thread management capabilities:

  • tasks can be triggered to start manually at any time.
  • tasks can start at the next scheduled time before the previous thread has finished running so that multiple threads can be running simultaneously for a single task.
  • pre- and post- hooks can be defined for better separation of setup/notification/cleanup code from the handler body.
  • running threads can be listed.
  • normal and abnormal termination:
    • kill a running thread
    • kill all running threads in a task
    • kill all threads
    • disable task but let running threads finish
    • stop timer but let running threads finish
    • shutdown timer, kill all running threads

Please have a play and let me know what you think!
