Something Same

Language, Expression and Design

Thursday, 13 February 2014

Making the Java Interop More Intuitive

by Chris Zheng

I have been working at Australia Post for the past 3 months. We currently have a big, monolithic Java code base (originally a million lines of code) that we are retrofitting with new functionality. It is in desperate need of a rewrite. The plan is to reimplement the entire system in Clojure.

Most of the team is relatively new to clojure and we still have a huge legacy system to maintain. However, a lot of progress has been made: so far, 5 out of 9 developers are coding in clojure, with another 2 about to start clojure-related projects this week. Despite the success of clojure's infiltration into a big government organisation, much work is still needed to understand and take apart the legacy code base. In preparation for the long haul ahead, I wanted to make my life as easy as possible when working with any code base. IntelliJ is a pretty amazing product, but having been so accustomed to working in emacs and in the repl, I had to have the following tools in order to make my java/clojure workflow more productive:

  1. a way to customize clojure.core with my own methods
  2. a way to quickly reload java code
  3. a way to inspect and debug class instances

I had previously blogged about how to complete requirements 1 and 2. vinyasa was the library I wrote to help do that. I had originally planned to put requirement 3 into the library, but as I was developing the functionality I realised that it was quite an undertaking and put it into a project of its own.

I am very happy to announce the release of iroh - a library to inspect, manipulate and game the jvm. The stable version is 0.1.5.

You can add it to your project.clj dependencies:

[im.chit/iroh "0.1.5"]

and then use it like this:

(use 'iroh.core)

However, I have it in my profiles.clj:

{:user {:dependencies [...
                        [im.chit/iroh "0.1.5"]
                       ...]
        :injections [(require 'vinyasa.inject)
                     (vinyasa.inject/inject 'clojure.core
                       [iroh.core .* .? .> .$ >ns >var])

                      .....]}}

Library Design

I just wanted to be able to have relatively simple commands to do class and instance inspection as well as to manipulate private and final fields of java instances - like a cross between clojure.reflect and wallhack-clj. However, I wanted the methods returned from my search to be documented, prettified AND executable. The use cases of this library are:

  • To explore the members of classes as well as all instances within the repl
  • To be able to test methods and functions that are usually not testable, or very hard to test:
    • Make hidden class members visible by providing access to private methods and fields
    • Make immutable class members flexible by providing ability to change final members (So that initial states can be set up easily)
  • Extract out class members into documented and executable functions (including multi-argument functions)
  • Better understand jvm security and how to dodge it if needed
  • Better understand the java type system as well as clojure's own interface definitions
  • To make working with java fun again

I don't want to copy and paste too much stuff from the readme, but there are essentially 6 macros:

>ns - for importing object elements into a namespace
>var - for importing elements into current namespace
.> - for showing type hierarchy
.? - for showing class elements
.* - for showing instance elements
.$ - for reflective invocation of objects

>var - Import as Var

We can extract methods from a class or interface with >var. For example, let's export the method clojure.lang.IPersistentMap.without as hash-without and clojure.lang.IPersistentMap.assoc as hash-assoc:

(>var hash-without [clojure.lang.IPersistentMap without]
      hash-assoc [clojure.lang.IPersistentMap assoc])

We can now look at the docs for hash-without:

(clojure.repl/doc hash-without)
;;=> -------------------------
;;   midje-doc.iroh-walkthrough/hash-without
;;   ([clojure.lang.PersistentArrayMap java.lang.Object])
;;   ------------------
;;
;;   member: clojure.lang.PersistentArrayMap/without
;;   type: clojure.lang.IPersistentMap
;;   modifiers: instance, method, public

We can also look at its string representation:

(str hash-without)
;; => "#[without :: (clojure.lang.PersistentArrayMap, java.lang.Object) -> clojure.lang.IPersistentMap]"

We can also execute the method on a map:

(hash-without {:a 1 :b 2} :a)
;; => {:b 2}

The same can be done on hash-assoc:

(str hash-assoc)
;; => "#[assoc :: (clojure.lang.IPersistentMap, java.lang.Object, java.lang.Object) -> clojure.lang.IPersistentMap]"

(hash-assoc {:a 1 :b 2} :c 3)
;; => {:a 1 :b 2 :c 3}
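
For comparison, plain clojure can already turn an instance method into a first-class function with memfn, though without the docstring and signature string shown above. This is only a rough sketch of the idea and not how >var is implemented; the name plain-without is made up for illustration:

```clojure
;; plain-without is a hypothetical name, not part of iroh.
;; memfn wraps the instance method `without` in an ordinary function;
;; the type hint on the method name avoids a reflective call.
(def plain-without
  (memfn ^clojure.lang.IPersistentMap without k))

(plain-without {:a 1 :b 2} :a)
;; => {:b 2}
```

The trade-off is that the function produced by memfn carries no information about where it came from, which is the gap >var is meant to fill.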

>ns - Import as Namespace

We can extract an entire class into a namespace. The members that get exported can be narrowed down with selectors. For example, let's export all private members of java.lang.String to the test.string namespace:

(>ns test.string String :private)
;; => [#'test.string/HASHING_SEED #'test.string/checkBounds
;;     #'test.string/hash #'test.string/hash32
;;     #'test.string/indexOfSupplementary
;;     #'test.string/lastIndexOfSupplementary
;;     #'test.string/serialPersistentFields #'test.string/serialVersionUID
;;     #'test.string/value]

We can now start using the exported functions as if they were clojure functions:

(seq (test.string/value "hello"))
;;=> (\h \e \l \l \o)
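
To give a feel for the mechanics, here is a rough sketch of interning methods as vars in a fresh namespace using only clojure.core and clojure.lang.Reflector. It assumes public methods only (so, unlike >ns, it cannot reach private members), and the names import-methods! and demo.string are hypothetical:

```clojure
;; import-methods! and demo.string are hypothetical names, not iroh's API.
;; Each method name becomes a var holding a function that calls the
;; method reflectively on its first argument.
(defn import-methods!
  [ns-sym method-names]
  (let [target (create-ns ns-sym)]
    (mapv (fn [^String mname]
            (intern target
                    (symbol mname)
                    (fn [obj & args]
                      (clojure.lang.Reflector/invokeInstanceMethod
                       obj mname (object-array args)))))
          method-names)))

(import-methods! 'demo.string ["length" "toUpperCase"])

(demo.string/length "hello")
;; => 5

(demo.string/toUpperCase "hello")
;; => "HELLO"
```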

.> - Type Hierarchy

.> will show the entire hierarchy of types. This is very useful for inspecting elements that we don't know the type of:

(.> 1)
;;=> [java.lang.Long
;;    [java.lang.Number #{java.lang.Comparable}]
;;    [java.lang.Object #{java.io.Serializable}]]

(.> "hello")
;;=> [java.lang.String
;;    [java.lang.Object #{java.lang.CharSequence
;;                        java.io.Serializable
;;                        java.lang.Comparable}]]

(.> {})
;;=> [clojure.lang.PersistentArrayMap
;;    [clojure.lang.APersistentMap #{clojure.lang.IObj
;;                                   clojure.lang.IEditableCollection}]
;;    [clojure.lang.AFn #{clojure.lang.MapEquivalence
;;                        clojure.lang.IHashEq
;;                        java.io.Serializable
;;                        clojure.lang.IPersistentMap
;;                        java.util.Map
;;                        java.lang.Iterable}]
;;    [java.lang.Object #{clojure.lang.IFn}]]
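
A similar walk up the superclass chain can be sketched with plain interop. The helper name type-chain is hypothetical, and the shape of its result is an assumption of this sketch: it pairs each class with the interfaces that class itself declares, which is not necessarily how .> groups them. The exact interface sets also vary between JDK versions:

```clojure
;; type-chain is a hypothetical helper, not part of iroh.
;; It follows getSuperclass until it runs out (Object's superclass is nil)
;; and records the interfaces declared directly on each class.
(defn type-chain
  [obj]
  (->> (class obj)
       (iterate (fn [^Class c] (.getSuperclass c)))
       (take-while some?)
       (mapv (fn [^Class c] [c (set (.getInterfaces c))]))))

(mapv first (type-chain 1))
;; => [java.lang.Long java.lang.Number java.lang.Object]
```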

.? - Class Exploration

.? holds the java view of the class declaration, staying true to the class and its members. This is typically a one-to-one mapping of the source code. There are many filters that can be used with .?:

  • regexes and strings for filtering of element names
  • symbols and classes for filtering of return type
  • vectors for filtering of input types
  • longs for filtering of input argument count
  • keywords for filtering of element modifiers
  • keywords for customization of return types

Get all methods in java.lang.String that start with "c". We use :name to list the name only:

(.? String  #"^c" :name)
;;=> ["charAt" "checkBounds" "codePointAt" "codePointBefore"
;;    "codePointCount" "compareTo" "compareToIgnoreCase"
;;    "concat" "contains" "contentEquals" "copyValueOf"]

Get all private, static fields of java.lang.String

(.? String :name :private :field :static)
;;=> ["HASHING_SEED" "serialPersistentFields" "serialVersionUID"]
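
For readers who want to see roughly what such filters involve, the two queries above can be approximated with clojure.reflect, which ships with clojure. This is a sketch under assumptions: member-names and field-names are hypothetical helpers, fields are recognised by the :type key that clojure.reflect puts on field entries, and the exact members returned depend on the JDK in use:

```clojure
(require '[clojure.reflect :as r])

;; member-names and field-names are hypothetical helpers, not iroh's API.

(defn member-names
  "Sorted, de-duplicated names of the members of cls that match re."
  [cls re]
  (->> (:members (r/reflect cls))
       (map (comp name :name))
       (filter #(re-find re %))
       distinct
       sort
       vec))

(defn field-names
  "Sorted names of the fields of cls that carry every flag in flags."
  [cls flags]
  (->> (:members (r/reflect cls))
       (filter #(contains? % :type))          ; only field entries have :type
       (filter #(every? (:flags %) flags))
       (map (comp name :name))
       sort
       vec))

(member-names String #"^c")               ; names such as "charAt", "concat"
(field-names String [:private :static])   ; names such as "serialVersionUID"
```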

We use the keyword :# to convert the list into an executable element (if there is only one in the list).

(def unsigned-str (.? Integer "toUnsignedString" :#))

(str unsigned-str)
;;=> "#[toUnsignedString :: (int, int) -> java.lang.String]"

(mapv #(unsigned-str 32 (inc %)) (range 6))
;;=> ["100000" "200" "40" "20" "10" "w"]
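
The executable element is, in spirit, a reflected method wrapped up as a function. A minimal sketch of that idea with plain java.lang.reflect follows. It uses the public method Integer.toString(int, int) as a stand-in, since the two-argument toUnsignedString above is a non-public member, and the names to-radix-string and radix-str are hypothetical:

```clojure
;; to-radix-string and radix-str are hypothetical names, not iroh's API.
;; Look up the public static method Integer.toString(int, int).
(def ^java.lang.reflect.Method to-radix-string
  (.getMethod Integer "toString"
              (into-array Class [Integer/TYPE Integer/TYPE])))

(defn radix-str
  "Calls Integer.toString(n, radix) reflectively."
  [n radix]
  ;; nil as the target because the method is static
  (.invoke to-radix-string nil (object-array [(int n) (int radix)])))

(radix-str 32 2)
;; => "100000"
```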

Or into a multi-element (if there is more than one). Note that the resulting function can take either a string or a byte array. "new" is used to reference a class constructor:

((.? String "new" :#) "hello")
;;=> "hello"

((.? String "new" :#) (byte-array (map byte "hello")))
;;=> "hello"

((.? String "new" :#) 1)
;;=> (throws Exception)

.* - Instance Exploration

.* lists members, but in a slightly different way to .?. Where .? holds the java view of the class declaration, .* holds the runtime view of objects and the methods that could be applied to an instance. .* will also look up the inheritance tree to fill in additional functionality. If the first argument is a Class, then it will show all static members (including constructors) for the Class. We can see the difference between .* operating on an instance and on a class:

(.* (String.) #"^c" :name)
;;=> ["charAt" "clone" "codePointAt" "codePointBefore" 
;;    "codePointCount" "compareTo" "compareToIgnoreCase" 
;;    "concat" "contains" "contentEquals"]

(.* String #"^c" :name)
;;=> ["cachedConstructor" "cannotCastMsg" "cast" "checkBounds" 
;;    "checkMemberAccess" "classRedefinedCount" "classValueMap" 
;;    "clearCachesOnClassRedefinition" "clone" "copyValueOf"]

Note that String is actually an instance of java.lang.Class, so it has all the methods of java.lang.Class, whilst (String.) is an instance of java.lang.String, so it has all the methods in its inheritance tree.
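
The same distinction shows up with plain interop. In this small sketch, public-method-names is a hypothetical helper that only sees public methods (so it returns less than .* does):

```clojure
;; public-method-names is a hypothetical helper, not part of iroh.
;; It lists the public methods available on the runtime class of obj.
(defn public-method-names
  [obj re]
  (->> (.getMethods (class obj))
       (map (fn [^java.lang.reflect.Method m] (.getName m)))
       (filter #(re-find re %))
       distinct
       sort
       vec))

(class (String.))  ;; => java.lang.String
(class String)     ;; => java.lang.Class

(public-method-names (String.) #"^c")  ; String methods such as "charAt"
(public-method-names String #"^c")     ; Class methods such as "cast"
```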

.$ - Member Application

A shorthand way of accessing members is provided by .$. In this example, there is a field in java.lang.String called value, which is normally private and final. We will first read the field and then overwrite it:

(def a "hello")

(.$ value a) ;;=> #<char[] [C@592133b0>

(.$ value a (char-array "world")) 
;;=> true (We have changed a normally immutable field)

(.$ value a) 
;;=> #<char[] [C@54a498b0> (The pointer is different)

a ;;=> "world" (And so is its value)  
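
For reference, reading and writing a non-public field with plain reflection looks roughly like the sketch below. It makes a few assumptions: it uses a hypothetical deftype called Counter rather than java.lang.String, because recent JDKs restrict reflective access to the internals of java.lang by default, and the field it changes is mutable rather than final:

```clojure
;; Counter, counter and n-field are hypothetical names for this sketch.
;; A mutable deftype field is not public, so normal interop cannot reach it.
(deftype Counter [^:unsynchronized-mutable n])

(def counter (Counter. 41))

;; Look the field up by name and lift the access check.
(def ^java.lang.reflect.Field n-field
  (doto (.getDeclaredField Counter "n")
    (.setAccessible true)))

(.get n-field counter)
;; => 41

(.set n-field counter 42)

(.get n-field counter)
;; => 42
```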

Final Thoughts

It's been really fun creating this library and I have learnt a ton about the jvm whilst doing this. The library is at https://github.com/zcaudate/iroh. Please have a play with it; I welcome all feedback. I'm beginning to understand the old phrase: The deeper we can look within, the further we will go. It applies to the jvm just as much as it applies to ourselves.
