Skip to content
HN On Hacker News ↗

jank now has its own custom IR

▲ 216 points 53 comments by DASD 6d ago HN discussion ↗

Pangram verdict · v3.3

We believe that this document is fully human-written

1 %

AI likelihood · overall

Human
100% human-written 0% AI-generated
SEGMENTS · HUMAN 6 of 6
SEGMENTS · AI 0 of 6
WORD COUNT 1,510
PEAK AI % 1% · §3
Analyzed
May 18
backend: pangram/v3.3
Segments scanned
6 windows
avg 252 words each
Distribution
100 / 0%
human / AI fraction
Verdict
Human
Pangram v3.3

Article text · 1,510 words · 6 segments analyzed

Human AI-generated
§1 Human · 0%

Good news, everyone! jank has a new custom intermediate representation (IR) and we're using it to optimize jank to compete with the JVM. We'll dive into more of that today, but first I want to say thank you to my Github sponsors and to Clojurists Together for sponsoring me this whole year. You all are helping a great deal. I am still searching for a way to continue working on jank full-time with an income which will cover rent and groceries, so if you've not yet chipped in a sponsorship, now's a great time!What is an intermediate representation (IR)?Compilers often represent programs as a more abstract set of instructions than a target CPU instruction set can afford. This has a few added benefits. Firstly, the program can be represented in a way which could later be lowered to different CPU architectures, such as x86_64 or arm64. Since intermediate representations are often higher level than CPU architectures, they can generally be more portable. Secondly, IRs can be specifically designed to represent the program in a way which makes writing certain optimizations easier, such as single static assigment (SSA) form. Finally, IR designers get to choose the level of abstraction of the IR to match the semantics they're aiming to represent, which can either make an IR more general or more specific to a certain language.There are many common popular IRs, such as the JVM's bytecode, the CLR's common intermediate language (CIL), GCC's GIMPLE, LLVM's IR, and so on. Some compilers may move the program through multiple IRs during compilation.Custom IR rationaleHistorically, jank has not been an optimizing compiler. We've delegated basically all of that work to LLVM, based on the C++ or LLVM IR which we would generate. However, LLVM IR works at a very low level, compared to Clojure. It has no concept of Clojure's vars, transients, persistent data structures, lazy sequences, and so on. Clojure's dynamism is granted by a great deal of both polymorphism and indirection, but this means LLVM has very few optimization opportunities when it's dealing with the LLVM IR from jank.The optimization work done previously on jank helped optimize its runtime, and the compiler itself, but less so the code being compiled by the compiler.

§2 Human · 1%

In the past two months, I have sought to change this.I wanted an IR which operated at the level of Clojure's semantics. This would be much higher level than LLVM IR and even much higher level than JVM's bytecode. Since I'm not building a general virtual machine (VM) or compiler platform, I don't need to generalize the IR for different languages. I can make jank's IR specifically tailored to jank, which gives us even more power for optimizations. As far as I know, no Clojure dialects have taken this step.Custom IR detailsI have written a reference for jank's IR in the jank book here. This reference is targeted at people who're working on jank itself, since I'm making no promises on the stability of jank's IR at this point. However, I will copy some of that here to illustrate jank's IR and help provide a mental model for what's to come. Let's examine this simple Clojure function.(defn greet [name] (if (= "jeaye" name) (println "Are you me?!") (println (str "Hello, " name "!"))))jank's IR is stored in memory as C++ data structures, but it is renderable to Clojure data for debugging and testing. This is not full serialization, since it cannot round-trip back into the jank compiler from the IR, due to all of the Clang AST internal data we have on hand. Let's take a look at the jank IR module for this function.{:name user_greet_82687 :lifted-vars {clojure_core_SLASH_str_82694 clojure.core/str clojure_core_SLASH_println_82691 clojure.core/println clojure_core_SLASH__EQ__82689 clojure.core/=} :lifted-constants {const_82693 "!" const_82692 "Hello, " const_82690 "Are you me?!"

§3 Human · 1%

const_82688 "jeaye"} :functions [{:name user_greet_82687_1 :blocks [{:name entry :instructions [{:name greet :op :parameter :type "jank::runtime::object_ref"} {:name name :op :parameter :type "jank::runtime::object_ref"} {:name v3 :op :literal :value "jeaye" :type "jank::runtime::obj::persistent_string_ref"} {:name v4 :op :var-deref :var clojure_core_SLASH__EQ__82689 :type "jank::runtime::object_ref"} {:name v5 :op :dynamic-call :fn v4 :args [v3 name] :type "jank::runtime::object_ref"} {:name v7 :op :truthy :value v5 :type "bool"} {:name v8 :op :branch :condition v7 :then if0 :else else1 :merge nil :shadow nil :type "void"}]} {:name if0 :instructions [{:name v9 :op :literal :value "Are you me?!" :type "jank::runtime::obj::persistent_string_ref"} {:name v10 :op :var-deref :var clojure_core_SLASH_println_82691 :type "jank::runtime::object_ref"} {:name v11 :op :dynamic-call :fn v10 :args [v9] :type "jank::runtime::object_ref"} {:name v12 :op :ret :value v11 :type "jank::runtime::object_ref"}]} {:name else1 :instructions [{:name v13 :op :literal :value "Hello, " :type "jank::runtime::obj::persistent_string_ref"} {:name v14 :op :literal :value "!" :

§4 Human · 1%

type "jank::runtime::obj::persistent_string_ref"} {:name v15 :op :var-deref :var clojure_core_SLASH_str_82694 :type "jank::runtime::object_ref"} {:name v16 :op :dynamic-call :fn v15 :args [v13 name v14] :type "jank::runtime::object_ref"} {:name v17 :op :var-deref :var clojure_core_SLASH_println_82691 :type "jank::runtime::object_ref"} {:name v18 :op :dynamic-call :fn v17 :args [v16] :type "jank::runtime::object_ref"} {:name v19 :op :ret :value v18 :type "jank::runtime::object_ref"}]}]}]}jank's IR is SSA-based, meaning that each name is only assigned once. This makes entire categories of optimizations much easier to reason about. jank's IR is also represented as a control flow graph (CFG), which is composed of one or more basic blocks, each with exactly one terminating instruction (branch, jump, throw, ret, etc).As we can see from the IR module, jank handles the lifting of vars and constants, and it has instructions at the level of Clojure's semantics, for dereferencing vars, calling functions, and so on. Let's take a look at the generated C++ from this IR.extern "C" jank::runtime::object_ref user_greet_19_1(jank::runtime::object_ref const greet, jank::runtime::object_ref name) { auto const v3(const_33); auto const v4(clojure_core_SLASH__EQ__34->deref()); auto const v5(jank::runtime::dynamic_call(v4, v3, name)); auto const v7(jank::runtime::truthy(v5)); if(v7) { auto const v9(const_35); auto const v10(clojure_core_SLASH_println_36->deref()); auto

§5 Human · 0%

const v11(jank::runtime::dynamic_call(v10, v9)); return v11; } else { auto const v13(const_37); auto const v14(const_38); auto const v15(clojure_core_SLASH_str_39->deref()); auto const v16(jank::runtime::dynamic_call(v15, v13, name, v14)); auto const v17(clojure_core_SLASH_println_36->deref()); auto const v18(jank::runtime::dynamic_call(v17, v16)); return v18; } }If you compare the C++ to the IR, you can immediately see the correlation. The C++ variables are named to match the IR variables. A var dereference just becomes a call to ->deref() on the var. A dynamic call just becomes a jank::runtime::dynamic_call. This is intentional.Optimizing the IRIt took about six weeks to design and implement the IR, including reworking our C++ code generation to generate from the IR instead of from jank's AST. At this point, we're not yet running any optimization passes on the IR. However, we have everything we need to start doing that. I wanted to prioritize getting the new IR pipeline merged, rather than building it out as much as possible, since six weeks is already a long time to be branched from main. Now that the IR is merged in, my approach will be to pick up one benchmark at a time and optimize it as needed until I'm satistified and/or cannot optimize it any further. Some of those optimizations will involve the IR directly, while others will not.If you're interested in more of the technical development side of this IR, there are a few videos on the jank TV YouTube channel from the various Twitch streams I did while working on the IR. These videos go way into the weeds of implementing things.With the new IR introduced, let's jump into optimizing our first benchmark: recursive fibonacci.InterludeBefore we proceed, please consider subscribing to jank's mailing list. This is going to be the best way to make sure you stay up to date with jank's releases, jank-related talks, workshops, and so on.

§6 Human · 1%

It's very low traffic.Optimizing recursive fibonacciOur first benchmark for this round of optimization is a recursive fibonacci implementation. It's just five lines long. Our goal is to have jank be at least as fast as Clojure JVM, if not faster, but we have to earn it.(defn fibonacci [n] (if (<= n 1) n (+ (fibonacci (- n 1)) (fibonacci (- n 2)))))This may seem like a trivial benchmark to optimize. You might wonder why would this be representative of a real world application. In reality, this benchmark covers some essential aspects of the compiler and runtime.Polymorphic arithmetic and relational predicates. Basically every program crunches numbers and needs to do so quickly.Recursion. Many popular algorithms, especially in Lisps, are recursive. Being able to handle these patterns efficiently is important.Garbage creation and collection. The trash truck may come every week, but that doesn't mean we should be generating as much garbage as we can.In general, the ability for the language runtime to get out of the way. If we're trying to calculate fibonacci numbers, we don't want anything showing up in the profiler which is unrelated to fibonacci numbers.As we optimize, throughout this post, consider these four categories and how each optimization we do can be categorized.Baseline fibonacci timingWe're going to use Clojure JVM to get our baseline benchmark numbers and then we'll aim to beat those numbers with jank.Note that all numbers in this post are measured on my five year old x86_64 desktop with an AMD Ryzen Threadripper 2950X on NixOS with OpenJDK 21. When I say "JVM" in this post, I mean OpenJDK 21.❯ clojure -Sdeps '{:deps {criterium/criterium {:mvn/version "0.4.6"}}}' Clojure 1.12.4 user=> (require '[criterium.core :refer [quick-bench]]) nil

user=> (defn