August 13, 2012

Zinc and Incremental Compilation

sbt is arguably the best tool for building Scala projects, and two of its key features are its incremental compiler and its interactive shell. An incremental compiler recompiles only what needs to be recompiled after a change. The interactive shell keeps the compiler warm between builds, and a warm compiler can compile your code twice as fast, or faster. If you’re committed to using a build tool other than sbt, for whatever reason, then you’ve generally been out of luck for fast incremental compilation… until now.

We’ve separated out sbt’s incremental compiler and made it more widely available through a build compiler called Zinc. Zinc has already been integrated with the Scala Maven Plugin and is currently being integrated with a new build tool developed at Twitter called Pants.

Zinc

Zinc is a stand-alone version of sbt’s incremental compiler. It comes with a command-line interface and built-in Nailgun integration. Zinc can be used as an alternative to scalac, adding incremental compilation, and as an alternative to fsc, serving as a build daemon.

Running with Nailgun turns Zinc into a server that receives commands from a client. The server keeps cached compilers in a warm, running JVM, which avoids startup and load times. This gives the same benefits and performance as sbt’s interactive shell.

Scala Maven Plugin

Compiling with Zinc has been added to version 3.1 of the Scala Maven Plugin, bringing incremental compilation for both Scala and Java to Maven. We’ve also added support for talking to a Zinc server, which can improve compilation speeds dramatically.

Example

As a realistic example I’ve taken the first few modules of Akka 2.0 and set them up as a Maven project. The modules are akka-actor, akka-testkit, and akka-actor-tests, with both Scala and Java sources, and akka-testkit has both main and test sources. I’m running this example on a MacBook Air.

Let’s first look at a regular clean compile of all the code, without incremental compilation:

% mvn clean test-compile

[INFO] akka ................ SUCCESS [1.088s]
[INFO] akka-actor .......... SUCCESS [43.187s]
[INFO] akka-testkit ........ SUCCESS [28.756s]
[INFO] akka-actor-tests .... SUCCESS [59.122s]
[INFO] ------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------
[INFO] Total time: 2:12.289s

Okay, that gives us a baseline.

Now let’s add incremental compilation. This is enabled by setting the recompileMode configuration option to incremental. Here’s a full clean compile with incremental enabled:

% mvn clean test-compile

[INFO] Compiling 69 Scala sources and 75 Java sources to akka-actor/target/classes...
[INFO] Compiling 9 Scala sources to akka-testkit/target/classes...
[INFO] Compiling 5 Scala sources and 1 Java source to akka-testkit/target/test-classes...
[INFO] Compiling 83 Scala sources and 9 Java sources to akka-actor-tests/target/test-classes...

[INFO] akka ................ SUCCESS [1.074s]
[INFO] akka-actor .......... SUCCESS [47.382s]
[INFO] akka-testkit ........ SUCCESS [32.702s]
[INFO] akka-actor-tests .... SUCCESS [1:18.782s]
[INFO] ------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------
[INFO] Total time: 2:40.081s

This takes a little longer because the incremental compiler analyzes the compilation so that it knows how the sources relate to each other. For this project it adds around 20% overhead for a full compile.
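
As a rough sketch, the recompileMode setting used for this run might look like the following in a pom.xml. The plugin coordinates are those of the Scala Maven Plugin. The exact version number and the choice of goals are assumptions made for illustration, not taken from the example project.

```xml
<!-- Sketch of a pom.xml fragment. The version and the goals below are
     assumptions for illustration; adjust them to match your own build. -->
<build>
  <plugins>
    <plugin>
      <groupId>net.alchim31.maven</groupId>
      <artifactId>scala-maven-plugin</artifactId>
      <version>3.1.0</version>
      <executions>
        <execution>
          <goals>
            <goal>compile</goal>
            <goal>testCompile</goal>
          </goals>
        </execution>
      </executions>
      <configuration>
        <!-- Compile through Zinc's incremental compiler instead of
             recompiling every source on each run -->
        <recompileMode>incremental</recompileMode>
      </configuration>
    </plugin>
  </plugins>
</build>
```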

Now that we’ve compiled everything, with the incremental analysis ready, let’s try changing just one of the tests in akka-actor-tests, in this case ActorRefSpec.scala, and recompile everything:

% mvn test-compile

[INFO] Compiling 1 Scala source to akka-actor-tests/target/test-classes...

[INFO] akka ................ SUCCESS [1.050s]
[INFO] akka-actor .......... SUCCESS [1.617s]
[INFO] akka-testkit ........ SUCCESS [0.461s]
[INFO] akka-actor-tests .... SUCCESS [13.604s]
[INFO] ------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------
[INFO] Total time: 16.878s

Perfect, only the changed file was recompiled.

What about a much more pervasive change? Let’s change the Actor trait, which will affect most of the codebase, and see what happens:

% mvn test-compile

[INFO] Compiling 1 Scala source to akka-actor/target/classes...
[INFO] Compiling 48 Scala sources and 4 Java sources to akka-actor/target/classes...
[INFO] Compiling 5 Scala sources to akka-testkit/target/classes...
[INFO] Compiling 5 Scala sources to akka-testkit/target/classes...
[INFO] Compiling 5 Scala sources to akka-testkit/target/test-classes...
[INFO] Compiling 65 Scala sources and 3 Java sources to akka-actor-tests/target/test-classes...
[INFO] Compiling 46 Scala sources and 4 Java sources to akka-actor-tests/target/test-classes...

[INFO] akka ................ SUCCESS [1.061s]
[INFO] akka-actor .......... SUCCESS [40.546s]
[INFO] akka-testkit ........ SUCCESS [36.294s]
[INFO] akka-actor-tests .... SUCCESS [1:45.205s]
[INFO] ------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------
[INFO] Total time: 3:03.249s

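So in this case the incremental recompile actually takes longer than a clean compile (3:03 against 2:40 with incremental compilation enabled), which is useful to know. The logging shows why: compilation starts with the initial invalidation of Actor.scala, and each subsequent compile is triggered by the changes from the one before, so some modules are compiled more than once.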

Nailing it

When compiling from sbt’s interactive shell, the JVM is warm and the classloader containing the Scala compiler is reused. This removes a lot of startup and load time. Zinc supports similar warmth by running as a Nailgun server, and the Scala Maven Plugin can act as a Zinc client. This is enabled, alongside the incremental recompileMode setting, by setting the useZincServer configuration option to true.
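
A minimal sketch of the two options together, assuming this element sits inside the scala-maven-plugin declaration in your pom.xml:

```xml
<!-- Goes inside the scala-maven-plugin <plugin> element -->
<configuration>
  <!-- Incremental compilation through Zinc -->
  <recompileMode>incremental</recompileMode>
  <!-- Send compile requests to a running Zinc server rather than
       compiling in the Maven JVM -->
  <useZincServer>true</useZincServer>
</configuration>
```

A Zinc server has to be running already for this to take effect.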

Let’s run the same three compiles as above, this time using a Zinc server: a clean compile, a change at one of the leaves of the source tree, and a fundamental change. First, we need to start Zinc:

% zinc -start

Now a full clean compile:

% mvn clean test-compile

[INFO] akka ................ SUCCESS [1.094s]
[INFO] akka-actor .......... SUCCESS [47.747s]
[INFO] akka-testkit ........ SUCCESS [12.196s]
[INFO] akka-actor-tests .... SUCCESS [48.081s]
[INFO] ------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------
[INFO] Total time: 1:49.261s

This is already faster than the earlier clean compiles, with or without incremental compilation, as the same Zinc compiler is reused across the four compile runs.

This will get faster again as the JVM warms up. Let’s try five more full clean compiles and then see what we have:

% mvn clean test-compile

[INFO] akka ................ SUCCESS [1.443s]
[INFO] akka-actor .......... SUCCESS [13.199s]
[INFO] akka-testkit ........ SUCCESS [6.268s]
[INFO] akka-actor-tests .... SUCCESS [33.751s]
[INFO] ------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------
[INFO] Total time: 54.890s

Nice! This is roughly a third of the time taken by the full clean compile with a cold incremental compiler (54.9 seconds against 2:40). The warm incremental compiler, after a handful of compiles, knocks nearly 60% off the time of the baseline clean compile (2:12).

Let’s try the change to one of the tests, ActorRefSpec.scala:

% mvn test-compile

[INFO] Compiling 1 Scala source to akka-actor-tests/target/test-classes...

[INFO] akka ................ SUCCESS [1.044s]
[INFO] akka-actor .......... SUCCESS [0.781s]
[INFO] akka-testkit ........ SUCCESS [0.223s]
[INFO] akka-actor-tests .... SUCCESS [2.910s]
[INFO] ------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------
[INFO] Total time: 5.096s

The total time is less than a third of the cold compile time (5.1 seconds against 16.9 seconds). The akka-actor-tests module, which contains the changed file, is compiled in 2.9 seconds compared with 13.6 seconds, which is more than four times faster.

And the change to the Actor trait, triggering widespread recompilation:

% mvn test-compile

[INFO] Compiling 1 Scala source to akka-actor/target/classes...
[INFO] Compiling 48 Scala sources and 4 Java sources to akka-actor/target/classes...
[INFO] Compiling 5 Scala sources to akka-testkit/target/classes...
[INFO] Compiling 5 Scala sources to akka-testkit/target/classes...
[INFO] Compiling 5 Scala sources to akka-testkit/target/test-classes...
[INFO] Compiling 65 Scala sources and 3 Java sources to akka-actor-tests/target/test-classes...
[INFO] Compiling 46 Scala sources and 4 Java sources to akka-actor-tests/target/test-classes...

[INFO] akka ................ SUCCESS [1.025s]
[INFO] akka-actor .......... SUCCESS [10.854s]
[INFO] akka-testkit ........ SUCCESS [8.779s]
[INFO] akka-actor-tests .... SUCCESS [57.491s]
[INFO] ------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------
[INFO] Total time: 1:18.289s

Using the Zinc server gives us sbt-like compilation from Maven, and a warm incremental compiler can compile code twice as fast, or even faster. Incremental compilation can also be used with the continuous compile (scala:cc) goal in the Scala Maven Plugin. For more information about using incremental compilation in Maven see the documentation for Scala Maven Plugin.

Pants

Pants is a build tool developed at Twitter and modeled after Google’s build system. It’s intended to support selective compilation within large, heterogeneous codebases with an explicitly modeled DAG structure. Pants is used internally at Twitter and at foursquare, and is also used to build the twitter/commons open-source project.

Until recently, Pants invoked the Scala compiler directly. If any file in a DAG node changed, the entire node and anything depending on it had to be rebuilt. Benjy Weinberger from foursquare is currently working on integrating Zinc with Pants, so that it too can enjoy the benefits of incremental Scala compilation:

The tricky part of this integration is the fact that Pants contains its own node-level invalidation logic. Pants also has options to both ‘flatten’ and ‘unflatten’ the build.

The advantage of the non-flat mode is that the compiler only has to bite off small chunks at a time. In small codebases this is slower than building everything in a single pass, because you incur the compiler startup overhead multiple times (although Zinc’s cached compilers can help with this). However, in large codebases it can be impractical to build everything in a single pass. Building a couple of thousand Scala source files may require 8GB or more of RAM and can take 20-30 minutes. Breaking the codebase down into a modular DAG, and applying Zinc’s incremental compilation to each node, supports effective builds and fast rebuilds even in a large codebase.

The combination of Pants and Zinc will allow companies like Twitter and foursquare to work productively in a single shared, large-scale codebase, even if any individual developer only builds a small subset of the codebase.

Summary

We’ve separated out sbt’s incremental compiler and made it available both stand-alone and ready for integration with other build tools. Using Zinc as a build daemon, similar to sbt’s interactive shell, can dramatically improve build times. Zinc has already been integrated with the Scala Maven Plugin, and we’re expecting incremental compilation to find its way into other build tools for Scala. Zinc is also being integrated with Pants, a build tool for large-scale codebases.

Enjoy!
