Crazy fast build times (Or when 10 seconds starts to make you nervous)

Firstly, this is not just my effort but that of my team and organisation.
Secondly, the usual rules apply: this is expert advice, and what works for me will very probably cause your whole production stack to blow up and kill people.
Lastly, this is long, so skip to the bit you care about, or come along and chat at QCon London 2012.

Well, let’s answer the obvious questions:
“Can this work for real apps or just fun stuff?” Yes – real apps, old and new.
“Does it need to be greenfield?” No, but it takes a long time (months) to get build times down on old codebases.

Let’s have some examples then (greenfield means it was built with speed in mind):

  1. 5+ year legacy app, down from 45 mins to 3+ minutes (this is the one we are focusing on at the moment – it has probably taken a year, on and off, to get this far).
  2. 3+ year legacy app, down from 20 mins to 3+ minutes (this took a couple of months)
  3. 1+ year greenfield app, creeping up to 12 seconds
  4. 6 month greenfield app/lib (used by number 1), creeping up to 15 seconds (warning lights are going off here)
  5. 2 year greenfield library (used by 1,2,3,4), down to 8 seconds having peeked at 15 seconds

Here are my rules of engagement:

  1. Nothing is safe: build tool, compiler, testing library, language, web container, messaging library, persistence layer, architecture, OS, code layout, team and process.
  2. Making your app / build fast is just like making your code testable: it will make you and your app better.
  3. Following on from 2: don’t split your tests into slow and fast, split your app into other apps or libs. Splitting the tests just ignores the problem, plus people will simply stop looking at the slow tests.
  4. Crazy is good

Build tool:

If you are using Maven (the binary) then you have no hope of ever being in the sub-10-second cool club, but even if you are in a Maven organisation you can use the Maven structure and produce Maven artifacts with your build while using a different tool (Make, sbt, buildr, Ant etc). Think of Maven as an output contract, not an implementation contract. You would have thought that Maven and Ivy would at least be fast at downloading libs, as they tend to download the internet all the time, but in fact they are incredibly slow (they download sequentially, which just blows my mind – this is partly due to the stupid transitive nature of those tools).

Compiling:

Incremental or clean? I truly wish incremental actually worked well; Make does a great job with production code, but it just can’t have the insight to understand the relationship between production and test code. The second problem with incremental is that I’m always going to do a clean build on CI, which tends to mean incremental is only good for non-pushing local builds. As I tend to push on most commits, it’s of little value to me. Finally, incremental compilation using javac (without Make doing the smarts) is actually slower than doing a clean compile (and jar) – I think mainly because it checks each file and class to see if it exists and has changed, while in clean mode it only checks to see if the file exists. I’m hoping to fix this soon.

So I use clean.

This leads me to the actual compiler. If compiling is slow (hey Scala (see footnote), that’s you I’m talking to) then you need to change compiler (fast scala compiler, sbt), but again these don’t work well in CI, as I always want a clean build (not incremental). Right now I’m investigating extending the Java 6 compiler to remove the unneeded file-exists checks and possibly jarring the class files directly in the compiler (the Java compiler supports writing directly to jars, but it’s not used for some reason).

Pro tip: always compile and jar in one step. This will actually speed up your build on a number of fronts: by putting the classes in a jar, when you compile your tests the jar is loaded into memory, so no more disk IO – instead of lots of tiny random reads, just one nice big fat read for the zip. Do the same for your test jar and your tests will load and run quicker as well. I have timed this a number of times on projects, and clean compile + jar + test is faster than incremental (with javac) compile and test with loose class files. Try it and see. (NB on large projects this might not hold, but that’s another problem.)
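To make the tip concrete, here is a minimal sketch using only the JDK (the class and path names are invented for illustration, not the build I actually use): compile with the javax.tools API, then immediately zip the output into a jar so the test classpath sees one jar instead of hundreds of loose class files.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.stream.Stream;

public class CompileAndJar {
    // Compile every .java file under srcDir into classesDir, then zip the
    // classes straight into jarFile -- compile and jar as one build step.
    public static void build(Path srcDir, Path classesDir, Path jarFile) throws IOException {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        List<String> args = new ArrayList<>(List.of("-d", classesDir.toString()));
        try (Stream<Path> files = Files.walk(srcDir)) {
            files.filter(p -> p.toString().endsWith(".java"))
                 .map(Path::toString)
                 .forEach(args::add);
        }
        if (javac.run(null, null, null, args.toArray(new String[0])) != 0)
            throw new IOException("compilation failed");
        try (JarOutputStream jar = new JarOutputStream(Files.newOutputStream(jarFile));
             Stream<Path> classes = Files.walk(classesDir)) {
            for (Path p : classes.filter(Files::isRegularFile).toList()) {
                // Jar entries always use forward slashes, whatever the OS.
                jar.putNextEntry(new JarEntry(classesDir.relativize(p).toString().replace('\\', '/')));
                Files.copy(p, jar);
                jar.closeEntry();
            }
        }
    }
}
```

The same effect is achievable in Ant or Make by chaining the javac and jar steps into a single target so nothing ever runs tests against loose class files.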

Testing:

Stop using crappy testing tools that need to parse some text and don’t support refactoring. This will kill your build on two fronts:

  1. Parsing, say, HTML and invoking Java/C# etc is going to be slow as hell in the short term.
  2. Because you can’t refactor the tests, they are going to become a big noose around your neck long term. I’m talking to you, Fit, FitNesse, Concordion.

Stop using slow acceptance-testing tools like Selenium: not only are they incredibly slow, but they are incredibly high maintenance for very little return on investment. (To be fair to Selenium, I believe you can run WebDriver with HtmlUnit directly off an InputStream.)
UPDATE: Selenium has an in process project that looks very promising https://github.com/aharin/inproctester

Instead use an in-memory (no TCP/HTTP) web testing tool. A lot of trendy web frameworks have support, and if yours doesn’t then maybe it’s time to look for one that does. If you can’t convert a String to a Request and back again then you are probably not using a web framework that is well written or well tested, let alone fast.
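To illustrate the shape of this (the Request type and handler below are hypothetical, not any particular framework’s API): a well-factored web app lets you build a request from a string and call the application directly, with no socket in sight.

```java
public class InMemoryWebTest {
    // A hypothetical minimal Request: just a method and a path,
    // parsed straight from a request-line string.
    record Request(String method, String path) {
        static Request parse(String rawRequestLine) {
            String[] parts = rawRequestLine.trim().split(" ");
            return new Request(parts[0], parts[1]);
        }
    }

    // The "application" is just a function from Request to a response body.
    interface Handler { String handle(Request request); }

    static final Handler app = request ->
        request.path().equals("/hello") ? "200 hello" : "404 not found";

    // The whole "acceptance test" round trip: no TCP, no container,
    // no HTML parsing -- microseconds instead of seconds.
    static String get(String path) {
        return app.handle(Request.parse("GET " + path + " HTTP/1.1"));
    }
}
```

If your framework can’t be driven this way, that is usually a sign its request/response types are welded to the container.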

If you have some production constraint that is slow (say Oracle) then separate the Oracle integration tests out from your acceptance tests. I create fast, production-quality implementations (in-memory, local disk, H2, HSQL, Lucene etc) that fulfil the same contract. By this I mean I have either an abstract contract test that all implementations inherit, or a parametrised test that takes a list of implementations. For more confidence you can have your CI run with Oracle and local runs with H2; I normally see things go 3-4x faster with this setup. I prefer to just use the in-memory collection-backed version, as this can give you up to a 10x speed-up. People always bitch about not using the exact production stack (blah blah blah), but I seem to be one of the few people actually measuring the cost/benefit.
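A sketch of the contract-test idea (the names are invented; in a real build this would typically be an abstract JUnit test class that each implementation’s test extends, shown here with plain assertions to stay self-contained): one set of assertions runs unchanged against every implementation of the interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class ContractTestSketch {
    // The contract that both the in-memory and the Oracle-backed
    // versions must honour.
    interface UserStore {
        void save(String id, String name);
        Optional<String> find(String id);
    }

    // Fast, production-quality in-memory implementation for local builds.
    static class InMemoryUserStore implements UserStore {
        private final Map<String, String> users = new HashMap<>();
        public void save(String id, String name) { users.put(id, name); }
        public Optional<String> find(String id) { return Optional.ofNullable(users.get(id)); }
    }

    // The contract itself: CI can pass an Oracle-backed store here,
    // local builds the in-memory one -- same assertions either way.
    static void contract(UserStore store) {
        store.save("1", "Alice");
        check(store.find("1").equals(Optional.of("Alice")));
        check(store.find("2").isEmpty());
    }

    static void check(boolean condition) {
        if (!condition) throw new AssertionError("contract violated");
    }
}
```

Any behavioural difference between implementations (like Oracle’s empty-string-is-null quirk mentioned below) becomes a new assertion in the contract, so it is caught once, cheaply, everywhere.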

Some fuzzy facts: on e4 (ramped up to 2 teams with maybe 20+ devs in total) we used the HSQL-locally, Oracle-on-CI setup. The build contained maybe 2000 tests (I don’t separate acceptance, unit etc). The local build was 56 seconds on Ubuntu, 1 minute 30 on Windows (same hardware); CI was 3 minutes 30 ish. We had one production Oracle issue caused by the difference between Oracle and HSQL (empty string == null in Oracle), as most were caught at CI. The bug took us at most one day to fix and release. I took some heat for it from the devs, but then I pulled out the calculator: we had been running for 3 months with 12 pairs doing 10+ builds a day each, so the faster builds were saving about 5 man-hours a day, or about 20 man-days over the 3 months. So: 1 day cost for 20 days saved. OK, there was some reputation risk, but I think hard-to-measure stuff like attitude and mean time to repair (when you don’t have a second team doing the same story with a full-stack build) was even better. Graham Brooks did a great presentation on this at one of the Agile conferences.

Language / Libraries / Web Containers etc

If it’s slow, ditch it.

Yup, I ditched Scala because of its terrible compile time. I actually re-wrote a couple of libraries/apps so I have the exact same code side by side in one language and the other. It’s scary: without incremental compilation (sbt) or long-running processes (fast scala compiler), the raw Scala compiler is 10x slower than Java. 10x!!! That’s just insane. With incremental it’s the same as a full clean Java compile. Not only is Scala slow to compile, but the byte code is 2x as big as Java’s for the same functionality. That must translate into a performance hit.

UPDATE: For byte code size please compare yadic-67.jar (last scala version) and yadic-68.jar (exact same functionality in java)

It’s not just Scala, mind:

Lucene is going the same way: between versions 3.3 and 3.4, if you don’t create a singleton for the IndexWriters and Readers, it’s 30x slower to instantiate. 30x!!! FFS. Now, I give you that in production you don’t create a new index 30 times a second, but for testing it’s a nightmare. At the same time, Lucene is so much faster than going across the wire to a remote DB, for obvious reasons.
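The workaround in test code is to pay the construction cost exactly once. A generic sketch of the pattern (using a hypothetical stand-in class rather than Lucene’s real IndexWriter, since the point is the construction cost, not the API):

```java
public class ExpensiveResourceHolder {
    // Stand-in for something like a Lucene IndexWriter: costly to
    // construct, cheap to reuse. The counter lets us prove it is
    // only ever built once.
    static class Expensive {
        static int constructions = 0;
        Expensive() { constructions++; /* imagine ~100ms of setup here */ }
    }

    // Initialisation-on-demand holder idiom: the JVM guarantees the
    // instance is created lazily, exactly once, thread-safely.
    private static class Holder {
        static final Expensive INSTANCE = new Expensive();
    }

    public static Expensive instance() { return Holder.INSTANCE; }
}
```

Every test then calls instance() instead of new, and the 30x instantiation penalty is paid once per JVM instead of once per test.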

The Hibernate SessionFactory can take seconds to start up. WTF – you are just reading some XML or annotations. So either ditch Hibernate or remove it from your acceptance tests (in-memory FTW).
UPDATE: Hibernate seems to take about 1.3 seconds to instantiate the SessionFactory class, even with less than 200 lines of HBM file.

Velocity is the same: if you don’t use its singleton it can take seconds to start.
UPDATE: I’ve done some more testing and Velocity does appear to be a little better these days (100ms to start). Not great but not terrible either

Tomcat / Jetty take seconds to start. OK, you can run Jetty embedded, which helps a lot, but nothing, yes nothing, is as fast as the embedded Java 6 HTTP server: 10ms from a cold JVM, <1ms warm.
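For anyone who hasn’t met it, that’s the com.sun.net.httpserver server bundled with the JDK since Java 6. A minimal sketch (the handler and response body are just for illustration):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;

public class TinyServer {
    // Start the JDK's built-in HTTP server on an OS-assigned port (port 0)
    // and return it; startup is typically milliseconds, not seconds.
    public static HttpServer start() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/", exchange -> {
            byte[] body = "pong".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) { out.write(body); }
        });
        server.start();
        return server;
    }
}
```

server.getAddress().getPort() gives the bound port for the test to hit, and server.stop(0) shuts it down between tests. Note it uses a small thread pool by default, so set an executor if you need concurrency.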

GWT is a big time sink on multiple fronts:

  1. The compile step is slow.
  2. Because you now have so much generated Javascript you feel you have to use tools like Selenium to acceptance test it.
  3. You will have so much more Java code than you really need…

Again I have numbers: on application number 2 we ripped out GWT and replaced it with some crappy MVC setup: compile time reduced by 30 seconds, acceptance testing reduced by nearly 20 minutes as we no longer needed to use the IE Selenium driver, and Java code reduced by 82%.

Spring – let’s just leave it at that. UPDATE: on a 5+ year old project, using XML config takes 14 seconds to start (this can be reduced by using the lazy attribute). A 1 year old project using annotations takes 1.3 seconds to start.

OS / Hardware

Windows disk IO is just poor, and for builds it’s a killer. As I said before, we saw a 50% increase in time using Windows (compared to Ubuntu) with exactly the same hardware and code.

MacOS is faster than Windows for sure, but Linux is fastest. We dual boot MacOSX and Ubuntu on my team, and we see a 15-20% improvement for exactly the same code and hardware. (These are bad-boy Mac Pro machines, not some Hackintosh.)

SSD / RAM disk: I don’t have exact numbers, but I saw around a 10% decrease in time on my laptop with a new SSD.

Divide and Conquer  / Architecture

As I said before, don’t split your test code, split your whole project. We have extracted libraries and applications from our legacy applications. What’s great is that we actually need fewer slow acceptance tests, because each library is more focussed and is tested at the API level, while the app only needs to test that the default config works. This really helps counter another common anti-pattern I see: people wanting to retest everything at the acceptance level.

If you force logic into a library then an acceptance test is actually an API test as there is no UI.

I think that’s most of it dumped.

*Notes*

Some compile times for an 8.5K LOC project rewritten from Scala to Java:

Scala

  • ant clean compile test package: 30 seconds
  • sbt clean compile test package: 56 seconds
  • sbt incremental compile test package: 6 seconds
  • buildr (clean compile test package): 37 seconds
  • maven2 clean verify: 42 seconds

Java

  • ant clean compile test package: 6.9~ seconds
  • make clean compile test package: 6.7~ seconds
  • make incremental compile test package: 6.4 seconds (changing one prod file and one test file)
  • make incremental compile test package: 3.8 seconds (changing one test file only)

25 Comments.

  1. Very nice summary. One of the reasons that integrated tests are a scam: long builds. I assume these projects don’t suffer from integration problems that could “only be found by end-to-end tests”.

  2. Oh yes, and I love the Open/Closed Principle bit about extracted libs from apps. Not enough people even think about doing this.

  3. This is an excellent point: no, my projects do suffer from end-to-end testing issues, and what one must always do is resist the urge to just repeat all your unit tests over again at the higher, more expensive, brittle layer. Ideally you have just one or two slow smoke-style tests that say “Hey, looks like you plugged me in correctly”, but that’s it.

    For me the value of the feedback is directly proportional to how quick and reliable it is. As Dan North said “More tests is not better, better tests is better”

  4. Some very good points (especially point 4 in the rules of engagement ;) ), but while you’re counting and saying “I seem to be one of the few people actually measuring the cost/benefit”, tell me how I benefit from faster build time in Java when:
    1. Scala devs can build stuff in Scala n x faster than in Java – it’s not only build time that counts, also dev time
    2. Scala devs can read code in Scala n x faster (maintenance is a bitch especially in a properly growing system, not some black-box-api-crap-that-is-designed-for-any-case-scenario-and-never-going-to-be-touched-again)
    3. Scala is going to improve over time, because it’s new and on the radar, so by ditching it now, you’re building a nice “legacy” Java system simply for having faster (relative) builds.
    Also point about producing bigger byte code – do you have Java vs Scala code comparison, which you used to make that claim?

    Anyway – great article, we will probably use some advice presented – keep on pushing the envelope !!!

  5. How do you handle integration errors? I write Contract Tests that encourage me to both clarify and simplify the contracts between layers. I only integrate with The Real Thing at the very boundary of the system, when I have to talk to stuff I don’t own.

  6. A Guy:

    I don’t really think Scala is actually faster for development per se. It’s really about the libraries you choose, and the tools you use. IDE support for Java is much better than Scala but like you say Scala (and tools) will improve.

    Now all of this is opinion and I don’t want this post to switch to being about what is better in one persons opinion.

    On the byte code size please see updated link in post.

    Thanks for taking the time to give feedback

  7. > How do you handle integration errors?

    Well I think the answer is definitely not repeat all your fast unit tests at the slow integration level.

    See http://blogs.agilefaqs.com/2011/02/01/inverting-the-testing-pyramid/

    I think every time we find a problem at integration we should create a fast contract / service test that we only run once. This is the same point I made about how we did Oracle testing on E4.

    At my current client we are building example based contracts that are created by the producing service from their internal acceptance tests. We then import them into our client applications tests and verify we produce the correct request (matches the example). (In their tests they verify the response not the request)

  8. Great, great article!

    Some unorganised thoughts:

    * The excellent Graham Brooks paper/presentation you mention was “Team Pace – Keeping Build Times Down” at Agile 2008 http://submissions2008.agilealliance.org/node/2238/ and is listed on IEEE at http://www.computer.org/portal/web/csdl/doi/10.1109/Agile.2008.41

    * I like the idea of Maven as a contract, not an implementation – the Maven project structure will be its legacy. Not building parallelisation into Maven at the outset was an enormous mistake.

    * “Compile and JAR in one step” is a fantastic tip that is easily overlooked

    * Any tool that requires a singleton to avoid expensive startup times stinks of premature optimisation and/or poor design, the Velocity singleton really is the gift that keeps on giving

    * I agree that a cost/benefit analysis of in-memory vs. production database for acceptance tests is important, and tends to be overlooked… the desire for production stack uniformity tends to be strong. Parameterising DAO tests is a good strategy.

    * Running acceptance tests in parallel remains an effective method of keeping build times down, but it needs to be baked in at the start. Retrofitting parallelisation is hard due to shared data fixtures, risk of intermittency, etc.

    * My soapbox in build times is JB Rainsberger’s long standing assertion that Integration Tests Are A Scam (http://www.infoq.com/news/2009/04/jbrains-integration-test-scam) – most integration tests deliver little or no value compared to their cost. For example, why write tests that exercise a real Hibernate instance against a real Oracle instance – what are we trying to test? If we are testing our Hibernate *wrapper*, then mock Hibernate. If we are testing our Hibernate *configuration*, then assert XPaths in the HBM XML or analyse class/field annotations. An integration test that does not directly talk to a database should not use a database.

    Finally, a preface explaining what problem(s) a 10 second build solves would be good. Most teams strive for a 5-10 minute build – once a legacy application build has been sweated down to 5 minutes, what is the motivation to keep going?

  9. Thanks – I’ll check it out.

  10. What a great article.
    The shift in mindset from “this is a cool language/tool for development” to “this language/tool enables rapid build and deployment” is a big challenge for many developers.
    So what if you can write code 2x as quickly? If it takes 10x to build/deploy that code, then it’s out.

    By the way, your experience with Windows disk I/O is likely due to unoptimised code ported from a Linux original. For example, NodeJS actually runs faster on Windows than Linux because it uses native Windows I/O Completion Ports (async I/O):
    http://matthewskelton.wordpress.com/2011/12/12/learning-from-node-js-on-windows-what-can-we-learn/

  11. My QCon talk… | Yesterday I was wrong - pingback on March 9, 2012 at 12:16 am
  12. Andriy Plokhotnyuk

    Two points for speeding up maven builds for multimodule-projects:
    1. Using parallel build feature can speed up greatly (especially when using SSD): https://cwiki.apache.org/MAVEN/parallel-builds-in-maven-3.html
    2. Using -server -XX:+TieredCompilation for Scala compilation or end-to-end testing with lot of code can speed up builds up to 1.5x times

  13. Always Agile · Bodart’s Law - pingback on May 24, 2012 at 10:27 am
  14. Just watched your excellent talk on this now that it’s up on InfoQ:
    http://www.infoq.com/presentations/Crazy-Fast-Build-Times-or-When-10-Seconds-Starts-to-Make-You-Nervous

    A (hopefully) useful datapoint to add: I tried porting my REST stubbing library over to the Java 6 http server after originally reading this, and although it starts up quicker the tests run about 40% slower overall than with embedded Jetty.

  15. Have you tried increasing the number of HTTP worker threads for the built-in server? Jetty defaults to 250 I think, while the Java 6 server uses 8 by default. I have had feedback from a few other people who have done the same swap and seen the same improvements as me, so I am wondering if there might be something else wrong. Do you have a link to the new code?

  16. It’s initialised with a ThreadPoolExecutor with (somewhat arbitrarily) 50 threads.

    The tests don’t run in parallel though, so I wouldn’t have thought the thread pool size would make much of a difference.

    The code is in the httpserver-poc branch under https://github.com/tomakehurst/wiremock

    HttpWireMockServer is where the server gets started.

  17. Once you start doing reactive programming or anything that involves lambda composition (functional programming), a language like Scala is clearly the way to go. When you look at C#’s reactive extensions and compare them to the F# implementation, you see clearly how an implicit, typesafe language with DSLs turns out to be a lot cleaner for reactive and functional query-composition programming. Pattern matching is also a must for good data-structure querying, in my opinion.

  18. I’m not yet a Scala coder, but if compile times suck that much… it’s really sad.

    I guess an ultra-fast SSD + RAM + a new 12-core Xeon would do the trick to bring compile times down… if the compiler uses all the cores in parallel :)

  19. Hi Dan,

    great article suggesting many good ideas.
    I’m just wondering how to compile and jar Java sources in one step. At least with Ant I think you cannot merge the two steps into one. How do you do that?

  20. What file system do you use in Linux?

  21. Watch this space | Dan North & Associates - pingback on January 18, 2014 at 1:24 pm
  22. Always Agile Consulting · Bodart’s Law - pingback on March 21, 2014 at 12:00 am
