Stratus3D

Software Engineering, Web Development and 3D Design

Running Dialyzer in Jenkins Builds

Writing tests is a common and important way of validating the correctness of Erlang and Elixir software. Type checking is another important way of validating correctness, and Dialyzer is a valuable tool that is good at finding type errors in Erlang and Elixir source code. If you aren’t running Dialyzer on your Erlang and Elixir code, your build is likely missing bugs that could easily be found and fixed. Running Dialyzer automatically during Jenkins builds is a good way of ensuring obvious type errors are found and fixed.
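To make that concrete, here is a small, hypothetical example (the module and function names are made up) of the kind of bug Dialyzer will flag even though the code compiles cleanly:

defmodule MyApp.Math do
  @spec add(integer(), integer()) :: integer()
  def add(a, b), do: a + b

  def total do
    # Dialyzer reports this call because "2" is a binary,
    # which can never satisfy the integer() spec for add/2
    add(1, "2")
  end
end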

In this blog post I will explain how I set up Jenkins to run Dialyzer during a Docker build for an Elixir project. I will also show how to cache Dialyzer’s PLT (persistent lookup table) files between builds using Jenkins artifacts and a Docker volume to speed up Dialyzer runs.

The instructions in this article are for Elixir 1.10 projects running in Docker, but the same steps can be applied to Erlang projects with minimal changes. The primary difference between the Elixir and Erlang builds would be the build paths.

Running Dialyzer

Running Dialyzer is pretty easy.

Add the dialyxir package to your project’s mix.exs file. Make sure to specify the environments the command will be needed in. In my case I needed the dialyzer command in dev and test environments:

defp deps do
  [
    ...
    {:dialyxir, "~> 1.0", only: [:dev, :test], runtime: false}
  ]
end

Running Dialyzer in Docker is as simple as adding the Dialyzer command right below the place where you run your Elixir tests:

RUN mix test
RUN MIX_ENV=test mix dialyzer

While this works, running Dialyzer during the Docker build like this isn’t going to allow us to reuse PLT files. There is a better way.

Reusing PLT Files

Building the type information is time consuming so Dialyzer writes the data to PLT files on disk. If you run Dialyzer during the Docker build in Jenkins the PLT files will not be present and Dialyzer will recreate them for every build. Building the PLT files will likely add 10-15 minutes to your build time. To speed up builds it is possible to cache files with Jenkins artifacts and reuse them on subsequent builds. Since we are using Docker the PLT files will be located inside the Docker container, so we will need a way to get the PLT files in and out of the Docker container.
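As an aside, dialyxir also lets you control where PLT files are written through the :dialyzer key in the project configuration, which can make them easier to locate and share. The sketch below is just that, a sketch; the plt_core_path and plt_local_path options are taken from dialyxir’s documentation as I recall it, and the dialyxir/ directory name is an arbitrary example, so verify the option names against the dialyxir version you are using:

# In mix.exs
def project do
  [
    app: :my_app,
    version: "0.1.0",
    deps: deps(),
    # Store the core and project-local PLTs in a predictable directory
    # (paths here are examples, not requirements)
    dialyzer: [
      plt_core_path: "dialyxir",
      plt_local_path: "dialyxir"
    ]
  ]
end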

Sharing PLT Files Between the Container and the Host

Putting files into a Docker container is easy. A simple COPY <source> <destination> command in the Dockerfile is all that’s needed. But we need to access the same PLT files both inside and outside of the container. Inside the container the PLT files will be used by Dialyzer to speed up the type analysis. After Dialyzer has finished inside the container the updated files will be needed outside of the container in the Jenkins build directory so they can be cached as Jenkins artifacts.

The easiest way of sharing files between the container and the host file system is a bind mount, which mounts a host directory into the container. In order to use one, you’ll need to first build the image with the application code, and then run the Dialyzer command in a container with the directory mounted.

Docker Run

When running a container using docker run, you’ll need to use the --mount flag like this:

$ docker run ... --mount 'type=bind,src=/dialyxir,dst=/app/dialyxir'

Docker Compose

My Jenkins build actually ran mix dialyzer in a container that was started with Docker Compose. Creating the same mount is also easy with Docker Compose:

volumes:
  - type: bind
    source: /dialyxir
    target: /app/dialyxir
    read_only: false

Putting PLT Files in the Volume

When Dialyzer runs in an Elixir project, it’s not going to write the PLT files to dialyxir/, it’s going to write them to _build/test/. After Dialyzer runs we will need to copy the PLT files into the mounted directory so they can be accessed on the host file system. We need to add two commands after the mix dialyzer command for this. The easiest way to do all this is to create a Bash script in the container that you can invoke with docker run. The script should look something like this. I named my script run_checks.sh:

run_checks.sh
#!/usr/bin/env bash
set -euo pipefail
IFS=$'\t\n' # Stricter IFS settings

# Run tests as usual if you want
mix test
# Run Dialyzer
MIX_ENV=test mix dialyzer
# Create the dialyxir directory if it doesn't already exist
mkdir -p dialyxir
# Copy the PLT files into the dialyxir directory
cp -f _build/test/dialyxir* dialyxir
Note

In the past I have encountered permissions issues with files shared with the host through a Docker mount. Sometimes the host does not have permission to read the files, preventing Jenkins from uploading the PLT files as artifacts. You can work around this issue by copying the files into a new directory and setting permissions on that directory and its contents to 777:

mkdir -p dialyxir/new
cp -f _build/test/dialyxir* dialyxir/new
chmod -R 777 dialyxir/new

Then, outside the container in the Jenkins build, copy the files from dialyxir/new/ into the parent dialyxir/ directory before saving them as Jenkins artifacts.

Now you can run the script using the docker run command after you have built the docker image:

$ docker run ... --mount 'type=bind,src=/dialyxir,dst=/app/dialyxir' <container>:<tag> ./run_checks.sh

After running this command you should see the PLT files in the dialyxir directory that was mounted into the container. You now have something that you can cache using Jenkins artifacts.

Saving PLT Files as Jenkins Build Artifacts

If you are using Jenkins pipelines, uploading the PLT files as Jenkins artifacts can be done in a stage that runs after the docker run command that generates the PLT files. You only want to save PLT files for builds of the master branch. Feature branches may generate PLT files that contain new dependencies that are not used by the master branch or by other feature branches, resulting in PLT files that aren’t suitable for reuse in later builds.

    stage('Upload Build Artifacts') {
      when {
        branch 'master'
      }
      steps {
        // Upload the PLT files as Jenkins artifacts
        archiveArtifacts artifacts: 'dialyxir/*', onlyIfSuccessful: true
      }
    }

Downloading PLT File Artifacts for Reuse

Now that the build is storing PLT files in Jenkins artifacts, subsequent builds can download and use them instead of rebuilding the PLT files from scratch. This can easily be done with a Jenkins stage that downloads the files before running the run_checks.sh script in Docker:

  stages {
    stage('Download Artifacts') {
      steps {
        // download dialyzer artifacts from jenkins
        copyArtifacts(projectName: '<project name>/master', target: 'dialyxir', optional: true, flatten: true)
      }
    }
    ...

Summary

With these two new stages in place your builds should now be able to upload generated PLT files after running Dialyzer on your project’s master branch and download the cached PLT files on subsequent builds of any branch. PLT files will rarely need to be re-created from scratch and will only need to be updated when dependencies change. With PLT file caching my Elixir build times improved from an average of 15-20 minutes to around 4 minutes.

Example Files

Below is a Dockerfile and a Jenkinsfile so you can see what the final files might look like after all these changes.

Example Dockerfile:

FROM elixir:1.10.2-alpine AS base

LABEL project="<project name>"
ENV APP_ROOT=/app

WORKDIR $APP_ROOT

# Add dependencies
FROM base AS deps

RUN apk add coreutils

# Copy in application code
FROM deps AS application

COPY . $APP_ROOT

# Build the release
FROM application AS build

# Fetch the Elixir dependencies and compile and release the application
RUN MIX_ENV=prod mix do deps.get, compile, release

# Generate final image
FROM base AS release

# Copy release into the rel directory for convenience
COPY --from=build $APP_ROOT/_build/prod/rel rel

ENTRYPOINT ["/app/rel/<app name>/bin/<app name>"]
CMD ["start"]

Example Jenkinsfile:

pipeline {
  agent any

  stages {
    stage('Download Artifacts') {
      steps {
        // Download PLT files from Jenkins
        copyArtifacts(projectName: '<project name>/master', target: 'dialyxir', optional: true, flatten: true)
      }
    }

    stage('Build Docker Image') {
      steps {
        sh 'docker build -t <project name> .'
      }
    }

    stage('Run Tests and Dialyzer') {
      steps {
        // Run Dialyzer with a shared mount so we can share PLT files between Jenkins and the container
        sh 'docker run --mount "type=bind,src=/dialyxir,dst=/app/dialyxir" <project name> ./run_checks.sh'
      }
    }

    stage('Upload Build Artifacts') {
      when {
        branch 'master'
      }

      steps {
        // Upload the PLT files as Jenkins artifacts
        archiveArtifacts artifacts: 'dialyxir/*', onlyIfSuccessful: true
      }
    }
  }
}

The Let It Crash Philosophy Outside Erlang

Abstract

Erlang is known for its use in systems that are fault tolerant and reliable. One of the ideas at the core of the Erlang runtime system’s design is the Let It Crash error handling philosophy. The Let It Crash philosophy is an approach to error handling that seeks to preserve the integrity and reliability of a system by intentionally allowing certain faults to go unhandled. In this talk I will cover the principles of Let It Crash and then talk about how I think the principles can be applied to other software systems. I will conclude the talk by presenting a couple of real-world scripts written in various languages and then showing how they can be improved using the Let It Crash principles.

What is Let It Crash?

Let It Crash is a fault-tolerance design pattern. Like every design pattern, there are some variations on what it actually means. I looked around online and didn’t find a simple, authoritative definition of the term. The closest thing I found was a quote from Joe Armstrong.

only program the happy case, what the specification says the task is supposed to do

That’s a good, terse description of what Let It Crash is. It’s about coding for the happy path and acknowledging that faults will occur that we aren’t going to be able to handle properly, because we don’t know how the program ought to handle them.

So how does this approach improve our programs? And how do we apply it in practice?

Let It Crash principles

While I didn’t find an authoritative definition of the Let It Crash philosophy, I have several years of experience applying it while developing software in Erlang. The core tenet of Let It Crash is to code for the happy path. There are two other guiding principles that I follow as well, so here are my three principles of Let It Crash.

  • Code for the happy path

    Focus on what your code should do in the successful case. Code that first, and deal with the unhappy paths only when necessary. You write software to DO something. Most of the code you write should be for doing that thing, not for handling faults when they occur.

  • Don’t catch exceptions you can’t handle properly

    Never catch exceptions you don’t know how to remedy in the current context. Trying to work around something you can’t remedy often leads to other problems and makes things worse at inopportune times. You can always catch an exception at a higher level later; re-throwing one without losing information is much harder. Catch-all expressions that trap exceptions you didn’t anticipate are usually a bad idea (see the sketch after this list).

  • Software should fail noisily

    The worst thing software can do when encountering a fault is to continue silently. Unexpected faults should result in verbose exceptions that get logged and reported to an error tracking system.
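To make the second principle concrete, here is a small, hypothetical Elixir sketch (the module and function names are made up) contrasting a catch-all rescue with simply letting the fault propagate:

defmodule Report do
  # Defensive version: the catch-all rescue hides the real fault.
  # If the file is missing the caller silently gets an empty report.
  def generate_defensively(path) do
    File.read!(path)
  rescue
    _error -> ""
  end

  # Let It Crash version: only the happy path is coded. A missing file
  # raises File.Error, the fault is visible, and whatever supervises or
  # calls this code decides how to recover.
  def generate(path) do
    File.read!(path)
  end
end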

Benefits

  • Less code

    Less code means less work and fewer places for bugs to hide. Error handling code is seldom used and often contains bugs of its own due to being poorly tested.

  • Less work

    Faster development in the short term and more predictable program behavior in the long term. Programs that simply crash when a fault is encountered are predictable, and simple restart mechanisms are easier to understand than specialized exception handling logic (see the supervisor sketch after this list).

  • Reduced coupling

    Catching a specific exception is connascence of name. When you catch a lot of exceptions by name, your exception handling code is coupled to the code that raises those exceptions. If the code raising the exceptions changes, you may have to change your exception handling code as well. Being judicious about handling exceptions reduces coupling.

  • Clarity

    Smaller programs are easier to understand, and the code more clearly indicates the programmer’s intent: unexpected faults are not handled by this code. Catch-all expressions make the developer’s intent unclear to anyone reading the code by obscuring their assumptions about what could happen.

  • Better performance

    An uncaught exception is going to terminate the program or process almost immediately. Custom exception handling logic will use additional CPU cycles and may not be able to recover from the fault it encountered. Retry logic can also use a lot of resources and doesn’t always succeed.
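To sketch the restart mechanisms mentioned above, here is a minimal Elixir supervisor (the MyApp.Worker module is hypothetical) that restarts a crashed process in a known good state instead of the process trying to handle every fault itself:

defmodule MyApp.Supervisor do
  use Supervisor

  def start_link(opts) do
    Supervisor.start_link(__MODULE__, :ok, opts)
  end

  @impl true
  def init(:ok) do
    children = [
      # If MyApp.Worker crashes on an unexpected fault it is simply
      # restarted, rather than limping along in an unknown state.
      MyApp.Worker
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end
end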

Applying Let It Crash

Rather than try to explain more about the Let It Crash principles, I will show how they can be applied in other programming languages. While the application of these principles may vary from language to language, Let It Crash is applicable everywhere.

Let It Crash in Bash

I’m going to use Bash scripting as an example, because it’s something that many developers are vaguely familiar with, and it’s radically different from Erlang as far as language semantics go.

Original Code

Here is a simple Bash script that reads from STDIN on the command line and uploads the input to Hastebin. I took this script from my own dotfiles.

#!/usr/bin/env bash

input_string=$(cat);
curl -X POST -s -d "$input_string" $HASTEBIN_URL/documents \
       | awk -v HASTEBIN_URL=$HASTEBIN_URL -F '"' '{print HASTEBIN_URL"/"$4}';

While this may seem like simple code, there are at least three possible faults in this script that would result in unexpected behavior:

  1. If the cat command that reads data into the input_string variable fails the script will continue and will upload an empty snippet to Hastebin.

  2. If the curl command fails the awk command to parse out the Hastebin URL will still be executed. The awk command will not print out the snippet URL since there is no response body, and it might print out some other URL instead.

  3. If the HASTEBIN_URL environment variable is not set all of these commands will still be executed without it and will fail due to the empty variable.

Improved Code

This simple script can be greatly improved by setting some Bash flags. The improved code is shown below.

#!/usr/bin/env bash

# Unofficial Bash "strict mode"
# http://redsymbol.net/articles/unofficial-bash-strict-mode/
set -euo pipefail
#ORIGINAL_IFS=$IFS
IFS=$'\t\n' # Stricter IFS settings

input_string=$(cat);
curl -X POST -s -d "$input_string" $HASTEBIN_URL/documents -v \
  | awk -v HASTEBIN_URL=$HASTEBIN_URL -F '"' '{print HASTEBIN_URL"/"$4}';

Before executing the commands we set -e (errexit) to tell the script to exit when a non-zero (unsuccessful) exit code is returned by any command. This fixes the first and second issues pointed out above. We also set the -u flag so any time an expression tries to use an unset variable the script exits with an error. The third flag we use is -o pipefail, which tells Bash to use the exit code of the first unsuccessful command as the exit code for the whole pipeline, instead of using the exit code of the last command. This prevents a pipeline from returning a successful status code when a command in the pipeline fails. The script now exits with an error message if HASTEBIN_URL is not set, so the user knows what the script expects to be set in the environment.

Note About the Name

I don’t think Let It Crash is a good name for the philosophy Erlang has about handling faults. Yes, letting things crash is often the right thing to do, but it is a form of defensive programming (offensive programming actually) that allows software to recover to a known good state when faults are encountered. I also think defensive programming is a poor name for a design philosophy intended to improve software reliability as it seems to indicate the programmer should be defensive and proactively guard against unexpected faults. This is a topic for another blog post.

Elixir 1.9 Configuration Merging

Elixir configuration can be placed in several different files. Config is often stored in config/config.exs, and config.exs is often set up to load environment-specific configuration files like config/test.exs, config/prod.exs, and so on. As of Elixir 1.9, runtime configuration can also be placed in config/releases.exs.

Configuration Evaluation

These various configuration files are evaluated at different times. config/config.exs is evaluated at compile time. Runtime configuration in config/releases.exs is evaluated inside the release at runtime.
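For example, a minimal config/releases.exs might read environment variables when the release boots. The application name, repo module, and environment variable below are placeholders:

# config/releases.exs - evaluated when the release starts, not at compile time
import Config

config :my_app, MyApp.Repo,
  url: System.fetch_env!("DATABASE_URL")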

Configuration Merging

If we use multiple configuration files, the configuration data is merged recursively. We can observe how this merging works by making multiple config calls with the same root key in a single configuration file and then loading the configuration values from it.

# Write this to a file named config_test.exs
import Config

config :config_test,
  param1: :some_value,
  param2: :another_value,
  nested_param: [keyword: :list]

config :config_test, :nested_param, [another: :keyword]

config :config_test,
  param1: :new_value

This file makes three config calls to set the configuration for the :config_test application. These config calls will be evaluated in the order they are defined in the file, but what will the final configuration be?

Loading the configuration using Config.Reader in the iex shell shows us the final values:

iex> Config.Reader.read!("config_test.exs")

[
  config_test: [
    param1: :new_value,
    param2: :another_value,
    nested_param: [keyword: :list, another: :keyword]
  ]
]

Config.Reader recursively merges the configurations in order until they are all combined.

The first config call defines the base configuration. The second merges additional keys into the :nested_param keyword list rather than replacing it. The third changes the value of the :param1 key to :new_value. We can also see that the whole configuration is just a set of nested keyword lists.
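The same deep merge can also be performed manually with Config.Reader.merge/2, which makes the behavior easy to explore in iex:

iex> Config.Reader.merge(
...>   [config_test: [param1: :some_value, nested_param: [keyword: :list]]],
...>   [config_test: [param1: :new_value, nested_param: [another: :keyword]]]
...> )
[config_test: [param1: :new_value, nested_param: [keyword: :list, another: :keyword]]]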

Conclusion

Elixir merges configuration spread across multiple files in a reasonable way, and makes it easy to combine configuration from config/config.exs, config/releases.exs, and any other file containing configuration data without having to duplicate any of the keys and values.

But remember, configuration files aren’t always best. Application configuration is global and therefore prevents libraries and applications from being used with different configurations in the same release. Avoiding configuration files entirely is usually the best approach for library applications.
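As a hypothetical sketch of that alternative, a library can accept its configuration as arguments instead of reading the global application environment, so each caller can configure it independently:

# The library reads nothing from the application environment...
defmodule MyLib.Client do
  def new(opts) do
    %{
      url: Keyword.fetch!(opts, :url),
      timeout: Keyword.get(opts, :timeout, 5_000)
    }
  end
end

# ...so two callers in the same release can use different settings.
client_a = MyLib.Client.new(url: "https://a.example.com")
client_b = MyLib.Client.new(url: "https://b.example.com", timeout: 30_000)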

Another important change related to configuration is that mix new no longer generates a config/config.exs file. Relying on application configuration is undesirable for most libraries, and the generated config file pushed library authors in the wrong direction.