Setting up distributed compilations with sccache

I hate waiting for compilations. Day-to-day it may only be a minute or two, but once you start another task, the context switch distracts from what you were doing before and breaks everything up. Life is too short to be waiting for computers.

Obviously the first answer is to have a faster machine, but having a super fast laptop and a super fast PC all the time contributes to e-waste, which I also hate. Some of my test devices for touch and tablet work are 5-year-old Intel Atom devices that I still sometimes need to compile on to fix things.

The solution is distributed compiling, using multiple computers to share the work.

Icecream or distcc used to be the tools back in the day, but they're both quite dated and have other issues.

There's a relatively new kid on the block, sccache. sccache primarily serves as a way of keeping your cached compiled assets around (think ccache), but also of sharing them across users. Sharing cached assets requires exactly matching paths, dependencies and compilers, so it's not that great for my needs, but it seems it would be perfect for Flatpak and immutable cases.

But sccache also has another trick up its sleeve: distributed compilation.

The documentation for sccache is a bit overwhelming, packed with enterprise-level features: https://github.com/mozilla/sccache/blob/main/docs/Distributed.md
It wasn't that clear how to do something simple, so I thought it might be useful to share how I got things working nicely for me.

Installation

sccache is probably available in your distribution, but note that not all distros include the distributed compilation part.

If not, you can download a version from https://github.com/mozilla/sccache/releases/download

The nice part is that it's statically linked with no external dependencies, so you can throw it on anything, even if it's immutable like KDE Linux or even a Steam Deck or two.

The parts

Scheduler

The scheduler is the key part of the operation: the client sends requests to the scheduler, which replies with the build servers that can receive payloads, distributing jobs among them accordingly.

Simple

Create a file as follows:

scheduler.conf

public_addr = "0.0.0.0:10600"
client_auth = { type = "token", token = "dave_is_great" }
server_auth = { type = "token", token = "dave_is_great" }

sccache-dist scheduler --config scheduler.conf
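
A fixed token like the one above is fine on a trusted home network, but a random one costs nothing. This is only a sketch, assuming a POSIX-ish shell with /dev/urandom available; the helper names gen_token and write_scheduler_conf are mine and not part of sccache.

```shell
# Hypothetical helpers (not part of sccache).

# Print a random 64-character hex token read from /dev/urandom.
gen_token() {
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
}

# Write a scheduler.conf like the one above.
# $1 = output path, $2 = token shared by clients and build servers.
write_scheduler_conf() {
    cat > "$1" <<EOF
public_addr = "0.0.0.0:10600"
client_auth = { type = "token", token = "$2" }
server_auth = { type = "token", token = "$2" }
EOF
}

# Usage:
#   token=$(gen_token)
#   write_scheduler_conf scheduler.conf "$token"
#   sccache-dist scheduler --config scheduler.conf
```

Whatever token ends up in scheduler.conf is the one the clients and build servers need in their own configs.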

Docker

As I want the scheduler always on, I run it on a small home-server, where I prefer to docker-ise everything.

docker-compose:

version: '3.8'

services:
  sccache-scheduler:
    build:
      context: .
      dockerfile: Dockerfile
      args:
        VERSION: "v0.10.0"
        SHASUM: bbf2e67d5e030967f31283236ea57f68892f0c7b56681ae7bfe80cd7f47e1acc
    image: sccache:latest
    container_name: sccache-scheduler
    ports:
      - "10600:10600"
    volumes:
      - ./scheduler.conf:/scheduler.conf:ro
    entrypoint: ["/usr/local/bin/sccache-dist", "scheduler", "--config", "/scheduler.conf"]
    restart: unless-stopped
    environment:
      - SCCACHE_NO_DAEMON=1

Dockerfile

FROM alpine:3.9.2

ARG VERSION
ARG SHASUM

RUN apk add clang
RUN apk add curl
RUN apk add --no-cache bubblewrap

RUN curl -L https://github.com/mozilla/sccache/releases/download/$VERSION/sccache-dist-$VERSION-x86_64-unknown-linux-musl.tar.gz > sccache-dist.tar.gz \
    && tar xf sccache-dist.tar.gz \
    && mv sccache-dist-$VERSION-x86_64-unknown-linux-musl/sccache-dist /usr/local/bin/sccache-dist \
    && rm -r sccache-dist.tar.gz sccache-dist-$VERSION-x86_64-unknown-linux-musl

RUN apk del curl

ENTRYPOINT ["/usr/local/bin/sccache-dist"]

and the scheduler.conf as above.

Servers (build machines)

This is the part that does the building.
The config takes the address of the scheduler, but also the server's own IP address as a sort of "callback" address.

It needs to run as root in order to have the capabilities to set up sandboxing and restrict it back down to something lower than where we started. The situation is a bit silly, but it is what it is.

server.conf

public_addr = "192.168.1.YOURIPADDRESS:10501"
scheduler_url = "http://192.168.1.SCHEDULERIPADDRESS:10600"
cache_dir = "/tmp/toolchains"
# Must match server_auth in scheduler.conf.
scheduler_auth = { type = "token", token = "dave_is_great" }

[builder]
type = "overlay"
# The directory under which a sandboxed filesystem will be created for builds.
build_dir = "/tmp/build"
# The path to the bubblewrap version 0.3.0+ `bwrap` binary.
bwrap_path = "/usr/bin/bwrap"

Then you can run sudo sccache-dist server --config server.conf
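
Since the server needs both root and a working bwrap, a quick check before launching can save some head-scratching. A minimal sketch; server_preflight is a hypothetical helper of mine, and it only checks the two requirements mentioned above.

```shell
# Hypothetical pre-flight check (not part of sccache).
# $1 = path to bwrap, as set in bwrap_path
# $2 = user id to check (defaults to the current user)
server_preflight() {
    bwrap_path=$1
    uid=${2:-$(id -u)}

    if [ "$uid" -ne 0 ]; then
        echo "sccache-dist server needs to run as root" >&2
        return 1
    fi
    if [ ! -x "$bwrap_path" ]; then
        echo "bwrap not found or not executable at $bwrap_path" >&2
        return 1
    fi
    return 0
}

# Usage, from a script that is itself started with sudo:
#   server_preflight /usr/bin/bwrap && sccache-dist server --config server.conf
```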

Systemd

As I want this running constantly on my desktop and laptop I use a systemd service.

/etc/systemd/system/sccache.service

[Unit]
Description=sccache-dist build server
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
ExecStart=/usr/bin/sccache-dist server --config /etc/sccache/server.conf
Restart=on-failure
RestartSec=5s

[Install]
WantedBy=multi-user.target
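
To pick up the new unit and have it start at boot, the usual systemctl steps apply. A sketch wrapped in a function (the name enable_sccache_server is mine), assuming the unit file name used above:

```shell
# Hypothetical wrapper around standard systemctl commands.
# Re-reads unit files, enables and starts the service, then shows its status.
enable_sccache_server() {
    sudo systemctl daemon-reload &&
        sudo systemctl enable --now sccache.service &&
        sudo systemctl --no-pager status sccache.service
}

# Usage:
#   enable_sccache_server
```

If it doesn't come up, journalctl -u sccache.service should show what the server complained about.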

Clients

The final part is relatively simple, making local builds use the other build servers.

First we need to set up a config as follows:

.config/sccache/config

[dist]
scheduler_url = "http://192.168.1.SCHEDULERIPADDRESS:10600"
toolchains = []
toolchain_cache_size = 5368709120
auth = { type = "token", token = "dave_is_great" }

CMake

Enabling it just requires telling CMake to use the relevant wrapper with -DCMAKE_C_COMPILER_LAUNCHER=sccache -DCMAKE_CXX_COMPILER_LAUNCHER=sccache
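
If the same configure line has to work on machines with and without sccache, the flags can be made conditional. A small sketch; sccache_cmake_flags is a hypothetical helper, and the build directory layout in the usage line is only an assumption.

```shell
# Hypothetical helper: print the sccache launcher flags only when sccache
# is on PATH, so the same configure line works on machines without it.
sccache_cmake_flags() {
    if command -v sccache >/dev/null 2>&1; then
        printf '%s\n' "-DCMAKE_C_COMPILER_LAUNCHER=sccache -DCMAKE_CXX_COMPILER_LAUNCHER=sccache"
    fi
}

# Usage (left unquoted on purpose so the output splits into two arguments):
#   cmake -S . -B build -GNinja $(sccache_cmake_flags)
```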

kde-builder

And/or add it to your kde-builder config as follows:

.config/kde-builder.yaml

global:
  cmake-options: >-
    -GNinja -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DCMAKE_CXX_COMPILER=clang++
    -DCMAKE_C_COMPILER_LAUNCHER=sccache -DCMAKE_CXX_COMPILER_LAUNCHER=sccache

and reconfigure the project(s)

Quirks and workarounds

The most annoying quirk is that servers need a consistent IP address within your network. Servers register with the scheduler using a fixed IP address. When clients queue jobs, they are given those IP addresses back by the scheduler and are then expected to talk to the build server(s) directly. Using hostnames doesn't work.
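
Because a hostname in public_addr is an easy mistake to make, a quick sanity check on server.conf can catch it before the server starts. This is a loose sketch of my own (public_addr_is_ip is not part of sccache): it only checks that the value looks like IPv4:port and does not validate octet ranges.

```shell
# Hypothetical sanity check (not part of sccache): succeed only if
# public_addr in the given config file is written as IPv4:port.
public_addr_is_ip() {
    grep -Eq '^public_addr[[:space:]]*=[[:space:]]*"([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]+"' "$1"
}

# Usage:
#   public_addr_is_ip /etc/sccache/server.conf || echo "public_addr must be an IP address" >&2
```

As a side effect it also flags a config where the YOURIPADDRESS placeholder was never filled in.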

Note also that if you change your local ~/.config/sccache/config, you may need to run sccache --stop-server on the client so that it relaunches with the new settings. Confusingly, in this case 'server' refers to a process on the client that compile jobs are thrown at.
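
The restart dance can be wrapped up in a small function. A sketch; sccache_reload is a name I made up, while the three sccache flags it calls are the regular ones.

```shell
# Hypothetical helper: restart the local sccache daemon so it re-reads
# its config, then print what the scheduler reports.
sccache_reload() {
    # --stop-server may fail if no daemon is running; that is fine here.
    sccache --stop-server >/dev/null 2>&1 || true
    sccache --start-server && sccache --dist-status
}

# Usage, after editing the client config:
#   sccache_reload
```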

Debugging

sccache --dist-status will show the scheduler's status: how many build servers and CPUs are connected, and how many jobs are currently in progress.

Managing job count

By default ninja schedules as many jobs as you have local cores. It's unaware of the many other cores you have available remotely.
I have this in my zshrc to set the number of jobs to the total number of CPUs the scheduler knows about at that time.

function getSccacheCPUs() {
        sccache --dist-status | jq '."SchedulerStatus"[1].num_cpus'
}
# $(...) runs the function each time the alias is used; the single quotes delay it until then.
alias ks='MAKEFLAGS=-j$(getSccacheCPUs) kde-builder'
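
When the scheduler is unreachable, the function above can print nothing or null, which leaves a broken -j flag. A slightly more defensive sketch, assuming jq is installed; sccache_cpus is a hypothetical name, and it falls back to the local CPU count.

```shell
# Hypothetical, more defensive variant of the function above.
# Prints the scheduler's total CPU count, or the local CPU count if the
# scheduler cannot be reached or the output cannot be parsed.
sccache_cpus() {
    n=$(sccache --dist-status 2>/dev/null | jq -r '.SchedulerStatus[1].num_cpus' 2>/dev/null)
    case "$n" in
        # Empty, non-numeric (e.g. "null") or zero: use the local count.
        ''|*[!0-9]*|0) getconf _NPROCESSORS_ONLN ;;
        *) echo "$n" ;;
    esac
}

# Usage:
#   alias ks='MAKEFLAGS=-j$(sccache_cpus) kde-builder'
```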

Comparison to icecream

Cons:

  • The setup process is a lot more laborious than icecream's magic turn-up-and-compile approach.
  • There's no cool UI to see how many tasks are being compiled.

Pros:

  • It's very robust to network issues. If the scheduler is down or no servers are available, things build locally, extremely transparently.
  • It also has its own equivalent of a local 'ccache', which means you don't need to worry about daisy-chaining compiler wrappers to still have cached output.
  • It's actively maintained; the last meaningful commit in icecream was years ago.