Introduction

Attic is a self-hostable Nix Binary Cache server backed by an S3-compatible storage provider. It has support for global deduplication and garbage collection.

Attic is still an early prototype and is looking for more testers. Want to jump in? Start your own Attic server in 15 minutes.

⚙️ Pushing 5 paths to "demo" on "local" (566 already cached, 2001 in upstream)...
✅ gnvi1x7r8kl3clzx0d266wi82fgyzidv-steam-run-fhs (29.69 MiB/s)
✅ rw7bx7ak2p02ljm3z4hhpkjlr8rzg6xz-steam-fhs (30.56 MiB/s)
✅ y92f9y7qhkpcvrqhzvf6k40j6iaxddq8-0p36ammvgyr55q9w75845kw4fw1c65ln-source (19.96 MiB/s)
🕒 vscode-1.74.2        ███████████████████████████████████████  345.66 MiB (41.32 MiB/s)
🕓 zoom-5.12.9.367      ███████████████████████████              329.36 MiB (39.47 MiB/s)

Goals

  • Multi-Tenancy: Create a private cache for yourself, and one for friends and co-workers. Tenants are mutually untrusting and cannot pollute the views of other caches.
  • Global Deduplication: Individual caches (tenants) are simply restricted views of the content-addressed NAR Store and Chunk Store. When paths are uploaded, a mapping is created to grant the local cache access to the global NAR.
  • Managed Signing: Signing is done on-the-fly by the server when store paths are fetched. The user pushing store paths does not have access to the signing key.
  • Scalability: Attic can be easily replicated. It's designed to be deployed to serverless platforms like fly.io but also works nicely in a single-machine setup.
  • Garbage Collection: Unused store paths can be garbage-collected in an LRU manner.
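
To make the "restricted view" idea concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions, not Attic's implementation: the class name, the use of SHA-256 and the store paths are all made up for illustration.

```python
import hashlib


class GlobalStore:
    """Toy model: one content-addressed NAR store shared by every cache."""

    def __init__(self):
        self.nars = {}    # NAR hash -> NAR bytes (stored once, globally)
        self.caches = {}  # cache name -> {store path -> NAR hash}

    def push(self, cache, store_path, nar):
        nar_hash = hashlib.sha256(nar).hexdigest()
        # Store the NAR only if nobody has uploaded identical content yet.
        self.nars.setdefault(nar_hash, nar)
        # The per-cache mapping is what grants this cache access to the NAR.
        self.caches.setdefault(cache, {})[store_path] = nar_hash

    def pull(self, cache, store_path):
        # A cache can only see NARs that it holds a mapping for.
        nar_hash = self.caches.get(cache, {}).get(store_path)
        return None if nar_hash is None else self.nars[nar_hash]


store = GlobalStore()
store.push("alice", "/nix/store/aaa-hello", b"nar contents")  # hypothetical path
store.push("bob", "/nix/store/aaa-hello", b"nar contents")
print(len(store.nars))                              # NARs physically stored
print(store.pull("carol", "/nix/store/aaa-hello"))  # carol has no mapping
```

Both tenants end up with their own mapping to a single stored NAR, while a third tenant without a mapping sees nothing.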

Tutorial

Let's spin up Attic in just 15 minutes (yes, it works on macOS too!):

nix-shell https://github.com/zhaofengli/attic/tarball/main -A demo

Simply run atticd to start the server in monolithic mode with a SQLite database and local storage:

$ atticd
Attic Server 0.1.0 (release)

-----------------
Welcome to Attic!

A simple setup using SQLite and local storage has been configured for you in:

    /home/zhaofeng/.config/attic/server.toml

Run the following command to log into this server:

    attic login local http://localhost:8080 eyJ...

Documentations and guides:

    https://docs.attic.rs

Enjoy!
-----------------

Running migrations...
Starting API server...
Listening on [::]:8080...

Cache Creation

atticd is the server, and attic is the client. We can now log in and create a cache:

# Copy and paste from the atticd output
$ attic login local http://localhost:8080 eyJ...
✍️ Configuring server "local"

$ attic cache create hello
✨ Created cache "hello" on "local"

Pushing

Let's push attic itself to the cache:

$ attic push hello $(which attic)
⚙️ Pushing 1 paths to "hello" on "local" (0 already cached, 45 in upstream)...
✅ r5d7217c0rjd5iiz1g2nhvd15frck9x2-attic-0.1.0 (52.89 MiB/s)

The interesting thing is that attic automatically skipped over store paths cached by cache.nixos.org! This behavior can be configured on a per-cache basis.

Note that Attic performs content-addressed global deduplication, so when you upload the same store path to another cache, the underlying NAR is only stored once. Each cache is essentially a restricted view of the global cache.

Pulling

Now, let's pull it back from the cache. For demonstration purposes, let's use --store to make Nix download to another directory because Attic already exists in /nix/store:

# Automatically configures ~/.config/nix/nix.conf for you
$ attic use hello
Configuring Nix to use "hello" on "local":
+ Substituter: http://localhost:8080/hello
+ Trusted Public Key: hello:vlsd7ZHIXNnKXEQShVnd7erE8zcuSKrBWRpV6zTibnA=
+ Access Token

$ nix-store --store $PWD/nix-demo -r $(which attic)
[snip]
copying path '/nix/store/r5d7217c0rjd5iiz1g2nhvd15frck9x2-attic-0.1.0' from 'http://localhost:8080/hello'...
warning: you did not specify '--add-root'; the result might be removed by the garbage collector
/nix/store/r5d7217c0rjd5iiz1g2nhvd15frck9x2-attic-0.1.0

$ ls nix-demo/nix/store/r5d7217c0rjd5iiz1g2nhvd15frck9x2-attic-0.1.0/bin/attic
nix-demo/nix/store/r5d7217c0rjd5iiz1g2nhvd15frck9x2-attic-0.1.0/bin/attic

Note that to pull into the actual Nix Store, your user must be considered trusted by the nix-daemon.

Access Control

Attic performs stateless authentication using signed JWT tokens which contain permissions. The root token printed out by atticd is all-powerful and should not be shared.

Let's create another token that can only access the hello cache:

$ atticadm make-token --sub alice --validity '3 months' --pull hello --push hello
eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJhbGljZSIsImV4cCI6MTY4MDI5MzMzOSwiaHR0cHM6Ly9qd3QuYXR0aWMucnMvdjEiOnsiY2FjaGVzIjp7ImhlbGxvIjp7InIiOjEsInciOjF9fX19.XJsaVfjrX5l7p9z76836KXP6Vixn41QJUfxjiK7D-LM

Let's say Alice wants to have her own caches. Instead of creating caches for her, we can let her do it herself:

$ atticadm make-token --sub alice --validity '3 months' --pull 'alice-*' --push 'alice-*' --create-cache 'alice-*'
eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJhbGljZSIsImV4cCI6MTY4MDI5MzQyNSwiaHR0cHM6Ly9qd3QuYXR0aWMucnMvdjEiOnsiY2FjaGVzIjp7ImFsaWNlLSoiOnsiciI6MSwidyI6MSwiY2MiOjF9fX19.MkSnK6yGDWYUVnYiJF3tQgdTlqstfWlbziFWUr-lKUk

Now Alice can use this token to create any cache beginning with alice- and push to them. Try passing --dump-claims to show the JWT claims without encoding the token to see what's going on.
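
If you are curious what is inside such a token, the payload can also be decoded by hand. The Python sketch below decodes the second token above without verifying its signature, so it is for inspection only. The meaning of the short keys (r, w, cc) is inferred from the flags passed to make-token, and the `allowed` helper is hypothetical: `fnmatchcase` merely approximates whatever wildcard matching the server performs.

```python
import base64
import json
from fnmatch import fnmatchcase

# Token copied from the second make-token example above.
TOKEN = (
    "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9."
    "eyJzdWIiOiJhbGljZSIsImV4cCI6MTY4MDI5MzQyNSwiaHR0cHM6Ly9qd3QuYXR0aWMucnMvdjEi"
    "OnsiY2FjaGVzIjp7ImFsaWNlLSoiOnsiciI6MSwidyI6MSwiY2MiOjF9fX19."
    "MkSnK6yGDWYUVnYiJF3tQgdTlqstfWlbziFWUr-lKUk"
)


def decode_claims(token):
    """Decode the JWT payload WITHOUT verifying the signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def allowed(claims, cache, permission):
    """Hypothetical check; fnmatchcase only approximates the server's wildcards."""
    caches = claims["https://jwt.attic.rs/v1"]["caches"]
    return any(
        fnmatchcase(cache, pattern) and perms.get(permission) == 1
        for pattern, perms in caches.items()
    )


claims = decode_claims(TOKEN)
print(json.dumps(claims, indent=2))
print(allowed(claims, "alice-dev", "w"))  # matches the alice-* pattern
print(allowed(claims, "hello", "r"))      # not covered by this token
```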

Going Public

Let's make the cache public. Making it public gives unauthenticated users pull access:

$ attic cache configure hello --public
✅ Configured "hello" on "local"

# Now we can query the cache without being authenticated
$ curl http://localhost:8080/hello/nix-cache-info
WantMassQuery: 1
StoreDir: /nix/store
Priority: 41
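
The response is a plain list of "Key: value" lines. As a small illustration, here is a Python sketch that parses it; the function name is made up, and the body is simply the output shown above.

```python
def parse_cache_info(text):
    """Parse the 'Key: value' lines of a nix-cache-info response."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            info[key.strip()] = value.strip()
    return info


body = "WantMassQuery: 1\nStoreDir: /nix/store\nPriority: 41\n"
info = parse_cache_info(body)
print(info["StoreDir"], int(info["Priority"]))
```

A lower priority number denotes a higher priority. cache.nixos.org has a priority of 40, so a cache with the default of 41 ranks just below it.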

Garbage Collection

It's a bad idea to let binary caches grow unbounded. Let's configure garbage collection on the cache to automatically delete objects that haven't been accessed in a while:

$ attic cache configure hello --retention-period '1s'
✅ Configured "hello" on "local"

Now the retention period is only one second. Instead of waiting for the periodic garbage collection to occur (see server.toml), let's trigger it manually:

atticd --mode garbage-collector-once

Now the store path doesn't exist on the cache anymore!

$ nix-store --store $PWD/nix-demo-2 -r $(which attic)
don't know how to build these paths:
  /nix/store/v660wl07i1lcrrgpr1yspn2va5d1xgjr-attic-0.1.0
error: build of '/nix/store/v660wl07i1lcrrgpr1yspn2va5d1xgjr-attic-0.1.0' failed

$ curl http://localhost:8080/hello/v660wl07i1lcrrgpr1yspn2va5d1xgjr.narinfo
{"code":404,"error":"NoSuchObject","message":"The requested object does not exist."}

Let's reset it back to the default, which is to not garbage collect (configure it in server.toml):

$ attic cache configure hello --reset-retention-period
✅ Configured "hello" on "local"

$ attic cache info hello
               Public: true
           Public Key: hello:vlsd7ZHIXNnKXEQShVnd7erE8zcuSKrBWRpV6zTibnA=
Binary Cache Endpoint: http://localhost:8080/hello
         API Endpoint: http://localhost:8080/
      Store Directory: /nix/store
             Priority: 41
  Upstream Cache Keys: ["cache.nixos.org-1"]
     Retention Period: Global Default

Because of Attic's global deduplication, garbage collection actually happens on three levels:

  1. Local Cache: When an object is garbage collected, only the mapping between the metadata in the local cache and the NAR in the global cache gets deleted. The local cache loses access to the NAR, but the storage isn't freed.
  2. Global NAR Store: Orphan NARs not referenced by any local cache then become eligible for deletion.
  3. Global Chunk Store: Finally, orphan chunks not referenced by any NAR become eligible for deletion. Only at this point is the storage space freed, and subsequent uploads of the same chunk will trigger a real upload to the storage backend.
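
The three levels can be sketched as a toy model in Python. This is an illustration under simplifying assumptions, not Attic's implementation: everything lives in plain dictionaries, timestamps are bare integers, and all three levels run in one pass, whereas a real server may run them at different times.

```python
def collect_garbage(caches, nars, chunks, now, retention):
    """Toy three-level GC.

    caches: {cache: {store_path: (nar_hash, last_accessed)}}
    nars:   {nar_hash: [chunk_hash, ...]}
    chunks: {chunk_hash: bytes}
    """
    # Level 1: drop per-cache mappings not accessed within the retention period.
    for mappings in caches.values():
        expired = [p for p, (_, seen) in mappings.items() if now - seen > retention]
        for path in expired:
            del mappings[path]

    # Level 2: drop NARs that no cache references any more.
    live_nars = {h for mappings in caches.values() for h, _ in mappings.values()}
    for nar_hash in [h for h in nars if h not in live_nars]:
        del nars[nar_hash]

    # Level 3: drop chunks that no remaining NAR references.
    # Only here is storage space actually freed.
    live_chunks = {c for chunk_list in nars.values() for c in chunk_list}
    for chunk_hash in [c for c in chunks if c not in live_chunks]:
        del chunks[chunk_hash]


# Hypothetical data: two caches, two NARs that share chunk c2.
caches = {
    "hello": {"/nix/store/aaa-old": ("nar1", 0), "/nix/store/bbb-new": ("nar2", 95)},
    "other": {"/nix/store/bbb-new": ("nar2", 10)},
}
nars = {"nar1": ["c1", "c2"], "nar2": ["c2", "c3"]}
chunks = {"c1": b"...", "c2": b"...", "c3": b"..."}

collect_garbage(caches, nars, chunks, now=100, retention=50)
print(sorted(nars), sorted(chunks))  # → ['nar2'] ['c2', 'c3']
```

Chunk c2 survives even though nar1 is gone, because nar2 still references it.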

Summary

In just a few commands, we have:

  1. Set up a new Attic server and a binary cache
  2. Pushed store paths to it
  3. Configured Nix to use the new binary cache
  4. Generated access tokens that provide restricted access
  5. Made the cache public
  6. Performed garbage collection

What's next

Note: Attic is an early prototype and everything is subject to change! It may be full of holes and APIs may change without backward compatibility. You might even be required to reset the entire database. I would love to have people give it a try, but please keep that in mind :)

For a less temporary setup, you can set up atticd with PostgreSQL and S3. You should also place it behind a reverse proxy like NGINX to provide HTTPS. Take a look at ~/.config/attic/server.toml to see what you can configure!

While it's easy to get started by running atticd in monolithic mode, for production use it's best to run different components of atticd separately with --mode:

  • api-server: Stateless and can be replicated.
  • garbage-collector: Performs periodic garbage collection. Cannot be replicated.

User Guide

Logging in

You should have received an attic login command from an admin like the following:

attic login central https://attic.domain.tld/ eyJ...

The attic client can work with multiple servers at the same time. To select the foo cache from server central, use one of the following:

  • foo, if the central server is configured as the default
  • central:foo

To configure the default server, set default-server in ~/.config/attic/config.toml.
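
The naming rule can be sketched in a few lines of Python. This only illustrates the convention described above; the function name is hypothetical and the real client may handle edge cases differently.

```python
def resolve_cache(reference, default_server=None):
    """Split 'server:cache' or a bare 'cache' into a (server, cache) pair."""
    server, sep, cache = reference.partition(":")
    if sep:
        return server, cache
    if default_server is None:
        raise ValueError(f"no default server configured for cache {reference!r}")
    return default_server, reference


print(resolve_cache("central:foo"))                    # → ('central', 'foo')
print(resolve_cache("foo", default_server="central"))  # → ('central', 'foo')
```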

Enabling a cache

To configure Nix to automatically use cache foo:

attic use foo

Pushing to the cache

To push a store path to cache foo:

attic push foo /nix/store/...

Other examples include:

attic push foo ./result
attic push foo /run/current-system

Admin Guide

This section is under construction.

This section describes how to set up and administer an Attic Server. For a quick start, read the Tutorial.

Deploying to NixOS

Attic provides a NixOS module that allows you to deploy the Attic Server on a NixOS machine.

Prerequisites

  1. A machine running NixOS
  2. (Optional) A dedicated bucket on S3 or an S3-compatible storage service
  3. (Optional) A PostgreSQL database

Generating the Credentials File

The HS256 JWT secret can be generated with the openssl utility:

openssl rand 64 | base64 -w0

Create a file on the server containing the following contents:

ATTIC_SERVER_TOKEN_HS256_SECRET_BASE64="output from openssl"

Ensure the file is only accessible by root.
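
If openssl is not at hand, or your base64 does not accept -w0, the same kind of secret can be produced with Python's standard library. This is a sketch of an equivalent, not an official tool, and the helper names are made up:

```python
import base64
import os
import secrets


def make_credentials_line():
    """Build the env line from 64 random bytes, base64-encoded on one line."""
    secret = base64.b64encode(secrets.token_bytes(64)).decode("ascii")
    return f'ATTIC_SERVER_TOKEN_HS256_SECRET_BASE64="{secret}"\n'


def write_credentials(path):
    # A newly created file gets mode 0600 (owner read/write only).
    # The mode of an already existing file is left unchanged.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(make_credentials_line())


line = make_credentials_line()
print(line.split("=", 1)[0])  # print the variable name only, not the secret
```

Calling `write_credentials` as root with the path you intend to use for the credentials file writes a fresh secret there.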

Importing the Module

You can import the module in one of two ways:

  • Ad-hoc: Import nixos/atticd.nix from the repository.
  • Flakes: Add github:zhaofengli/attic as an input, then import attic.nixosModules.atticd.

Configuration

Note: These options are subject to change.

{
  services.atticd = {
    enable = true;

    # Replace with absolute path to your credentials file
    credentialsFile = "/etc/atticd.env";

    settings = {
      listen = "[::]:8080";

      # Data chunking
      #
      # Warning: If you change any of the values here, it will be
      # difficult to reuse existing chunks for newly-uploaded NARs
      # since the cutpoints will be different. As a result, the
      # deduplication ratio will suffer for a while after the change.
      chunking = {
        # The minimum NAR size to trigger chunking
        #
        # If 0, chunking is disabled entirely for newly-uploaded NARs.
        # If 1, all NARs are chunked.
        nar-size-threshold = 64 * 1024; # 64 KiB

        # The preferred minimum size of a chunk, in bytes
        min-size = 16 * 1024; # 16 KiB

        # The preferred average size of a chunk, in bytes
        avg-size = 64 * 1024; # 64 KiB

        # The preferred maximum size of a chunk, in bytes
        max-size = 256 * 1024; # 256 KiB
      };
    };
  };
}

After the new configuration is deployed, the Attic Server will be accessible on port 8080. It's highly recommended to place it behind a reverse proxy like NGINX to provide HTTPS.

Operations

The NixOS module installs the atticd-atticadm wrapper which runs the atticadm command as the atticd user. Use this command to generate new tokens to be distributed to users.

Chunking

Attic uses the FastCDC algorithm to split uploaded NARs into chunks for deduplication. There are four main parameters that control chunking in Attic:

  • nar-size-threshold: The minimum NAR size to trigger chunking
    • When set to 0, chunking is disabled entirely for newly-uploaded NARs
    • When set to 1, all newly-uploaded NARs are chunked
  • min-size: The preferred minimum size of a chunk, in bytes
  • avg-size: The preferred average size of a chunk, in bytes
  • max-size: The preferred maximum size of a chunk, in bytes
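
To get a feel for how these parameters interact, here is a deliberately simplified content-defined chunker in Python. It is not FastCDC and not what Attic runs: it uses a basic "gear" rolling hash with a made-up table, requires avg_size to be a power of two, and uses tiny sizes so the demo runs quickly.

```python
import hashlib
import random

# Fixed pseudo-random table for the rolling hash (seeded for reproducibility).
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def chunk(data, nar_size_threshold, min_size, avg_size, max_size):
    """Split data at content-defined cut points (simplified, not real FastCDC)."""
    # A threshold of 0 disables chunking; smaller NARs stay a single chunk.
    if nar_size_threshold == 0 or len(data) < nar_size_threshold:
        return [data]
    mask = avg_size - 1  # assumes avg_size is a power of two
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        # Cut when the hash matches the mask (past min_size) or at max_size.
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


rng = random.Random(1)
nar = bytes(rng.getrandbits(8) for _ in range(64 * 1024))  # stand-in "NAR"
params = dict(nar_size_threshold=1, min_size=256, avg_size=1024, max_size=4096)

a = chunk(nar, **params)
b = chunk(b"X" + nar, **params)  # same content, shifted by one byte
shared = {hashlib.sha256(c).digest() for c in a} & {hashlib.sha256(c).digest() for c in b}
print(len(a), len(b), len(shared))
```

Because cut points depend on the content rather than on fixed offsets, prepending a byte typically changes only the first chunk or so, and the rest can still be deduplicated. Changing any of the size parameters moves the cut points, which is why existing chunks become hard to reuse after a configuration change.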

Configuration

When upgrading from an older version without support for chunking, you must include the new [chunking] section:

# Data chunking
#
# Warning: If you change any of the values here, it will be
# difficult to reuse existing chunks for newly-uploaded NARs
# since the cutpoints will be different. As a result, the
# deduplication ratio will suffer for a while after the change.
[chunking]
# The minimum NAR size to trigger chunking
#
# If 0, chunking is disabled entirely for newly-uploaded NARs.
# If 1, all newly-uploaded NARs are chunked.
nar-size-threshold = 131072 # chunk files that are 128 KiB or larger

# The preferred minimum size of a chunk, in bytes
min-size = 65536            # 64 KiB

# The preferred average size of a chunk, in bytes
avg-size = 131072           # 128 KiB

# The preferred maximum size of a chunk, in bytes
max-size = 262144           # 256 KiB

FAQs

Does it replace Cachix?

No, it does not. Cachix is an awesome product and the direct inspiration for the user experience of Attic. It works at a much larger scale than Attic and is a proven solution. Numerous open-source projects in the Nix community (including mine!) use Cachix to share publicly-available binaries.

Attic can be thought of as providing a similar user experience at a much smaller scale (personal or team use).

What happens if a user uploads a path that is already in the global cache?

The user will still fully upload the path to the server because they have to prove possession of the file. The difference is that instead of having the upload streamed to the storage backend (e.g., S3), it's only run through a hash function and discarded. Once the NAR hash is confirmed, a mapping is created to grant the local cache access to the global NAR. The global deduplication behavior is transparent to the client.

This requirement may be disabled by setting require-proof-of-possession to false in the configuration. When disabled, uploads of NARs that already exist in the Global NAR Store will immediately succeed.
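
The flow can be sketched as a toy upload handler in Python. This is an illustration, not the server's code: the function and its arguments are hypothetical, SHA-256 stands in for the NAR hash, and `body=None` models a client that sends no NAR data.

```python
import hashlib


def handle_upload(global_nars, cache, store_path, claimed_hash, body,
                  require_proof_of_possession=True):
    """Toy upload handler. `body` is None if the client sent no NAR data."""
    if claimed_hash in global_nars and not require_proof_of_possession:
        # Fast path: trust the claimed hash and only create the mapping.
        cache[store_path] = claimed_hash
        return "deduplicated without upload"

    if body is None:
        raise ValueError("NAR data required")
    actual_hash = hashlib.sha256(body).hexdigest()
    if actual_hash != claimed_hash:
        raise ValueError("NAR hash mismatch")

    if actual_hash in global_nars:
        # Possession proven: the body was only hashed and is now discarded.
        cache[store_path] = actual_hash
        return "deduplicated after hashing"

    global_nars[actual_hash] = body  # first copy: actually store it
    cache[store_path] = actual_hash
    return "stored"


nar = b"example NAR bytes"
digest = hashlib.sha256(nar).hexdigest()
global_nars, alice, bob, carol = {}, {}, {}, {}
path = "/nix/store/aaa-demo"  # hypothetical store path

print(handle_upload(global_nars, alice, path, digest, nar))
print(handle_upload(global_nars, bob, path, digest, nar))
print(handle_upload(global_nars, carol, path, digest, None,
                    require_proof_of_possession=False))
```

The three calls print "stored", "deduplicated after hashing" and "deduplicated without upload" respectively.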

What happens if a user uploads a path with incorrect/malicious metadata?

They will only pollute their own cache. Path metadata (store path, references, deriver, etc.) is associated with the local cache, while the global cache only contains content-addressed NARs and chunks that are "context-free."

How is authentication handled?

Authentication is done via signed JWTs containing the allowed permissions. Each instance of atticd --mode api-server is stateless. This design may be revisited later, with an option for a more stateful method of authentication.

On what granularity is deduplication done?

Global deduplication is done on two levels: NAR files and chunks. During an upload, the NAR file is split into chunks using the FastCDC algorithm. Identical chunks are only stored once in the storage backend. If an identical NAR exists in the Global NAR Store, chunking is skipped and the NAR is directly deduplicated.

During a download, atticd reassembles the entire NAR from constituent chunks by streaming from the storage backend.

Data chunking is optional and can be disabled entirely for NARs smaller than a threshold. When chunking is disabled, all new NARs are uploaded as a single chunk and NAR-level deduplication is still in effect.
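
As a rough picture of chunk-level deduplication and reassembly, consider the Python sketch below. It makes simplifying assumptions: chunks are cut at a fixed size as a stand-in for FastCDC, SHA-256 stands in for the chunk hash, and a dictionary plays the role of the storage backend.

```python
import hashlib


def store_nar(backend, nar, chunk_size=4):
    """Store a NAR as chunks and return its ordered list of chunk hashes."""
    hashes = []
    for i in range(0, len(nar), chunk_size):  # fixed-size stand-in for FastCDC
        piece = nar[i:i + chunk_size]
        digest = hashlib.sha256(piece).hexdigest()
        backend.setdefault(digest, piece)  # identical chunks are stored once
        hashes.append(digest)
    return hashes


def stream_nar(backend, hashes):
    """Yield the NAR chunk by chunk instead of loading it all at once."""
    for digest in hashes:
        yield backend[digest]


backend = {}
first = store_nar(backend, b"AAAABBBBCCCC")
second = store_nar(backend, b"AAAAXXXXCCCC")
print(len(backend))                               # → 4
print(b"".join(stream_nar(backend, second)))      # → b'AAAAXXXXCCCC'
```

The two NARs have six chunks between them, but only four distinct ones are stored.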

Why chunk NARs instead of individual files?

In the current design, chunking is applied to the entire uncompressed NAR file instead of individual constituent files in the NAR. Big NARs that benefit the most from chunk-based deduplication (e.g., VSCode, Zoom) often have hundreds or thousands of small files. During NAR reassembly, it's often uneconomical or impractical to fetch thousands of files to reconstruct the NAR in a scalable way. By chunking the entire NAR, it's possible to configure the average chunk size to a larger value, ignoring file boundaries and lumping small files together. This is also the approach casync has taken.

You may have heard that the Tvix store protocol chunks individual files instead of the NAR. The design of Attic is driven by the desire to effectively utilize existing platforms with practical limitations, while looking forward to the future.

What happens if a chunk is corrupt/missing?

When a chunk is deleted from the database, all dependent NARs become unavailable (HTTP 503). However, this is recovered from automatically when any NAR containing the chunk is uploaded.

At the moment, Attic cannot automatically detect when a chunk is corrupt or missing. Correctly distinguishing between transient and persistent failures is difficult. The atticadm utility will have the functionality to kill/delete bad chunks.

How is compression handled?

Uploaded NARs are chunked then compressed on the server before being streamed to the storage backend. On the chunk level, we use the hash of the uncompressed chunk to perform global deduplication.

                        ┌───────────────────────────────────►Chunk Hash
                        │
                        │
                        ├───────────────────────────────────►Chunk Size
                        │
                ┌───────┴────┐  ┌──────────┐  ┌───────────┐
 Chunk Stream──►│Chunk Hasher├─►│Compressor├─►│File Hasher├─►File Stream─►S3
                └────────────┘  └──────────┘  └─────┬─────┘
                                                    │
                                                    ├───────►File Hash
                                                    │
                                                    │
                                                    └───────►File Size
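
The diagram can be approximated in a few lines of Python. This is a non-streaming sketch under stated assumptions: zlib stands in for whichever compressor the server is configured with, SHA-256 stands in for both hashers, and the function name is made up.

```python
import hashlib
import zlib


def process_chunk(chunk):
    """Hash the raw chunk, compress it, then hash the compressed file."""
    chunk_hash = hashlib.sha256(chunk).hexdigest()  # key for deduplication
    compressed = zlib.compress(chunk)               # stand-in compressor
    file_hash = hashlib.sha256(compressed).hexdigest()
    return {
        "chunk_hash": chunk_hash,
        "chunk_size": len(chunk),
        "file_hash": file_hash,
        "file_size": len(compressed),
        "file": compressed,  # what would be streamed to the storage backend
    }


record = process_chunk(b"hello attic " * 1000)
print(record["chunk_size"], record["file_size"])
```

Because the chunk hash is taken before compression, two identical chunks deduplicate regardless of how they end up being compressed.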

Reference

This section contains detailed listings of options and parameters accepted by Attic:

attic CLI

The following are the help messages that will be printed when you invoke any sub-command with --help:

attic

Attic binary cache client

Usage: attic <COMMAND>

Commands:
  login        Log into an Attic server
  use          Configure Nix to use a binary cache
  push         Push closures to a binary cache
  cache        Manage caches on an Attic server
  watch-store  Watch the Nix Store for new paths and upload them to a binary cache
  help         Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help
  -V, --version  Print version

attic login

Log into an Attic server

Usage: attic login [OPTIONS] <NAME> <ENDPOINT> [TOKEN]

Arguments:
  <NAME>      Name of the server
  <ENDPOINT>  Endpoint of the server
  [TOKEN]     Access token

Options:
      --set-default  Set the server as the default
  -h, --help         Print help
  -V, --version      Print version

attic use

Configure Nix to use a binary cache

Usage: attic use <CACHE>

Arguments:
  <CACHE>
          The cache to configure. This can be either `servername:cachename` or `cachename` when using the default server.

Options:
  -h, --help
          Print help (see a summary with '-h')
  -V, --version
          Print version

attic push

Push closures to a binary cache

Usage: attic push [OPTIONS] <CACHE> [PATHS]...

Arguments:
  <CACHE>
          The cache to push to. This can be either `servername:cachename` or `cachename` when using the default server.
  [PATHS]...
          The store paths to push

Options:
      --no-closure
          Push the specified paths only and do not compute closures
      --ignore-upstream-cache-filter
          Ignore the upstream cache filter
  -j, --jobs <JOBS>
          The maximum number of parallel upload processes [default: 5]
  -h, --help
          Print help (see a summary with '-h')
  -V, --version
          Print version

attic watch-store

Watch the Nix Store for new paths and upload them to a binary cache

Usage: attic watch-store [OPTIONS] <CACHE>

Arguments:
  <CACHE>
          The cache to push to. This can be either `servername:cachename` or `cachename` when using the default server.

Options:
      --ignore-upstream-cache-filter
          Ignore the upstream cache filter
  -j, --jobs <JOBS>
          The maximum number of parallel upload processes [default: 5]
  -h, --help
          Print help (see a summary with '-h')
  -V, --version
          Print version

attic cache

Manage caches on an Attic server

Usage: attic cache <COMMAND>

Commands:
  create     Create a cache
  configure  Configure a cache
  destroy    Destroy a cache
  info       Show the current configuration of a cache
  help       Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help
  -V, --version  Print version

attic cache create

Create a cache. You need the `create_cache` permission on the cache that you are creating.

Usage: attic cache create [OPTIONS] <CACHE>

Arguments:
  <CACHE>
          Name of the cache to create. This can be either `servername:cachename` or `cachename` when using the default server.

Options:
      --public
          Make the cache public. Public caches can be pulled from by anyone without a token. Only those with the `push` permission can push. By default, caches are private.
      --priority <PRIORITY>
          The priority of the binary cache. A lower number denotes a higher priority. <https://cache.nixos.org> has a priority of 40. [default: 41]
      --upstream-cache-key-name <NAME>
          The signing key name of an upstream cache. When pushing to the cache, paths signed with this key will be skipped by default. Specify this flag multiple times to add multiple key names. [default: cache.nixos.org-1]
  -h, --help
          Print help (see a summary with '-h')
  -V, --version
          Print version

attic cache configure

Configure a cache. You need the `configure_cache` permission on the cache that you are configuring.

Usage: attic cache configure [OPTIONS] <CACHE>

Arguments:
  <CACHE>
          Name of the cache to configure

Options:
      --regenerate-keypair
          Regenerate the signing keypair. The server-side signing key will be regenerated and all users will need to configure the new signing key in `nix.conf`.
      --public
          Make the cache public. Use `--private` to make it private.
      --private
          Make the cache private. Use `--public` to make it public.
      --priority <PRIORITY>
          The priority of the binary cache. A lower number denotes a higher priority. <https://cache.nixos.org> has a priority of 40.
      --upstream-cache-key-name <NAME>
          The signing key name of an upstream cache. When pushing to the cache, paths signed with this key will be skipped by default. Specify this flag multiple times to add multiple key names.
      --retention-period <PERIOD>
          Set the retention period of the cache. You can use expressions like "2 years", "3 months" and "1y".
      --reset-retention-period
          Reset the retention period of the cache to global default
  -h, --help
          Print help (see a summary with '-h')
  -V, --version
          Print version

attic cache destroy

Destroy a cache. Destroying a cache causes it to become unavailable but the underlying data may not be deleted immediately. Depending on the server configuration, you may or may not be able to create the cache of the same name. You need the `destroy_cache` permission on the cache that you are destroying.

Usage: attic cache destroy [OPTIONS] <CACHE>

Arguments:
  <CACHE>
          Name of the cache to destroy

Options:
      --no-confirm
          Don't ask for interactive confirmation
  -h, --help
          Print help (see a summary with '-h')
  -V, --version
          Print version

attic cache info

Show the current configuration of a cache

Usage: attic cache info <CACHE>

Arguments:
  <CACHE>  Name of the cache to query

Options:
  -h, --help     Print help
  -V, --version  Print version

atticd CLI

The following are the help messages that will be printed when you invoke any sub-command with --help:

atticd

Nix binary cache server

Usage: atticd [OPTIONS]

Options:
  -f, --config <CONFIG>
          Path to the config file
  -l, --listen <LISTEN>
          Socket address to listen on. This overrides `listen` in the config.
      --mode <MODE>
          Mode to run [default: monolithic]
          Possible values:
          - monolithic:             Run all components
          - api-server:             Run the API server
          - garbage-collector:      Run the garbage collector periodically
          - db-migrations:          Run the database migrations then exit
          - garbage-collector-once: Run garbage collection then exit
          - check-config:           Check the configuration then exit
      --tokio-console
          Whether to enable tokio-console. The console server will listen on its default port.
  -h, --help
          Print help (see a summary with '-h')
  -V, --version
          Print version

atticadm CLI

The following are the help messages that will be printed when you invoke any sub-command with --help:

atticadm

Attic server administration utilities

Usage: atticadm [OPTIONS] <COMMAND>

Commands:
  make-token  Generate a new token
  help        Print this message or the help of the given subcommand(s)

Options:
  -f, --config <CONFIG>  Path to the config file
  -h, --help             Print help
  -V, --version          Print version

atticadm make-token

Generate a new token.

For example, to generate a token for Alice with read-write access to any cache starting with `dev-` and read-only access to `prod`, expiring in 2 years:

$ atticadm make-token --sub "alice" --validity "2y" --pull "dev-*" --push "dev-*" --pull "prod"

Usage: atticadm make-token [OPTIONS] --sub <SUB> --validity <VALIDITY>

Options:
  -f, --config <CONFIG>
          Path to the config file
      --sub <SUB>
          The subject of the JWT token
      --validity <VALIDITY>
          The validity period of the JWT token. You can use expressions like "2 years", "3 months" and "1y".
      --dump-claims
          Dump the claims without signing and encoding it
      --pull <PATTERN>
          A cache that the token may pull from. The value may contain wildcards. Specify this flag multiple times to allow multiple patterns.
      --push <PATTERN>
          A cache that the token may push to. The value may contain wildcards. Specify this flag multiple times to allow multiple patterns.
      --delete <PATTERN>
          A cache that the token may delete store paths from. The value may contain wildcards. Specify this flag multiple times to allow multiple patterns.
      --create-cache <PATTERN>
          A cache that the token may create. The value may contain wildcards. Specify this flag multiple times to allow multiple patterns.
      --configure-cache <PATTERN>
          A cache that the token may configure. The value may contain wildcards. Specify this flag multiple times to allow multiple patterns.
      --configure-cache-retention <PATTERN>
          A cache that the token may configure retention/quota for. The value may contain wildcards. Specify this flag multiple times to allow multiple patterns.
      --destroy-cache <PATTERN>
          A cache that the token may destroy. The value may contain wildcards. Specify this flag multiple times to allow multiple patterns.
  -h, --help
          Print help (see a summary with '-h')
  -V, --version
          Print version