Understanding Nix: A Purely Functional Approach to Package Management
This article is based on the presentation I made a while ago about Nix and NixOS. The presentation can be seen here.
Over the past few years, I've been exploring different approaches to managing software environments and deployments. Among them, Nix has stood out as a fascinating solution that takes a fundamentally different approach. In this post, I'll share what I've learned about Nix and why it might be worth your attention.
What is Nix?
Nix is simultaneously three interconnected things:
- A domain-specific language with functional programming characteristics
- A powerful package manager with unique properties
- The main pillar of NixOS, a Linux distribution built around Nix principles
Origins
Nix began as the PhD project of Eelco Dolstra, crystallized in these academic papers:
- Imposing a Memory Management Discipline on Software Deployment (2004)
- The Purely Functional Software Deployment Model (2006)
Initially, Nix was intended to be an alternative to traditional build systems like make and package managers like rpm. However, it has evolved into something much more comprehensive.
For those interested in a deeper historical perspective, I recommend watching Eelco Dolstra's talk on The Evolution of Nix.
Problems Nix Attempts to Solve
Traditional package managers face several persistent challenges that anyone who has managed systems will recognize:
- DLL hell - conflicting shared libraries causing application failures
- Destructive upgrades - installing a new version overwrites the previous one
- No rollbacks - difficult or impossible to return to previous states
- Not atomic - system left in inconsistent state if updates are interrupted
- Hard to prevent undeclared dependencies - applications might depend on libraries that aren't explicitly listed
These issues stem from the fundamental design of how software is installed and managed in conventional systems.
Understanding the Traditional Approach: Filesystem Hierarchy Standard
To understand what makes Nix different, let's first look at how traditional systems handle software installation. In standard Linux distributions, packages install files in fixed locations according to the Filesystem Hierarchy Standard:
# list of files installed on filesystem by wget pkg
Programs often rely on dynamically linked libraries that are shared across the system:
# dynamically linked libraries
Filesystem as Memory: A Novel Approach
One of Nix's breakthrough insights was drawing a parallel between memory management in programming languages and package management in operating systems. Dolstra visualized the filesystem as analogous to program memory.
Here's how concepts map between programming languages and deployment systems:
Programming Language Domain | Deployment Domain |
---|---|
memory | disk |
value, object | file |
address | path name |
pointer dereference | file access |
pointer arithmetic | string operations |
dangling pointer | path to absent file |
object graph | dependency graph |
calling constructed object with reference to other object | runtime dependency |
calling constructor with reference to other object, not stored | build-time dependency |
calling constructor with reference to other object, stored | retained dependency |
languages without pointer discipline (e.g. assembler) | typical Unix-style deployment |
languages with enough pointer discipline to support conservative garbage collection (e.g. C, C++) | Nix |
languages with full pointer discipline (e.g. Java, Haskell) | as-yet unknown deployment style not enabled by contemporary operating systems |
This model suggests viewing the file system as memory and package management as memory management. That reshapes how we can think about software deployment.
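The "conservative garbage collection" row is worth a concrete picture. A conservative collector treats anything in memory that looks like a pointer as a pointer; Nix does something similar by scanning build outputs for the hash part of store paths. The following is only a toy Python sketch of that idea: the function name, the fake hashes, and the byte blob are all made up for illustration, and real Nix scans files in the store rather than an in-memory string.

```python
def scan_references(blob: bytes, candidate_hashes: set) -> set:
    """Return the candidate hashes that occur anywhere in the given bytes."""
    return {h for h in candidate_hashes if h.encode() in blob}


# Hypothetical 32-character hash parts of two store paths.
openssl_hash = "a" * 32
zlib_hash = "b" * 32

# Hypothetical build output that embeds one store path, e.g. in an RPATH.
binary = (
    b"\x7fELF...RPATH=/nix/store/"
    + openssl_hash.encode()
    + b"-openssl/lib\x00"
)

retained = scan_references(binary, {openssl_hash, zlib_hash})
print(retained == {openssl_hash})  # only the embedded path counts as a reference
```

A path that was available at build time but left no trace in the output (zlib in this sketch) is a build-time dependency only, which matches the "not stored" row in the table.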
Isolation and Reliable Identification
One of the challenges in package management is avoiding file path collisions. When multiple packages want to install files to the same location, conflicts arise. Traditional approaches have limitations:
- Using component name and version alone isn't sufficient (what about different build options?)
- Random allocation would be inefficient and generate duplicates
The Hash Solution
Nix's approach is to compute a cryptographic hash of all inputs that affect a package, including:
- The sources of the components
- The build script that performed the build
- Any arguments or environment variables passed to the build script
- All build time dependencies, including compilers, linkers, libraries, standard Unix tools, shells, etc.
This hash becomes part of the path where the package is stored, ensuring that different versions or builds of the same package can coexist without conflicts.
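The idea can be sketched in a few lines of Python. This is a simplified model, not Nix's actual algorithm: Nix hashes a derivation and uses its own base-32 encoding, while this sketch hashes a JSON dump and keeps 32 hex characters. The function name, package names, and inputs are assumptions chosen for illustration.

```python
import hashlib
import json


def store_path(name, sources, build_script, env, build_deps):
    """Derive a store-style path from everything that influences the build."""
    inputs = {
        "sources": sources,
        "build_script": build_script,
        "env": env,
        "build_deps": sorted(build_deps),
    }
    # Canonical serialization so that equal inputs always hash identically.
    blob = json.dumps(inputs, sort_keys=True).encode()
    digest = hashlib.sha256(blob).hexdigest()[:32]
    return f"/nix/store/{digest}-{name}"


# Hypothetical package description.
base = dict(
    name="hello-2.12",
    sources=["hello-2.12.tar.gz"],
    build_script="./configure && make install",
    env={"CFLAGS": "-O2"},
    build_deps=["gcc-12", "glibc-2.37"],
)
# Same package and version, but built with different compiler flags.
debug = {**base, "env": {"CFLAGS": "-O0 -g"}}

print(store_path(**base) == store_path(**base))   # same inputs, same path
print(store_path(**base) == store_path(**debug))  # different flags, different path
```

Because the name and version are identical in both calls, only the hash prefix tells the two builds apart, which is exactly why name and version alone are not enough.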
The Nix Store
The cornerstone of Nix is the /nix/store directory, where all packages are stored using their unique hash identifiers. For example:
Notice how each library has its own unique path in the store. This isolation prevents conflicts between different versions of libraries and enables multiple versions to coexist peacefully.
Closures: Complete Dependency Tracking
A "closure" in Nix represents the complete set of dependencies required by a package. Using cryptographic hashes allows Nix to identify the exact build and runtime dependencies of any package:
We can even visualize the dependency graph of a package:
This will generate a graph showing all relationships between packages:
The beauty of closures is that they can be distributed across hosts, which enables powerful distributed build and cache systems:
This is the foundation for Nix's distributed builds and binary cache systems.
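Conceptually, a closure is just the set of everything reachable from a package through its references. Here is a minimal Python sketch of that traversal; the dependency graph is invented for illustration and does not reflect the real dependencies of these packages.

```python
def closure(package, references):
    """Return the package plus everything reachable through its references."""
    seen = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(references.get(current, ()))
    return seen


# Hypothetical reference graph: package -> packages it refers to.
references = {
    "wget": ["openssl", "zlib", "glibc"],
    "openssl": ["glibc"],
    "zlib": ["glibc"],
    "glibc": [],
    "cowsay": ["perl"],
    "perl": ["glibc"],
}

print(sorted(closure("wget", references)))
# ['glibc', 'openssl', 'wget', 'zlib']
```

Copying exactly this set of paths to another machine is enough to run the package there, since nothing outside the closure is referenced.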
Garbage Collection
Since the Nix store can accumulate packages over time, Nix provides a garbage collection mechanism to reclaim space by removing packages that aren't referenced by any active profiles:
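The collection itself can be pictured as a reachability check: anything in the store that no root (for example, an active profile) can reach is garbage. The sketch below is a toy Python model under that assumption, with an invented store and reference graph; it only computes what would be deleted and does not touch any files.

```python
def reachable(roots, references):
    """Return every path reachable from the given roots."""
    live, stack = set(), list(roots)
    while stack:
        path = stack.pop()
        if path not in live:
            live.add(path)
            stack.extend(references.get(path, ()))
    return live


def collect_garbage(store, references, roots):
    """Return the store paths that no root can reach (safe to delete)."""
    return set(store) - reachable(roots, references)


# Hypothetical store contents and references between them.
references = {
    "wget": ["openssl", "glibc"],
    "openssl": ["glibc"],
    "glibc": [],
    "cowsay": ["perl"],
    "perl": ["glibc"],
}
store = set(references)
roots = {"wget"}  # e.g. the only package in the active profile

print(sorted(collect_garbage(store, references, roots)))
# ['cowsay', 'perl']
```

Note that glibc survives even though perl referenced it, because wget still reaches it.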
The Nix Expression Language
Nix has its own domain-specific language for defining packages and configurations. It has three key characteristics:
- Purely functional - functions always produce the same output given the same input
- Domain specific - designed specifically for describing packages and their dependencies
- Lazy evaluation - expressions are only evaluated when their results are needed
Syntax Basics
Let's explore some basic syntax of the Nix language:
# operators
1 + 2
> 3
[ 1 2 ] ++ [ 3 ]
> [ 1 2 3 ]
# let ... in ..., allow repeated use of variables in scope
# string interpolation
# nix-repl>
let
name = "World";
in
"hello ${name}!"
> hello World!
# attribute set, attributes accessible by '.'
# with ..., expose attributes directly
let
attrs = { a = "str"; b = false; i = 3; };
in
with attrs; [ a attrs.b i ]
> [ "str" false 3 ]
We can merge attribute sets and use the inherit keyword to bring variables into scope:
# merging attr sets
# dynamic typing
let
attrs1 = { a = "str"; b = false; };
attrs2 = { b = 10; i = 3; };
in
attrs1 // attrs2
> { a = "str"; b = 10; i = 3; }
# inherit, assign existing values in nested scope
let
x = { b = 1; };
y = 2;
z = false;
in
{
inherit x y;
z = z;
}
> { x = { b = 1; }; y = 2; z = false; }
Functions
In Nix, functions are nameless (lambdas) and always take exactly one argument:
# argument: function body
let
f = x: x + 1;
in
{
type = builtins.typeOf f;
result = f 1;
}
> { result = 2; type = "lambda"; }
# nested functions, x: (y: x + y)
let
f = x: y: x + y;
in
f 1 2
> 3
Functions can also take attribute sets as arguments, with optional default values:
# attr set as argument, defined attr must be passed
# ?, default value
# ... , extra attrs
# @name, named attr set
let
f = {a, b ? 1, ...}@args: a + b + args.c;
in
f { a = 1; c = 1; }
> 3
Domain-Specific Features
Nix has some surprising behaviors that follow from its domain-specific nature. For example, a slash written without surrounding spaces is read as part of a path rather than as a division operator (write 6 / 3 to divide):
6/3
> /Users/mskalski/org/6/3
let
r = 6/3;
in
builtins.typeOf r
> "path"
Lazy Evaluation
Nix uses lazy evaluation, meaning expressions are only evaluated when their results are needed:
let
f = builtins.fetchurl "http://127.0.0.1:8000/f";
b = 3;
in
b
> 3 # no request has been made to http server
In this example, f is bound to a call that would fetch a URL, but no network request is made because the value of f is never needed.
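For readers who think more easily in an eager language, the same behavior can be imitated in Python with explicit thunks. This is only an analogy, not how the Nix evaluator is implemented; the Thunk class and fake_fetch function are made up for this sketch, and no real network access happens.

```python
class Thunk:
    """A delayed computation that runs at most once, on first use."""

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            self._value = self._compute()
            self._done = True
            self._compute = None  # the computation is no longer needed
        return self._value


calls = []


def fake_fetch(url):
    # Stand-in for a network fetch; it only records that it was invoked.
    calls.append(url)
    return f"contents of {url}"


f = Thunk(lambda: fake_fetch("http://127.0.0.1:8000/f"))
b = Thunk(lambda: 3)

print(b.force())  # 3
print(calls)      # [] - f was never forced, so nothing was fetched
```

Forcing f later would run the fetch exactly once and cache the result for every subsequent use.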
Practical Nix: What Can I Do With It?
Now that we understand the basics, let's explore some practical applications of Nix.
Task Shells: Ephemeral Environments
One of my favorite features is creating ephemeral shells with specific packages:
~ ❯ cowsay "nix is awesome!"
Unknown command: cowsay
~ ❯ nix-shell -p cowsay
[nix-shell:~]$ cowsay "nix is awesome!"
 _________________
< nix is awesome! >
 -----------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
[nix-shell:~]$ exit
~ ❯ cowsay "nix is awesome!"
Unknown command: cowsay
This makes it incredibly easy to try out packages without permanently installing them. You can even create ad-hoc environments with specific Python modules:
Profiles: Persistent Environments with History
Profiles allow you to maintain persistent environments with full rollback capability and atomic updates:
# Install btop package in user environment (new generation)
# Compare changes between generations
# Revert to previous generation
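Why are profile updates atomic and rollbacks cheap? A profile is essentially a symlink pointing at the current generation, so switching or reverting only means repointing that link. The Python sketch below imitates this in a temporary directory. It is a simplified model: the directory names and the switch_profile helper are assumptions for illustration, and real generations are themselves links into the Nix store.

```python
import os
import tempfile


def switch_profile(profile_link, target):
    """Point profile_link at target by renaming a fresh symlink over it."""
    tmp = profile_link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target, tmp)
    # rename(2) replaces the old link in a single step on POSIX systems,
    # so observers see either the old generation or the new one.
    os.replace(tmp, profile_link)


root = tempfile.mkdtemp()
gen1 = os.path.join(root, "generation-1")
gen2 = os.path.join(root, "generation-2")
os.mkdir(gen1)
os.mkdir(gen2)
profile = os.path.join(root, "profile")

switch_profile(profile, gen1)        # initial generation
switch_profile(profile, gen2)        # "install": new generation becomes current
print(os.readlink(profile) == gen2)  # True

switch_profile(profile, gen1)        # rollback: the old generation still exists
print(os.readlink(profile) == gen1)  # True
```

Nothing is overwritten during an upgrade, which is what makes the previous generation available for rollback until it is garbage collected.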
Flakes: Reproducible Definitions
Flakes introduce flake.nix and flake.lock files to provide clear definitions of inputs and their versions. Think of them as similar to package.json and package-lock.json in the Node.js ecosystem, but more powerful.
Here is a short recording of me playing with nix flakes:
Composing Projects
Flakes can produce a variety of outputs: binaries, container images, development environments, and more:
Building a binary from a flake is straightforward:
# by default it produces a 'result' symlink in the current directory
You can also build container images:
# load image to local docker instance
# but docker is not needed
Development Shells
Flakes can define development environments where all dependencies for your application are available, providing a consistent experience for all developers:
These development shells are typically defined in the flake.nix file:
# flake.nix
For an even more seamless experience, you can use direnv, which many code editors also understand:
Overlays
If your project depends on a specific version of a system library or needs extra patches, you can easily modify packages at your project level using overlays:
NixOS: One System to Rule Them All
While Nix itself is a powerful package manager, NixOS takes the concept further by applying the same principles to the entire operating system configuration.
What Makes NixOS Special?
NixOS is fully integrated with Nix, from both the package and the configuration standpoint. By default, it uses "channels" as a source of package versions:
- Stable channels are released every six months (for example, 22.11, 23.05)
- Unstable channel is a rolling release with the latest packages
Declarative System Configuration
The entire system can be described through declarative configuration. This means you can:
- Store your configuration in a repository
- Make it a flake and describe multiple hosts as separate outputs
- Share common configurations between hosts, making your setup more modular
Safe System Changes
Before applying changes to your system, you can test them in a virtual machine:
# will start local vm with current system configuration
Every package installation in the system profile creates a new "generation" and an entry for it in the bootloader. If your system doesn't start after an upgrade, you can simply boot from a previous generation.
If you need to reinstall your system, you can boot from a live CD and restore it from an existing configuration with a single command, or generate a disk image ahead of time.
Alternatives for Other Platforms
If you're not ready to switch to NixOS, there are options for other operating systems:
- nix-darwin tries to replicate NixOS behavior on macOS
- Home Manager can be used standalone or as a module for NixOS or nix-darwin and provides a rich library of software configurations
Additional Resources
If you're interested in learning more about Nix, here are some valuable resources:
For those interested in build systems more generally:
Conclusion
Nix represents a paradigm shift in how we think about package management and system configuration. By treating the filesystem as memory and applying functional programming principles to deployment, it solves many long-standing issues in software management.
While the learning curve can be steep, the benefits - reproducibility, atomic upgrades, rollbacks, and isolation - make it worth considering for both development environments and production systems.