Problem and Motivation
I was writing the same Ansible playbooks every time I spun up a new VPS. New DigitalOcean droplet, SSH in, install Docker, set up UFW, configure fail2ban, create a deploy user with SSH keys, enable unattended upgrades. Same twenty steps, same YAML, slightly different config values. And every time I had to remember which Ansible Galaxy collections I needed, whether my local Python version matched what the playbook expected, and why `pip install ansible` was broken again.
I wanted something simpler. A single binary I could scp onto a fresh server, point at a YAML config, and have the box ready in five minutes. No Python runtime, no collections, no inventory files. Just the stuff I actually configure on every VPS, packaged into modules I can compose with profiles.
Architecture Overview
Phanes compiles to one binary with zero runtime dependencies. You drop it on a server, give it a config file and a profile name, and it runs through a list of modules in order. Each module knows how to check whether it's already been applied (IsInstalled()) and how to apply itself (Install()). The runner calls IsInstalled() first and skips anything already done. That's the entire idempotency model, no state file, no external tracking.
The module system is the whole thing. Every module implements a 4-method interface: Name(), Description(), IsInstalled(), and Install(). The runner loops through the requested modules, checks each one, installs what's missing, and prints a summary table at the end with color-coded status. 14 modules right now: baseline system settings, user creation, SSH hardening, UFW/fail2ban, Docker, Postgres, Redis, Caddy, Nginx, Netdata monitoring, swap, auto-updates, dev tools, and Tailscale.
Profiles sit on top as named bundles. `phanes --profile dev` gives you baseline + user + security + swap + updates + docker + monitoring + devtools. `phanes --profile web` swaps devtools for caddy. You can combine a profile with extra modules if you need web plus postgres. Config values like SSH port, swap size, and Postgres version all come from a single YAML file with sensible defaults for anything you skip.
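A config file along these lines is what the description implies. The key names here are illustrative guesses, not the actual schema; the point is that every value has a default and you only write what you change.

```yaml
# Hypothetical phanes.yml — key names are illustrative, not the real schema.
security:
  ssh_port: 2222
swap:
  size: 2G
postgres:
  version: "16"
```

Anything omitted (Redis settings, Caddy domains, and so on) would fall back to the built-in defaults.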
Key Technical Decisions and Tradeoffs
The module interface being only four methods was deliberate. I could've added DependsOn(), Rollback(), Validate(). But the whole point is that writing a new module should take 30 minutes, not a day. Implement the interface, register it in main.go, done. The tradeoff is that module ordering is implicit in profiles rather than declared as a dependency graph, so if someone builds a custom module list in the wrong order, things break. Profiles handle the common cases and the docs cover the rest. Good enough.
Go's //go:embed directive turned out to be perfect for config file templates. The security module embeds sshd_config.tmpl and jail.local.tmpl directly into the binary, renders them with text/template at runtime using values from the YAML config, and writes them to disk. No external template files to ship, no path resolution headaches.
```go
//go:embed sshd_config.tmpl
var sshdConfigTemplate string

//go:embed jail.local.tmpl
var jailLocalTemplate string

func (m *SecurityModule) Install(cfg *config.Config) error {
	rendered, err := renderTemplate(sshdConfigTemplate, cfg.Security)
	if err != nil {
		return fmt.Errorf("failed to render sshd config: %w", err)
	}
	if err := exec.WriteFile("/etc/ssh/sshd_config", rendered, 0600); err != nil {
		return fmt.Errorf("failed to write sshd config: %w", err)
	}
	// ...
	return nil
}
```
The dependency story is something I'm genuinely happy about. Three external packages total: cobra for CLI parsing, zerolog for structured logging, and gopkg.in/yaml.v3 for config parsing. That's the entire go.mod. I didn't want to fight dependency upgrades or security advisories in transitive deps I've never heard of.
The biggest tradeoff is Ubuntu/Debian only. Every module shells out to apt-get, and the detection logic assumes systemd and ufw. Supporting RHEL or Alpine would mean abstracting the package manager and adding conditionals to every module. Not worth it when almost every VPS provider defaults to Ubuntu anyway.
Screenshots and Video
No screenshots yet. Planning to capture a provisioning run showing the module summary table with color-coded install/skip/error status.
Tech Stack with Rationale
- Go 1.25 for the single-binary story. No runtime, no interpreter, no "install Go on the target server." Cross-compilation to linux/amd64 and linux/arm64 is built into the toolchain, so the release pipeline is just `GOOS=linux GOARCH=arm64 go build` and you've got a working ARM binary.
- Cobra for CLI flag parsing. It's the standard Go CLI library and it handles `--profile`, `--modules`, `--config`, `--dry-run`, and `--list` without me writing argument parsing code. Does what it says.
- zerolog for structured, colored terminal output. Green for installed, yellow for skipped, red for failures. I wanted provisioning output to be scannable at a glance, and zerolog's level-based coloring handles that.
- YAML for configuration because it's what people expect for server config. I considered TOML but YAML is more familiar to anyone who's touched Ansible or Docker Compose. Path of least resistance.
Challenges and Learnings
Testing system-level provisioning modules was the hardest part of the whole project. Unit tests can mock the exec package, but that only tests logic, not whether the actual `apt-get install docker-ce` sequence works on a real system. I ended up with three test tiers: unit tests with mocked commands, E2E tests in Docker containers (using `docker-compose.test.yml` with an Ubuntu 22.04 image), and full VM tests with Vagrant for modules that need real systemd. The Docker E2E tests are fast but can't test anything that requires `systemctl`. The Vagrant tests are slow, full VM boot, but they're the only way to verify the security module actually configures a firewall. Managing all three tiers is annoying but I don't see a way around it for a tool that writes to `/etc/ssh/sshd_config`.
The baseline module's `IsInstalled()` check taught me you can't assume `timedatectl` exists. Docker containers don't run systemd, so you fall back to reading `/etc/timezone` directly. The same dual-path pattern shows up in a few modules. Every module that touches systemd has this "am I in a container?" branch that I didn't expect to need.
The error continuation strategy is something I'm still not sure I got right. If the security module fails, should the runner keep going and install Docker anyway? Right now it collects errors and reports them all at the end. The argument for stopping is that running an unsecured server with Docker exposed is worse than stopping. The argument for continuing is that partial provisioning is more useful than nothing. I went with continue-on-error, but it's a reasonable debate.
Module registration in main.go is manually calling r.RegisterModule() for all 14 modules. Go's init() pattern could auto-register them, but I preferred the explicit approach where you see exactly what's loaded in one place. It does mean every new module requires touching main.go. Minor annoyance for the clarity it gives you.