Independent Embedded Systems Engineer & C++ Developer

Running a web application in production means more than getting it to compile. It means encrypted credentials that never touch disk in plaintext, a process supervisor that detects hangs — not just crashes — and an update path that doesn't require SSH and prayer. This post walks through deploying FFS on Fedora CoreOS: why CoreOS, how the deployment pipeline works, and the trade-offs involved.

Why Fedora CoreOS

CoreOS is an operating system designed for one job: running containers. There is no package manager in the traditional sense, no /usr/local/bin full of hand-compiled tools, no configuration drift. The entire system state is declared in a single Ignition file at install time. If a node dies, you provision a new one from the same file and get an identical machine.

This philosophy aligns well with a platform like FFS. The application runs inside a container image that carries its own Wt runtime, shared libraries, and compiled components. The host provides the kernel, the container runtime, and systemd — nothing else.

Advantages:

Immutable infrastructure eliminates "works on my machine" surprises. Every CoreOS node provisioned from the same Ignition file is identical: same firewall rules, same directory layout, same systemd units, same SSH keys. There is no accumulated state from months of dnf install and forgotten config tweaks.

Automatic updates keep the OS patched without operator intervention. Fedora CoreOS updates are image-based: rpm-ostree stages the new OS version as a separate deployment next to the running one and the node reboots into it, while the previous deployment stays on disk as a rollback target if the new one misbehaves. The application container is unaffected because it carries its own dependencies.

Minimal attack surface comes from the read-only /usr tree and the absence of a traditional package manager. There is no compiler, no curl | bash risk, no forgotten development headers. Mutable state is confined to /etc, /var, and the explicitly mounted data directories.

Trade-offs:

Debugging on a CoreOS node is harder than on a full Linux workstation. There is no gdb, no strace (without toolbox), no text editor beyond vi. When something goes wrong at 2 AM, you are reading journalctl output and thinking carefully, not attaching a debugger.

The Ignition-based provisioning model means configuration changes require either re-provisioning the node or manually editing systemd units — there is no apt-get install or dnf update for host-level changes. This is a feature, not a bug, but it demands discipline.

SELinux is enforcing by default. Container volume mounts interact with SELinux labels, and mixing rootless Podman with bind-mounted host directories can produce permission issues that are invisible until runtime. The :Z volume flag handles relabeling, but it applies a private label that locks out any other container sharing the directory (shared mounts need lowercase :z instead), and it does nothing about UID mapping, so containers with different UID namespaces can still surprise you with ownership mismatches.

The Deployment Pipeline

FFS deployment on CoreOS follows a three-stage pipeline: prepare, provision, deploy. Each stage has a dedicated script that encapsulates the complexity.

Stage 1 — Prepare (Developer Machine)

ffs-prepare.sh runs on the developer's workstation, not on the server. It reads the project configuration, collects SSH keys, SMTP credentials, and TLS certificate passwords, then generates two files: a Butane configuration (ffs.bu) and its compiled Ignition counterpart (ffs.ign).

$ bash deploy/coreos/ffs-prepare.sh --project serbest --name www

The Butane template uses placeholders (@PROJECT@, @INSTANCE_NAME@, @HTTPS_PORT@) that are filled from the project's .env file and interactive prompts. The result is a complete machine specification: systemd units, firewall rules, directory permissions, user accounts, and SSH authorized keys.
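The script's source is not shown here, so the following is only an illustration of the substitution step, written in C++ (the language of the application) rather than shell. The function name fill_placeholders and the sample values are assumptions; the real script may well use sed or similar tooling.

```cpp
#include <map>
#include <string>

// Replaces every @KEY@ token whose KEY appears in `values`.
// Unknown tokens are copied through unchanged so they stay easy to spot.
std::string fill_placeholders(const std::string& text,
                              const std::map<std::string, std::string>& values) {
    std::string out;
    std::size_t pos = 0;
    while (pos < text.size()) {
        std::size_t start = text.find('@', pos);
        std::size_t end = (start == std::string::npos)
                              ? std::string::npos
                              : text.find('@', start + 1);
        if (end == std::string::npos) {   // no complete @...@ pair left
            out.append(text, pos, std::string::npos);
            break;
        }
        auto it = values.find(text.substr(start + 1, end - start - 1));
        if (it != values.end()) {
            out.append(text, pos, start - pos);  // text before the token
            out += it->second;                   // substituted value
            pos = end + 1;
        } else {
            // Not a known placeholder: keep the first '@' and rescan after it.
            out.append(text, pos, start - pos + 1);
            pos = start + 1;
        }
    }
    return out;
}
```

Leaving unknown tokens in place is a deliberate choice in this sketch: a stray @SOMETHING@ in the generated ffs.bu is easier to notice than a silently empty field.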

The TLS certificate password is encrypted using systemd-creds — it is stored on disk in encrypted form and only decrypted at service startup into a temporary RAM-backed directory.

Stage 2 — Provision (CoreOS Install)

The generated ffs.ign is fed to the CoreOS installer. This is a one-time operation per machine (or per re-provision):

$ coreos-installer install /dev/sda --ignition-file ffs.ign

After reboot, the machine comes up with everything in place: the ffs system user, the project directory tree, the systemd service units, and the firewall configuration. No SSH login is needed for initial setup.

Stage 3 — Deploy (Image Transfer)

ffs-deploy.sh transfers the container image and project files to the CoreOS node. On first deployment, it also runs the server setup script and encrypts the TLS credential:

$ bash deploy/coreos/ffs-deploy.sh 192.168.1.10 \
    --name www --user core --first-deploy

Subsequent updates are simpler — just the image and changed project files:

$ bash deploy/coreos/ffs-deploy.sh 192.168.1.10 --name www

The deploy script handles image transfer via podman save / podman load, project file synchronization via rsync, and service restart via systemctl restart.

TLS Certificate Management

FFS uses encrypted TLS private keys. The private key is encrypted with AES-256-CBC during the build process and stored as server.key.enc. At container startup, the entrypoint script decrypts it into /tmp/secrets/ (a tmpfs mount — RAM only, never touches disk) using the CERT_PASS environment variable, which itself comes from a systemd encrypted credential.

Certificate Rotation

ffs-cert.sh provides certificate lifecycle management:

# View current certificate status (local, server, container)
$ bash deploy/coreos/ffs-cert.sh status 192.168.1.10 --name www

# Rotate HTTPS certificate
$ bash deploy/coreos/ffs-cert.sh rotate-https 192.168.1.10 --name www

# Rotate SSH access keys
$ bash deploy/coreos/ffs-cert.sh rotate-ssh 192.168.1.10 --name www

The status command shows certificate expiry dates and fingerprints, and warns when a certificate is within 30 days of expiration. rotate-https automatically backs up the existing certificate, deploys the new one, re-encrypts the credential, and restarts the service.
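The 30-day warning boils down to a date comparison. ffs-cert.sh is a shell script whose internals are not shown, so this C++ sketch only illustrates the rule; the function names and the default threshold parameter are assumptions.

```cpp
#include <chrono>

using SysClock = std::chrono::system_clock;

// Whole days between `now` and the certificate's not-after time,
// truncated toward zero (so less than 24 hours left counts as 0 days).
long days_until_expiry(SysClock::time_point not_after, SysClock::time_point now) {
    return std::chrono::duration_cast<std::chrono::hours>(not_after - now).count() / 24;
}

// True when the certificate expires within `threshold_days` or has
// already expired.
bool should_warn(SysClock::time_point not_after, SysClock::time_point now,
                 long threshold_days = 30) {
    return days_until_expiry(not_after, now) <= threshold_days;
}
```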

Process Supervision and Health Monitoring

One of the hardest lessons from early CoreOS deployments was discovering that a running process is not necessarily a healthy process. FFS originally relied on a simple fork-based watchdog: parent forks child, waits, restarts on crash. This catches segfaults and unhandled exceptions but not hangs — a blocked SMTP connection or a deadlocked event loop keeps the process alive while serving nothing.

The current supervision architecture uses pipe-based health probes:

parent (supervisor)
  └── child 0  ──pipe──→  parent reads 'H' heartbeats
  └── child N  ──pipe──→  (future workers)

The child writes a heartbeat byte to a pipe every 10 seconds. The parent poll()s all pipes — no heartbeat for three consecutive probe intervals means the child is hung, and the parent kills it with SIGTERM (with a 5-second grace period before SIGKILL) and restarts.
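A minimal sketch of the parent side of this probe, assuming one pipe per child and a per-child timestamp of the last heartbeat. The names Child, collect_heartbeats, and is_hung are illustrative and not taken from the FFS source, and the sketch does not check that the byte is actually 'H'.

```cpp
#include <chrono>
#include <cstddef>
#include <poll.h>
#include <unistd.h>
#include <vector>

using Clock = std::chrono::steady_clock;

struct Child {
    int read_fd;                  // read end of this child's heartbeat pipe
    Clock::time_point last_beat;  // when the last heartbeat byte arrived
};

// Polls every pipe for up to timeout_ms and stamps each child that wrote
// at least one byte. Returns how many children reported in.
int collect_heartbeats(std::vector<Child>& children, int timeout_ms) {
    std::vector<pollfd> fds;
    for (const Child& c : children) fds.push_back({c.read_fd, POLLIN, 0});

    int beats = 0;
    if (poll(fds.data(), fds.size(), timeout_ms) <= 0) return beats;  // timeout or error

    for (std::size_t i = 0; i < fds.size(); ++i) {
        if (!(fds[i].revents & POLLIN)) continue;
        char buf[64];
        if (read(children[i].read_fd, buf, sizeof buf) > 0) {  // drain queued bytes
            children[i].last_beat = Clock::now();
            ++beats;
        }
    }
    return beats;
}

// A child counts as hung once three probe intervals pass without a beat.
bool is_hung(const Child& c, Clock::time_point now, std::chrono::seconds interval) {
    return now - c.last_beat >= 3 * interval;
}
```

With a 10-second interval, is_hung turns true 30 seconds after the last byte arrived. At that point the supervisor would send SIGTERM and, if the child is still around after the grace period, SIGKILL.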

For systemd integration, the parent sends sd_notify("WATCHDOG=1") at each probe interval when all children are healthy. If any child is unhealthy, the notification is withheld, and once WatchdogSec elapses without one, systemd kills the entire container: a second layer of defense.
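The gating rule is small enough to sketch directly. Here watchdog_tick and the notify callback are hypothetical names; the callback stands in for whatever actually delivers the message to systemd.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Called once per probe interval. `healthy` holds one flag per child.
// Returns true when the watchdog ping was sent.
bool watchdog_tick(const std::vector<bool>& healthy,
                   const std::function<void(const char*)>& notify) {
    bool all_ok = std::all_of(healthy.begin(), healthy.end(),
                              [](bool ok) { return ok; });
    if (all_ok) notify("WATCHDOG=1");
    // Otherwise stay silent: systemd's timer keeps running and fires
    // once WatchdogSec has passed without a ping.
    return all_ok;
}
```

The important property is that silence is the failure signal. The supervisor never has to tell systemd that something is wrong; it only has to stop saying that everything is fine.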

The sd_notify implementation is self-contained: a direct sendmsg() to the NOTIFY_SOCKET Unix datagram socket, with no libsystemd dependency. This keeps the container image minimal and works on any base image.
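A sketch of what such a self-contained notifier can look like, using only POSIX socket calls. The function name notify_systemd is an assumption, and the actual FFS implementation may differ in its details; the '@' prefix handling follows systemd's convention for abstract socket addresses.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/un.h>
#include <unistd.h>

// Sends one state string (for example "READY=1" or "WATCHDOG=1") to the
// datagram socket named by $NOTIFY_SOCKET. Returns false when the variable
// is unset or the send fails, so the same binary also runs outside systemd.
bool notify_systemd(const std::string& state) {
    const char* path = std::getenv("NOTIFY_SOCKET");
    if (!path || !*path) return false;

    sockaddr_un addr{};
    addr.sun_family = AF_UNIX;
    std::size_t len = std::strlen(path);
    if (len >= sizeof addr.sun_path) return false;
    std::memcpy(addr.sun_path, path, len);
    if (addr.sun_path[0] == '@') addr.sun_path[0] = '\0';  // abstract socket

    int fd = socket(AF_UNIX, SOCK_DGRAM | SOCK_CLOEXEC, 0);
    if (fd < 0) return false;

    iovec iov{const_cast<char*>(state.data()), state.size()};
    msghdr msg{};
    msg.msg_name = &addr;
    msg.msg_namelen = static_cast<socklen_t>(offsetof(sockaddr_un, sun_path) + len);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    ssize_t sent = sendmsg(fd, &msg, MSG_NOSIGNAL);
    close(fd);
    return sent == static_cast<ssize_t>(state.size());
}
```

Because the function simply returns false when NOTIFY_SOCKET is absent, callers do not need a separate code path for running under Docker or in a debugger.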

Command-Line Flags

./ffs --sd-notify     # systemd integration (CoreOS production)
./ffs --watchdog      # standalone supervision (Docker, dev containers)
./ffs                 # bare process (GDB, IDE, valgrind)

Deploying Updates

After the initial deployment, updates follow a predictable workflow:

# 1. Build the new production image (developer machine, inside dev container)
ffscm rebuild production --project serbest --compiler intel

# 2. Deploy to the CoreOS node
bash deploy/coreos/ffs-deploy.sh 192.168.1.10 --name www

# 3. Verify
bash deploy/coreos/ffs-cert.sh status 192.168.1.10 --name www

The deploy script transfers only the changed image layers and project files. The service is restarted automatically. If the new version fails to start, quick failure detection applies: after three crashes within 5 seconds the watchdog gives up, and systemd's Restart=on-failure takes over with increasing back-off.
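One way to implement the "three crashes within 5 seconds" rule is a sliding window over crash timestamps. This is a sketch under that assumption; the class name CrashTracker is hypothetical, and the real watchdog may count crashes differently (for example, relative to each child's start time).

```cpp
#include <chrono>
#include <cstddef>
#include <deque>

using Clock = std::chrono::steady_clock;

class CrashTracker {
public:
    CrashTracker(std::size_t max_crashes, std::chrono::seconds window)
        : max_crashes_(max_crashes), window_(window) {}

    // Records a crash at `now` and returns true when the supervisor should
    // stop restarting: max_crashes_ crashes fell inside the window.
    bool record(Clock::time_point now) {
        crashes_.push_back(now);
        // Forget crashes that are older than the window.
        while (!crashes_.empty() && now - crashes_.front() > window_) {
            crashes_.pop_front();
        }
        return crashes_.size() >= max_crashes_;
    }

private:
    std::size_t max_crashes_;
    std::chrono::seconds window_;
    std::deque<Clock::time_point> crashes_;
};
```

When record returns true, the supervisor would exit with a failure status instead of forking again, which is what hands control to systemd's restart policy.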

Lessons Learned

systemd's WatchdogSec requires active participation. Setting WatchdogSec=120s without sending heartbeats causes systemd to kill your service every two minutes. FFS learned this the hard way — the server appeared to restart randomly until we correlated the restart interval with the watchdog timeout.
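systemd passes the configured timeout to the service in the WATCHDOG_USEC environment variable (in microseconds), so a supervisor can derive its ping interval instead of hard-coding it. The sketch below follows the common advice of pinging at half the timeout; the function name is an assumption, and FFS may simply use its fixed probe interval.

```cpp
#include <chrono>
#include <cstdlib>

// Returns half of the watchdog timeout exported by systemd, or zero when
// WATCHDOG_USEC is missing or malformed (watchdog not enabled).
std::chrono::microseconds watchdog_ping_interval() {
    const char* raw = std::getenv("WATCHDOG_USEC");
    if (!raw || !*raw) return std::chrono::microseconds::zero();

    char* end = nullptr;
    unsigned long long usec = std::strtoull(raw, &end, 10);
    if (*end != '\0' || usec == 0) return std::chrono::microseconds::zero();

    return std::chrono::microseconds(static_cast<long long>(usec / 2));
}
```

With WatchdogSec=120s this yields a 60-second interval, which leaves room for one late ping before systemd steps in.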

Shell expansion does not happen in systemd ExecStart. The credential pattern $(cat ${CREDENTIALS_DIRECTORY}/cert_pass) requires a /bin/sh -c wrapper. Without it, the literal string $(cat ...) is passed to the container, and the TLS key decryption fails silently.
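The underlying reason is that systemd executes the command directly instead of handing it to a shell, so nothing ever interprets $(...). The effect can be reproduced outside systemd with a plain execv call. In this sketch, run_capture is a hypothetical helper, and /bin/echo and /bin/sh are assumed to exist at those paths.

```cpp
#include <cstddef>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Runs argv[0] with the given arguments via execv (no shell involved)
// and returns everything the child wrote to stdout.
std::string run_capture(const std::vector<std::string>& argv) {
    std::vector<char*> args;
    for (const std::string& a : argv) args.push_back(const_cast<char*>(a.c_str()));
    args.push_back(nullptr);

    int out[2];
    if (pipe(out) != 0) return "";
    pid_t pid = fork();
    if (pid < 0) {
        close(out[0]);
        close(out[1]);
        return "";
    }
    if (pid == 0) {                     // child: stdout goes into the pipe
        dup2(out[1], STDOUT_FILENO);
        close(out[0]);
        close(out[1]);
        execv(args[0], args.data());
        _exit(127);                     // only reached if execv failed
    }
    close(out[1]);
    std::string result;
    char buf[256];
    ssize_t n;
    while ((n = read(out[0], buf, sizeof buf)) > 0) {
        result.append(buf, static_cast<std::size_t>(n));
    }
    close(out[0]);
    waitpid(pid, nullptr, 0);
    return result;
}
```

Calling run_capture({"/bin/echo", "$(echo hi)"}) hands echo the literal text, while run_capture({"/bin/sh", "-c", "echo $(echo hi)"}) lets the shell perform the substitution first. The /bin/sh -c wrapper in ExecStart plays the same role as the second call.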

What's Next

The current deployment model handles single-node production well. Future improvements include Let's Encrypt integration for automated certificate renewal, multi-worker support (the N-child watchdog infrastructure is already in place), and a deployment health check that verifies the HTTPS endpoint responds correctly before declaring the update successful.

CoreOS provides a solid foundation for running FFS in production. The immutability guarantees, automatic OS updates, and systemd integration make it a natural fit for a platform that values operational reliability. The trade-off — less flexibility, steeper debugging curve — is worth it for the confidence that every deployment is reproducible and every restart is supervised.

Deploying FFS on Fedora CoreOS — From Zero to Production