Coolify pulls from GitHub and runs docker compose build — no guaranteed
env vars like SOURCE_COMMIT. Previous approach relied on ARG/ENV
passthrough that silently defaulted to "dev".
Fix: install git in the builder stage, COPY .git into context, run
git describe --tags --always to derive the version, write it to
/app/server/.bf-version. version.ts reads this file as a fallback
between env vars and the "dev" literal.
Chain: BF_SERVER_VERSION env → BF_BUILD_VERSION env → .bf-version file
→ COOLIFY_GIT_COMMIT env → SOURCE_COMMIT env → "dev".
Also: fix .gitignore for rauc-signing/ (was under wrong path).
Kernel dropped to initramfs because root=LABEL=BF_ROOT_A in cmdline.txt
but the ext4 filesystem had no label set (pi-gen's default is unlabeled).
dd copies raw bytes — any label must be set on the standalone file BEFORE
writing into the output image.
Add e2label BF_ROOT_A on rootfs.ext4 + fatlabel BF_BOOT_A / BF_BOOT_B
on each bootfs copy after patching cmdline.txt but before dd.
losetup -fP partition scanning failed on CI runner ("failed to open
partition 1"). Rewrite to parse partition start/size from sfdisk -J
(JSON output) via jq, then dd with skip/seek at exact sector offsets.
Only uses losetup for individual file images (selector.vfat, rootfs,
bootfs) where partition scanning isn't needed.
Also: add jq to CI apt install, drop xz compression from -9 to -6
(faster, still ~85% ratio on rootfs), free source image earlier to
avoid disk exhaustion on runners with tight scratch.
- Bake @flowfuse/node-red-dashboard into Node-RED Docker image
- Fire-and-forget syncDashboardsFromNodered() on GET /admin/entities
so dashboard tabs appear without manual sync button click
New kiosk/src/onvif_events.rs: for each ONVIF camera in the bundle,
creates a PullPoint subscription, polls every 3s, parses
NotificationMessage XML into structured JSON (topic + source key/values
+ data key/values + timestamp), and POSTs to /api/kiosk/event with
source_type=onvif + camera_id.
Forwards ALL event topics: motion, ANPR (LicensePlateRecognition),
line crossing, intrusion, digital input, analytics, tamper — everything
the camera exposes. Node-RED sorts what matters.
Subscription lifecycle:
- CreatePullPointSubscription with 60s InitialTerminationTime
- Renew every 55s before timeout
- Unsubscribe on bundle change / shutdown
- Auto-resubscribe on pull/renew failure (30s backoff)
- Generation tracking via Weak<()> so old workers self-terminate
when start() is called with a new bundle
WSSE PasswordDigest auth for SOAP calls — same scheme the server's
onvif.ts uses. sha1 crate added.
BundleCamera extended with onvif_host/port/username/password_encrypted
fields (server already ships them; kiosk just wasn't deserializing).
Gated by BF_ENABLE_ONVIF_EVENTS=1. Enabled by default in the pi-gen
image env file.
TODO: cluster-key-based decryption of onvif_password_encrypted. For
now relies on the RTSP URI having plaintext credentials embedded (which
the ONVIF import path already ensures via rtspWithCredentials).
Two image-side hardening pieces both small enough to ship together.
deploy/nftables/nftables.conf — single ruleset installed at /etc/nftables.conf.
Default-drop input. Allowed: loopback, established/related, ratelimited
ICMP, kiosk local API :18090 from RFC1918 / RFC4193 / link-local sources
only. SSH stays gated by sshd-disabled (image build sets enable-ssh: 0
and 01-run-chroot masks it); the firewall rule for :22 is left commented
in for triage scenarios. Forward dropped. Output left wide open — kiosk
needs to dial out to arbitrary RTSP cameras + the BF server (which may
live on the public internet) without explicit allowlisting.
deploy/systemd/betterframe-firstboot.{service,sh} — runs once per device
before betterframe-kiosk starts. Generates a 24-char unambiguous-glyph
password, applies via chpasswd, stores at /etc/betterframe/admin-password
(0400 root), and prints a banner to tty1 so an HDMI-attached operator
can transcribe it during the boot window before cage takes over the
screen. Marker at /var/lib/betterframe/.firstboot-complete prevents
re-run on subsequent boots. Without this, every kiosk built from the
same image shipped with bfadmin:betterframe — a single password leak
compromises the entire fleet.
Future follow-up: post the rotated password (encrypted with cluster_key)
to the BF server via heartbeat so admin UI can surface it. Not in this
commit; the local file + tty banner are the only retrieval paths today.
Phase 2b. Bake the runtime side of RAUC into the curated image so a
freshly-flashed kiosk can accept .raucb bundles immediately:
- Add `rauc` + `dosfstools` to the apt package list.
- Drop deploy/rauc/system.conf to /etc/rauc/system.conf (already declares
the A/B slot layout that repartition-image.sh produces).
- Drop deploy/rauc/betterframe-rauc-boot.sh to
/usr/local/sbin/betterframe-rauc-boot.sh — the custom bootloader
backend that flips the BF_BOOTSEL autoboot.txt to point at the
freshly-installed slot via Pi 5 tryboot.
- Drop deploy/rauc/ca-cert.pem (operator-supplied, committed) to
/etc/rauc/keyring.pem so rauc can verify CMS signatures. If the cert
isn't committed yet, image build emits a workflow warning and the
kiosk image installs but refuses every bundle — image still flashes,
just no OS OTA until the cert is committed.
- Enable BF_ENABLE_OS_OTA=1 in /etc/default/betterframe-kiosk so the
kiosk Rust consumer actually polls for bundles. Set to 0 to pin OS
version for a specific kiosk.
mark-good was already wired (deploy/systemd/betterframe-rauc-mark-good.{service,sh}).
The kiosk's heartbeat loop also calls `rauc status mark-good` as a
belt+suspenders backup; both are idempotent.
Phase 2a of OS OTA: post-process pi-gen output into a RAUC-compatible
A/B layout. New deploy/rauc/repartition-image.sh:
- Decompresses the stock pi-gen 2-partition image
- Extracts bootfs (vfat) + rootfs (ext4) blobs
- Compacts rootfs with resize2fs -M and grows back with 25% headroom
- Patches /etc/fstab inside rootfs to use LABEL=BF_BOOT_A /
LABEL=BF_ROOT_A / LABEL=BF_DATA (slot-agnostic; RAUC re-labels per
slot on install)
- Stamps /etc/betterframe/{os-version,os-compatibility} for the kiosk's
os_update.rs to read at runtime
- Builds two bootfs copies, each with cmdline.txt root= rewritten to
the matching ROOT slot
- Lays out 6 GPT partitions: BF_BOOTSEL (autoboot.txt with tryboot
pointing at boot_partition=2 / [tryboot] boot_partition=3), BF_BOOT_A,
BF_BOOT_B, BF_ROOT_A (populated), BF_ROOT_B (empty, RAUC fills on
first install), BF_DATA
- Recompresses with xz -T0
build-bundle.sh now takes the already-extracted slot images so the
.raucb bundle re-uses the exact same blobs that ship inside the A/B
initial-flash image — no duplication, no drift.
CI wires the repartition step between pi-gen output and the GitHub
Release upload. Ships the A/B image (not the stock pi-gen one).
Also: bump Blacksmith binary builders from 2/4 vCPU to 8 vCPU each.
Image job stays on GitHub's ubuntu-24.04-arm (Blacksmith arm kernel
6.5 doesn't ship binfmt_misc as a loadable module, which pi-gen-action's
defensive modprobe step still requires).
What's still pending:
- In-image RAUC install (rauc package + drop system.conf + CA cert
at /etc/rauc/keyring.pem). Without this, the image boots A/B-laid-
out but rauc install commands have no daemon to talk to.
- Admin UI for OS releases + rollouts (task #4).
Phase 1 of the OS OTA pipeline. Three pieces:
scripts/gen-rauc-signing-keys.sh — one-shot helper that issues an
Ed25519 X.509 CA + signing cert pair. Operator runs locally, commits
the CA cert (for embedding in kiosk image at /etc/rauc/keyring.pem),
stores the signing pair as GitHub Actions secrets
(BF_RAUC_SIGNING_CERT + BF_RAUC_SIGNING_KEY), keeps the CA private
key offline. RAUC verifies bundles against the keyring in the image.
deploy/rauc/build-bundle.sh — takes the pi-gen .img.xz, parses its
partition table with sfdisk, dd-extracts bootfs (vfat) + rootfs
(ext4) into a staging dir, renders manifest.raucm.in with version
+ git sha, runs `rauc bundle --cert= --key=` to produce a signed
.raucb. Verifies the bundle round-trips with `rauc info`.
build.yml gains two gated steps:
- "Build RAUC bundle": runs only when both signing secrets are set,
uploads .raucb as a release asset alongside the .img.xz.
- "Auto-import OS bundle into BF server": POSTs the GH release asset
URL to ${BF_AUTOIMPORT_URL}/api/admin/os/import so the server
pulls + stores the bundle. Mirrors the kiosk-binary auto-import
flow that already worked.
Compatibility string is `betterframe-rpi5-aarch64` (matches the value
already declared in deploy/rauc/system.conf). Channel passed through
from inputs (dev for master pushes, stable/beta for tags).
What's NOT in this commit:
- Pi image A/B partition layout (custom genimage / pi-gen patch)
- rauc package install + keyring drop in pi-gen stage
- Kiosk-side os_update.rs Rust consumer that polls /api/kiosk/os/check
- Admin UI for releases + rollouts
A bundle built today reaches /api/admin/os/import on the server but
isn't installable yet — kiosks have no consumer and no A/B layout.
That's the next 3 phases. Bundle production needs to be solid first
so the kiosk side can be tested against real artifacts.
BF_ENABLE_APP_OTA gates the periodic firmware check in ui.rs:449.
Our curated image is the place where it should be on — operator can
flip to 0 in /etc/default/betterframe-kiosk to pin a specific build.
Without this, a fresh image never picked up new dev releases even when
the auto-import secrets were configured.
WebView showed a grey/black half-square in the top-left — that's GDK's
"none" cursor rendered by WebKit's own surface, which ignores the
GTK-window CSS we set elsewhere. Inject a WebKit UserStyleSheet that
applies cursor:none !important to every page + frame at User priority,
overriding page-author CSS.
For the boot gap (cage start → first kiosk frame), set XCURSOR_SIZE=1
and WLR_NO_HARDWARE_CURSORS=1 in the systemd unit. SW fallback honors
the 1-pixel size; HW cursors don't, which is why a default arrow leaks
through on some Pi GPUs.
Kiosk's pipeline.rs needs the gtk4paintablesink element to bridge
GStreamer → GTK4 Picture. Plugin ships in Debian package
gstreamer1.0-gtk4 (separate from plugins-base/good/bad). Without it
ElementFactory::make returns Err → create_camera_pipeline returns None
→ ensure_warm returns None → every camera cell renders
"(no stream)". Add to both the pi-gen image stage and the
setup-pi-kiosk.sh installer.
Pi-gen doesn't auto-copy a sub-stage's files/ dir into the chroot. The
chroot script's install commands were reaching for /tmp/bf-files/... which
never existed. Add a host-side 00-run.sh that bulk-copies files/* into
ROOTFS_DIR/tmp/bf-files, then rename the chroot script to 01-run-chroot.sh
so it sorts AFTER the host copy ('-' < '.' bites you otherwise).
Reverts misdiagnosis. pi-gen defaults to trixie since the Debian 13
release, which has gtk4 4.14 + libwebkitgtk-6.0 stock — no backports
needed. Build container, kiosk gtk feature gate, and pi-gen target all
realigned to trixie.
Actual reason last image run failed: our custom stage was missing the
mandatory prerun.sh (pi-gen calls it to seed ROOTFS_DIR from the
previous stage) and the EXPORT_IMAGE marker file (signals 'bake an
image at the end of this stage'). Both added.
Asset upload now globs deploy/*.img.xz so any extra exports stage2
produces ship alongside our customised one.
New workflow .github/workflows/release-image.yml takes a tagged kiosk
release binary, layers it onto Raspberry Pi OS Trixie Lite via a custom
pi-gen stage, and publishes the resulting .img.xz back to the GitHub
Release.
Custom stage deploy/pi-gen/stage-betterframe-client/:
- 00-install-packages: cage, seatd, plymouth, gtk4 runtime, gstreamer,
libwebkitgtk-6.0, wlr-randr, ca-certificates
- 01-install-kiosk: drops the prebuilt kiosk binary, systemd unit,
cage PAM stack, firmware-rollback hook, plymouth theme. Creates
bfkiosk user, sets multi-user.target, masks all display managers,
purges piwiz, edits cmdline/config for the BF splash. Mirrors
setup-pi-kiosk.sh but baked into the image.
End state: rpi-imager → SD → boot → pairing screen on the HDMI display,
no operator setup steps. Kiosk auto-discovers server via discover_server()
(localhost → mDNS → frame-eu.betterportal.net).
Heavy build (~30-60 min on GH-hosted Ubuntu) so tag-push triggered, not
master. Workflow_dispatch also supports baking an existing release tag's
binary into a fresh image without re-tagging.
- server Dockerfile installs wget — bookworm-slim doesn't include it
by default, so the healthcheck CMD silently failed → Coolify marked
the container unhealthy.
- nodered healthcheck swapped to /nrdp/ (always 200 when runtime up)
via wget --spider; previous /nrdp/auth/login returned non-2xx when
adminAuth disabled.
- start_period bumped to 90s for nodered's flow load on smaller hosts.
- Kiosk discovery: cloud fallback now frame-eu.betterportal.net per
the managed-fleet endpoint.
Previous deploy left /data/settings.js as a DIRECTORY (Docker auto-mkdir
from a failed bind mount earlier). cp from non-root user then failed
'Permission denied' writing inside it.
Entrypoint now:
- Detects + rm -rf the stale directory
- Seeds /data/settings.js from /usr/src/bf-settings.js
- Chowns /data to node-red
- exec su-exec node-red:node-red to drop privileges before npm start
The /data named volume hides anything Dockerfile COPYs into /data, so
the previous CMD override pointing at /usr/src/bf-settings.js didn't
help — Node-RED's launch script still looks for /data/settings.js by
default, which doesn't exist after the volume overlays.
Solution: entrypoint wrapper copies /usr/src/bf-settings.js to
/data/settings.js on first boot when missing, then exec's npm start.
Subsequent boots keep the user-edited version in the volume.
Coolify deployments don't always carry the full source tree on disk
at the bind-mount source path. Mounting a missing file lets Docker
auto-create a directory at the target, which then fails to mount over
the file the image expects.
Fix: bake config files into the images themselves:
- Dockerfile.server COPYs deploy/docker/sec-config.yaml → /app/server/.
Env vars (BF_*) still override at runtime per env-overrides.ts.
- New Dockerfile.angie wraps nginx:alpine + baked betterframe.docker.conf.
- Dockerfile.nodered COPYs nodered-settings.js to /usr/src/bf-settings.js
(outside the /data volume) and uses --settings to point at it.
Compose drops the three bind mounts; volumes are now strictly
runtime state (DB + secrets, Node-RED flows). Users who want a
different sec-config still get full control via env overrides or
Coolify's Storage UI.
Coolify passes --project-directory <repo-root> so relative paths in
compose resolved from there, not from the compose file's directory.
context: ../.. then climbed to / and lstat /deploy failed.
Moving compose to repo root makes every relative path
project-dir-relative regardless of who's invoking compose. Local
'docker compose up' from repo root and Coolify's
--project-directory + -f both resolve identically.
Coolify users: update the resource's compose path to 'docker-compose.yml'
(was 'deploy/docker/docker-compose.yml'). Existing named volumes carry
over since the named: directive keeps them.
BF_DATA_VOLUME_NAME, NODERED_DATA_VOLUME_NAME, BF_HOST_PORT keep the
compose public while letting per-deployment specifics (host paths,
multiple staging/prod instances on one host, alternate edge ports)
land in Coolify's env tab. Defaults preserve current behaviour.
Adds an OS + dist upgrade step before the BetterFrame install logic so
re-running the script keeps the host current. Uses
--force-confdef --force-confold
so package maintainer scripts never block on prompts, and follows with
autoremove + autoclean. Kernel/libc updates set /var/run/reboot-required
which the existing REBOOT_NEEDED guard picks up → auto-reboot at end.
BF_SKIP_UPGRADE=1 bypasses the upgrade for fast iteration.
Two ergonomics fixes so one invocation does the right thing:
1. After git pull, re-exec the script if the installer itself changed
in the pull. Previously you'd need a second run to pick up new
logic. BF_REEXEC=1 guard prevents loops.
2. Track REBOOT_NEEDED when cmdline.txt / config.txt get edited or
/var/run/reboot-required appears (apt kernel/libc update). At end
of run, auto-reboot after a 10s cancellable window. Override with
BF_NO_REBOOT=1.
WebKitGTK launches bubblewrap for its web-content process; bwrap refuses
to run when the parent process still carries unexpected CAP_* bits ("but
not setuid, old file caps config?"). Setting CapabilityBoundingSet= +
AmbientCapabilities= empty and NoNewPrivileges=yes gives bwrap a clean
caps slate to drop from, so the sandbox initialises and web/dashboard
cells render instead of crashing the kiosk.
Windows chmod doesn't propagate to git's mode bits, so the script
landed as 100644 (non-exec) and `./deploy/scripts/setup-pi-kiosk.sh`
gave "command not found" on the Pi. Update index to 100755 and add
.gitattributes to force LF on shell scripts / systemd units to head off
the related CRLF-shebang trap.
systemctl disable lets apt upgrades re-enable a DM. Mask everything
that could put a desktop on tty1. Purge piwiz + userconf-pi (the
"Welcome to Raspberry Pi" first-run wizard) and wipe /etc/motd +
/etc/update-motd.d so even on console nothing identifies as Pi.
systemd refuses to spawn the unit with code=216/GROUP when any group in
SupplementaryGroups= doesn't exist. Debian's seatd uses -g video — there
is no 'seat' group on the system. Removing it lets cage start; the video
group already covers seatd access.
Replace Pi rainbow + kernel boot text with a black BG + centered BF logo
during boot. Installer renders logo.png from the existing SVG asset via
rsvg-convert, drops a script-based plymouth theme, and appends the
quiet/splash flags to cmdline.txt + disable_splash=1 in config.txt.
cmdline.txt edits are idempotent: each flag only added if missing.
Expand setup-pi-kiosk.sh to be the one-and-only entry point: clones (or
git-pulls) the repo into the invoking user's home, installs Docker +
compose plugin + GTK/GStreamer/WebKit/cage/seatd + rustup (if missing),
brings up the docker-compose stack, builds the kiosk binary, and
provisions the bfkiosk user + cage PAM + systemd unit.
Every step is idempotent so re-running pulls latest, rebuilds, and
redeploys. SKIP_DOCKER / SKIP_KIOSK / SKIP_BUILD env flags let an
operator partition the work for kiosk-only or server-only hosts.
sudo -u <user> cargo fails when cargo lives in ~/.cargo/bin and root's
PATH doesn't carry it. Switch to sudo -u <user> -i sh -c so the user's
.profile / cargo env is sourced.