25% was too tight — firmware OTA writes to /opt/betterframe/kiosk/
on the rootfs and fills it. 50% headroom gives enough space for
the kiosk binary download + swap.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- PARTUUID in cmdline.txt and fstab must be lowercase (initramfs
does case-sensitive match)
- BF_DATA partition now formatted as ext4 with label (was zeroed)
- Remove resize flag from cmdline.txt (breaks GPT layout)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Desktop purge was using wildcard lx* which removed libwlroots
and cage as dependency. Now uses specific package names +
apt-mark manual cage to protect it from autoremove.
2. Per-user cursor theme for bfkiosk (~/.icons/default/index.theme).
3. Repartition disables auto_initramfs in config.txt (initramfs
cant resolve LABEL= roots). Also handles root=/dev/* format
in cmdline.txt sed replacement.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pi 5 bootloader needs config.txt on partition 1. Old layout had
BF_BOOTSEL there with only autoboot.txt. Now 5 partitions:
BF_BOOT_A(1), BF_BOOT_B(2), BF_ROOT_A(3), BF_ROOT_B(4), BF_DATA(5).
autoboot.txt on each boot partition for A/B tryboot switching.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pi-gen username/password config triggers firstboot wizard AFTER
custom stages — reinstalls userconf and undoes our purge. Removed
those params from pi-gen-action config. Now create bfadmin user
directly in chroot script with password expiry on first login.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Piwiz still appeared after package purge alone. Now:
1. Purge ALL desktop packages (lxde, labwc, wayfire, lightdm, xorg)
2. autoremove orphaned deps
3. Force systemd.unit=multi-user.target in kernel cmdline
No desktop = no piwiz. Period.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Compose postgres uses POSTGRES_DB from BF_PG_DB. Dockerfile ARG
was BF_PG_DATABASE causing mismatch. PG error 3D000 db not found.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Package name changed from @betterframe/server to betterframe to
match BSB plugin path. Added nodered/package.json COPY so npm ci
can resolve the workspace dependency graph.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Base image: betterweb/service-base:9 (was :node which is v8)
- Plugin path: /home/bsb/node_modules/betterframe/ (flat pkg name)
- sec-config template: added package: betterframe to all 4 services
so BSB resolves plugins from the correct npm package
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
/mnt/bsb-plugins is declared as VOLUME in BSB base image. Build-time
COPY gets shadowed by empty anonymous volume at runtime. Use /home/bsb/
node_modules which BSB also searches and isn't a VOLUME.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
BSB container mode searches /mnt/bsb-plugins/node_modules/ for
plugins. Moved built output from /home/bsb to the correct external
plugin path at /mnt/bsb-plugins/node_modules/@betterframe/server/.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
BSB needs BSB_LIVE=true for production mode. Without it, warns about
non-production and tries to write sec-config.yaml (which is read-only).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
BSB entrypoint at /root/entrypoint.sh runs as root and drops
privileges itself. Our USER node blocked access to entrypoint.
Removed USER root/node, use absolute COPY paths, let BSB own
the user lifecycle.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
sec-config.yaml is now generated at Docker build time from
sec-config.template.yaml via envsubst. Secrets come from Coolify
build args (set in UI, never in git). Template uses ${VAR:-default}
placeholders — safe to commit to public repo.
- sec-config.yaml removed from git, added to .gitignore
- sec-config.template.yaml added (public, no secrets)
- Dockerfile.server: ARGs for all config, envsubst generates config
at build time, result is chmod 444 (read-only)
- Coolify compose: removed sec-config volume mount (baked in now)
- For native installs: copy template to sec-config.yaml, fill values
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolved coolify compose conflict — took remote bind mount pattern.
All paths now use /home/bsb (BSB container workdir, not /app).
Both compose files use bind mount for sec-config.yaml.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Dockerfile.server now uses betterweb/service-base:node as runtime
base instead of node:24-trixie-slim + manual bsb-plugin-cli. BSB
container handles entrypoint, user, plugin loading.
sec-config.yaml removed from Docker image — must be bind-mounted
at /app/sec-config.yaml. Both compose files updated with :ro mount.
All BF_* env vars removed from compose server service.
deploy/docker/sec-config.yaml deleted (was baked in, now mounted).
version.ts path updated for new workdir /app.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add cloud_accounts table to PostgreSQL tenant migrations (was only
in SQLite).
- Artifact cleanup now skips releases referenced by active/queued/paused
rollouts (CASCADE would delete the rollout).
- Add invisible cursor theme install to setup-pi-kiosk.sh (was only
in pi-gen image build).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cursor: install theme as index.theme (XCursor spec) not just
cursor.theme. Add WLR_XCURSOR_THEME env var for wlroots compat.
Piwiz: broader purge (rpi-first-boot-wizard, raspi-config triggers,
profile.d scripts, firstrun.sh). Mark first-boot done via userconf
marker file.
Migration: add encrypt_key_encrypted, cloud_accounts, and ONVIF event
columns to catch-all backfill so PRAGMA user_version skips can't miss
them.
Artifact cleanup: delete yanked firmware/OS files + prune to 5 most
recent per channel. Runs every 6h. Stops disk from filling up.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previous generator packed 5 fields in the image chunk header but Xcursor
format needs 9 (header_size, type, nominal, version, w, h, xhot, yhot,
delay). Missing version field → malformed → wlroots ignored it → fell
back to default visible cursor. Now writes correct 68-byte Xcursor with
all 9 header fields. Added more cursor names (x_cursor, pirate, sides).
Also: terminal UI shows bash-style cwd$ prompt, separates command from
output visually, auto-detects pwd after each command for prompt update.
The kiosk runs under NoNewPrivileges=yes (WebKit bwrap needs it). sudo
and nsenter both fail because they need privilege escalation which the
flag blocks. systemd-run --pipe spawns a SEPARATE service unit as root
in its own process tree, connected via stdin/stdout pipe. Not a child
of the kiosk process → NoNewPrivileges doesn't apply.
Also: enable rauc.service in pi-gen chroot (was never enabled → RAUC
daemon not running → rauc install fails → OS update silently broken).
Terminal spawns bash as bfkiosk (unprivileged) → can't read journal,
can't run rauc/systemctl, can't fix anything useful. Now runs
sudo bash --login (with fallback to plain bash if sudo unavailable).
Journal streaming: sudo journalctl instead of plain journalctl so
bfkiosk can read system journal without systemd-journal group.
Pi-gen image: drops /etc/sudoers.d/betterframe-kiosk granting bfkiosk
passwordless sudo. Gated by the on-screen code + lockout ladder, so
root access still requires physical presence.
Root cause: kiosk never stored cluster_key from pairing response.
Bundle ships onvif_password_encrypted (AES-256-GCM with cluster key).
decrypt_cluster was a stub returning None → empty password → WSSE auth
fails → CreatePullPoint rejected → no events ever.
Fix:
1. ClaimResp now includes cluster_key field
2. Stored encrypted at rest alongside kiosk_key (at_rest.rs)
3. Loaded at bundle render, passed to onvif_events::start()
4. decrypt_cluster implements full AES-256-GCM: parse v1.<iv>.<tag>.<ct>
format, base64url decode, decrypt with cluster key
Also: removed BF_ENABLE_ONVIF_EVENTS env gate — if camera is type=onvif
with onvif_host, subscribe. Gate was redundant with the type filter.
Also: bump Angie proxy_read_timeout to 600s on /api/admin/ for OS
bundle import (downloads ~1GB from GitHub, was timing out at 60s).
NOTE: existing paired kiosks won't have cluster_key stored. They need
to re-pair (delete + re-add) to receive it. New pairings get it
automatically.
1. Transparent cursor theme: 1x1 pixel Xcursor for every shape, set as
system default via XCURSOR_THEME=betterframe-empty. Nuclear fix for
Pi 5 GPU ignoring XCURSOR_SIZE.
2. Full VT lockdown: mask ALL gettys (tty1-6 + templates), logind
NAutoVTs=0 + ReserveVT=0, mask emergency/rescue targets. Ctrl+Alt+Fx
reaches nothing. No login screen ever.
3. Auto-reboot: FailureAction=reboot-force + StartLimitAction=reboot-force
on kiosk unit. If cage/app can't stay running → system reboots rather
than showing a blank screen or login prompt.
4. Purge ALL Pi setup wizards: piwiz, userconf-pi, rpi-first-boot-wizard,
initial-setup, pi-greeter, rpd-plym-splash. Nuke autostart files,
mask systemd units. "Configure your Raspberry" never shows.
Kiosk side (remote_debug.rs + ws_client.rs refactor):
- Journal streaming: server sends journal-start → kiosk spawns
journalctl -f, pipes lines back as journal-line messages via WS.
journal-stop kills the process. On-demand, not always-on.
- Terminal: server sends terminal-request → kiosk checks lockout +
firmware_channel == "dev" → generates 8-char code displayed on
screen as fullscreen overlay (NOT logged) → server relays admin's
code via terminal-auth → kiosk validates with constant-time compare
→ on success spawns bash, relays I/O as base64 terminal-data.
- Lockout: 3 failed codes per boot → lockout_count++. 3 lockouts
(9 total failures) → permanent (reflash only). Reboot resets
attempt counter, not lockout counter. Successful pairing resets all.
- ws_client.rs rewritten with split reader/writer + tokio::select!
for multiplexing incoming WS messages with outbound journal/terminal
data from sync threads.
Server side (coordinator-ws + routes-admin):
- New admin debug WS endpoint: /ws/admin/debug/:kioskId. Authenticated
via admin API key (query param) or session cookie. Relays messages
bidirectionally between admin browser ↔ kiosk.
- Admin pages: /admin/kiosks/:id/logs (journal viewer with start/
stop/clear) and /admin/kiosks/:id/terminal (code entry + terminal
area). Both open in new tabs from the kiosk detail page.
- Angie proxy config updated with /ws/admin/debug/ location block.
Security:
- Terminal only on dev channel
- Code displayed physically on screen, never logged or stored server-side
- Lockout: 3/boot, 3 lockouts = permanent, pairing resets
- Kiosk responds "locked" without specifying which lockout triggered
Coolify doesn't include .git in Docker build context, causing build
failure. Revert to ARG-based version stamping: compose passes
BF_SERVER_VERSION from Coolify's SOURCE_COMMIT/COOLIFY_GIT_COMMIT
env vars as a build arg, Dockerfile writes it to .bf-version. Removed
git from builder apt install (no longer needed).
/opt/betterframe/kiosk/ now owned bfkiosk:bfkiosk so OTA can write
.new/.prev files. Marker path in Rust code aligned with rollback
script expectation (/var/lib/betterframe/kiosk/firmware-applying.json).
Coolify pulls from GitHub and runs docker compose build — no guaranteed
env vars like SOURCE_COMMIT. Previous approach relied on ARG/ENV
passthrough that silently defaulted to "dev".
Fix: install git in the builder stage, COPY .git into context, run
git describe --tags --always to derive the version, write it to
/app/server/.bf-version. version.ts reads this file as a fallback
between env vars and the "dev" literal.
Chain: BF_SERVER_VERSION env → BF_BUILD_VERSION env → .bf-version file
→ COOLIFY_GIT_COMMIT env → SOURCE_COMMIT env → "dev".
Also: fix .gitignore for rauc-signing/ (was under wrong path).
Kernel dropped to initramfs because root=LABEL=BF_ROOT_A in cmdline.txt
but the ext4 filesystem had no label set (pi-gen's default is unlabeled).
dd copies raw bytes — any label must be set on the standalone file BEFORE
writing into the output image.
Add e2label BF_ROOT_A on rootfs.ext4 + fatlabel BF_BOOT_A / BF_BOOT_B
on each bootfs copy after patching cmdline.txt but before dd.
losetup -fP partition scanning failed on CI runner ("failed to open
partition 1"). Rewrite to parse partition start/size from sfdisk -J
(JSON output) via jq, then dd with skip/seek at exact sector offsets.
Only uses losetup for individual file images (selector.vfat, rootfs,
bootfs) where partition scanning isn't needed.
Also: add jq to CI apt install, drop xz compression from -9 to -6
(faster, still ~85% ratio on rootfs), free source image earlier to
avoid disk exhaustion on runners with tight scratch.
- Bake @flowfuse/node-red-dashboard into Node-RED Docker image
- Fire-and-forget syncDashboardsFromNodered() on GET /admin/entities
so dashboard tabs appear without manual sync button click
New kiosk/src/onvif_events.rs: for each ONVIF camera in the bundle,
creates a PullPoint subscription, polls every 3s, parses
NotificationMessage XML into structured JSON (topic + source key/values
+ data key/values + timestamp), and POSTs to /api/kiosk/event with
source_type=onvif + camera_id.
Forwards ALL event topics: motion, ANPR (LicensePlateRecognition),
line crossing, intrusion, digital input, analytics, tamper — everything
the camera exposes. Node-RED sorts what matters.
Subscription lifecycle:
- CreatePullPointSubscription with 60s InitialTerminationTime
- Renew every 55s before timeout
- Unsubscribe on bundle change / shutdown
- Auto-resubscribe on pull/renew failure (30s backoff)
- Generation tracking via Weak<()> so old workers self-terminate
when start() is called with a new bundle
WSSE PasswordDigest auth for SOAP calls — same scheme the server's
onvif.ts uses. sha1 crate added.
BundleCamera extended with onvif_host/port/username/password_encrypted
fields (server already ships them; kiosk just wasn't deserializing).
Gated by BF_ENABLE_ONVIF_EVENTS=1. Enabled by default in the pi-gen
image env file.
TODO: cluster-key-based decryption of onvif_password_encrypted. For
now relies on the RTSP URI having plaintext credentials embedded (which
the ONVIF import path already ensures via rtspWithCredentials).
Two image-side hardening pieces both small enough to ship together.
deploy/nftables/nftables.conf — single ruleset installed at /etc/nftables.conf.
Default-drop input. Allowed: loopback, established/related, ratelimited
ICMP, kiosk local API :18090 from RFC1918 / RFC4193 / link-local sources
only. SSH stays gated by sshd-disabled (image build sets enable-ssh: 0
and 01-run-chroot masks it); the firewall rule for :22 is left commented
in for triage scenarios. Forward dropped. Output left wide open — kiosk
needs to dial out to arbitrary RTSP cameras + the BF server (which may
live on the public internet) without explicit allowlisting.
deploy/systemd/betterframe-firstboot.{service,sh} — runs once per device
before betterframe-kiosk starts. Generates a 24-char unambiguous-glyph
password, applies via chpasswd, stores at /etc/betterframe/admin-password
(0400 root), and prints a banner to tty1 so an HDMI-attached operator
can transcribe it during the boot window before cage takes over the
screen. Marker at /var/lib/betterframe/.firstboot-complete prevents
re-run on subsequent boots. Without this, every kiosk built from the
same image shipped with bfadmin:betterframe — a single password leak
compromises the entire fleet.
Future follow-up: post the rotated password (encrypted with cluster_key)
to the BF server via heartbeat so admin UI can surface it. Not in this
commit; the local file + tty banner are the only retrieval paths today.
Phase 2b. Bake the runtime side of RAUC into the curated image so a
freshly-flashed kiosk can accept .raucb bundles immediately:
- Add `rauc` + `dosfstools` to the apt package list.
- Drop deploy/rauc/system.conf to /etc/rauc/system.conf (already declares
the A/B slot layout that repartition-image.sh produces).
- Drop deploy/rauc/betterframe-rauc-boot.sh to
/usr/local/sbin/betterframe-rauc-boot.sh — the custom bootloader
backend that flips the BF_BOOTSEL autoboot.txt to point at the
freshly-installed slot via Pi 5 tryboot.
- Drop deploy/rauc/ca-cert.pem (operator-supplied, committed) to
/etc/rauc/keyring.pem so rauc can verify CMS signatures. If the cert
isn't committed yet, image build emits a workflow warning and the
kiosk image installs but refuses every bundle — image still flashes,
just no OS OTA until the cert is committed.
- Enable BF_ENABLE_OS_OTA=1 in /etc/default/betterframe-kiosk so the
kiosk Rust consumer actually polls for bundles. Set to 0 to pin OS
version for a specific kiosk.
mark-good was already wired (deploy/systemd/betterframe-rauc-mark-good.{service,sh}).
The kiosk's heartbeat loop also calls `rauc status mark-good` as a
belt+suspenders backup; both are idempotent.
Phase 2a of OS OTA: post-process pi-gen output into a RAUC-compatible
A/B layout. New deploy/rauc/repartition-image.sh:
- Decompresses the stock pi-gen 2-partition image
- Extracts bootfs (vfat) + rootfs (ext4) blobs
- Compacts rootfs with resize2fs -M and grows back with 25% headroom
- Patches /etc/fstab inside rootfs to use LABEL=BF_BOOT_A /
LABEL=BF_ROOT_A / LABEL=BF_DATA (slot-agnostic; RAUC re-labels per
slot on install)
- Stamps /etc/betterframe/{os-version,os-compatibility} for the kiosk's
os_update.rs to read at runtime
- Builds two bootfs copies, each with cmdline.txt root= rewritten to
the matching ROOT slot
- Lays out 6 GPT partitions: BF_BOOTSEL (autoboot.txt with tryboot
pointing at boot_partition=2 / [tryboot] boot_partition=3), BF_BOOT_A,
BF_BOOT_B, BF_ROOT_A (populated), BF_ROOT_B (empty, RAUC fills on
first install), BF_DATA
- Recompresses with xz -T0
build-bundle.sh now takes the already-extracted slot images so the
.raucb bundle re-uses the exact same blobs that ship inside the A/B
initial-flash image — no duplication, no drift.
CI wires the repartition step between pi-gen output and the GitHub
Release upload. Ships the A/B image (not the stock pi-gen one).
Also: bump Blacksmith binary builders from 2/4 vCPU to 8 vCPU each.
Image job stays on GitHub's ubuntu-24.04-arm (Blacksmith arm kernel
6.5 doesn't ship binfmt_misc as a loadable module, which pi-gen-action's
defensive modprobe step still requires).
What's still pending:
- In-image RAUC install (rauc package + drop system.conf + CA cert
at /etc/rauc/keyring.pem). Without this, the image boots A/B-laid-
out but rauc install commands have no daemon to talk to.
- Admin UI for OS releases + rollouts (task #4).
Phase 1 of the OS OTA pipeline. Three pieces:
scripts/gen-rauc-signing-keys.sh — one-shot helper that issues an
Ed25519 X.509 CA + signing cert pair. Operator runs locally, commits
the CA cert (for embedding in kiosk image at /etc/rauc/keyring.pem),
stores the signing pair as GitHub Actions secrets
(BF_RAUC_SIGNING_CERT + BF_RAUC_SIGNING_KEY), keeps the CA private
key offline. RAUC verifies bundles against the keyring in the image.
deploy/rauc/build-bundle.sh — takes the pi-gen .img.xz, parses its
partition table with sfdisk, dd-extracts bootfs (vfat) + rootfs
(ext4) into a staging dir, renders manifest.raucm.in with version
+ git sha, runs `rauc bundle --cert= --key=` to produce a signed
.raucb. Verifies the bundle round-trips with `rauc info`.
build.yml gains two gated steps:
- "Build RAUC bundle": runs only when both signing secrets are set,
uploads .raucb as a release asset alongside the .img.xz.
- "Auto-import OS bundle into BF server": POSTs the GH release asset
URL to ${BF_AUTOIMPORT_URL}/api/admin/os/import so the server
pulls + stores the bundle. Mirrors the kiosk-binary auto-import
flow that already worked.
Compatibility string is `betterframe-rpi5-aarch64` (matches the value
already declared in deploy/rauc/system.conf). Channel passed through
from inputs (dev for master pushes, stable/beta for tags).
What's NOT in this commit:
- Pi image A/B partition layout (custom genimage / pi-gen patch)
- rauc package install + keyring drop in pi-gen stage
- Kiosk-side os_update.rs Rust consumer that polls /api/kiosk/os/check
- Admin UI for releases + rollouts
A bundle built today reaches /api/admin/os/import on the server but
isn't installable yet — kiosks have no consumer and no A/B layout.
That's the next 3 phases. Bundle production needs to be solid first
so the kiosk side can be tested against real artifacts.
BF_ENABLE_APP_OTA gates the periodic firmware check in ui.rs:449.
Our curated image is the place where it should be on — operator can
flip to 0 in /etc/default/betterframe-kiosk to pin a specific build.
Without this, a fresh image never picked up new dev releases even when
the auto-import secrets were configured.
WebView showed a grey/black half-square in the top-left — that's GDK's
"none" cursor rendered by WebKit's own surface, which ignores the
GTK-window CSS we set elsewhere. Inject a WebKit UserStyleSheet that
applies cursor:none !important to every page + frame at User priority,
overriding page-author CSS.
For the boot gap (cage start → first kiosk frame), set XCURSOR_SIZE=1
and WLR_NO_HARDWARE_CURSORS=1 in the systemd unit. SW fallback honors
the 1-pixel size; HW cursors don't, which is why a default arrow leaks
through on some Pi GPUs.
Kiosk's pipeline.rs needs the gtk4paintablesink element to bridge
GStreamer → GTK4 Picture. Plugin ships in Debian package
gstreamer1.0-gtk4 (separate from plugins-base/good/bad). Without it
ElementFactory::make returns Err → create_camera_pipeline returns None
→ ensure_warm returns None → every camera cell renders
"(no stream)". Add to both the pi-gen image stage and the
setup-pi-kiosk.sh installer.