Host-side tonistiigi/binfmt registration doesn't propagate into the
pi-gen-action's nested Docker container's view of /proc/sys/fs/binfmt_misc.
usimd/pi-gen-action's extra-host-dependencies input runs apt-get inside
the pi-gen container before pi-gen launches — install qemu-user-static
+ binfmt-support there so the chroot's arm64 binaries can execute.
apt's qemu-user-static + update-binfmts produces a registration that
pi-gen's nested Docker container still couldn't see. Switch to the
canonical tonistiigi/binfmt approach: privileged container that
installs QEMU statically with the F (fix-binary) flag, so the kernel
opens the qemu-aarch64-static binary at registration time and uses it
for all subsequent arm64 execs — independent of which container the
exec happens in.
Plus diagnostic: ls /proc/sys/fs/binfmt_misc + cat qemu-aarch64
detail, so next run's log surfaces whether registration actually
landed.
docker/setup-qemu-action registers binfmt via a privileged side container;
pi-gen-action's own nested Docker container doesn't inherit the
registration. Result: arm64 ELFs in the pi-gen chroot still fail to
exec, exit 1 before any stage runs.
apt-installed qemu-user-static + binfmt-support writes persistent
binfmt_misc entries to the kernel that propagate to every container
share. Pair with update-binfmts --enable qemu-aarch64 and a sanity
ls -la /proc/sys/fs/binfmt_misc/qemu-aarch64.
Real cause of last pi-gen failure was surfaced by verbose-output:
WARNING: Only a native build environment is supported.
arm64: not supported on this machine/kernel
ubuntu-latest is x86_64; pi-gen builds an arm64 image and chroots into
it during stages, requiring binfmt_misc handlers for arm64. Add
docker/setup-qemu-action before the pi-gen step.
While here, audit + bump every action version (pinned to current
majors):
actions/checkout v4 → v6
actions/upload-artifact v4 → v7
actions/download-artifact v4 → v8
softprops/action-gh-release v2 → v3
docker/setup-qemu-action @v4 (new)
usimd/pi-gen-action @v1 (already current major)
dtolnay/rust-toolchain @stable (rolling channel — keep)
Reverts misdiagnosis. pi-gen defaults to trixie since the Debian 13
release, which has gtk4 4.14 + libwebkitgtk-6.0 stock — no backports
needed. Build container, kiosk gtk feature gate, and pi-gen target all
realigned to trixie.
Actual reason last image run failed: our custom stage was missing the
mandatory prerun.sh (pi-gen calls it to seed ROOTFS_DIR from the
previous stage) and the EXPORT_IMAGE marker file (signals 'bake an
image at the end of this stage'). Both added.
Asset upload now globs deploy/*.img.xz so any extra exports stage2
produces ship alongside our customised one.
Repo is public → unlimited Actions minutes, so the 30-60 min pi-gen
bake doesn't have a cost gate. Master pushes now produce the full
asset set (binaries + image), same as tag releases.
Replaces release-kiosk.yml + release-image.yml with two coupled workflows:
release.yml — entrypoint. Computes version/channel/tag:
- master push → semver patch bump from latest stable tag, append
-dev.<shortsha>, create lightweight tag + prerelease record
- v* tag push → use tag verbatim, channel from suffix (-beta./-dev. or
stable), create release if missing
Then invokes build.yml via uses: ./.github/workflows/build.yml.
build.yml — reusable (workflow_call). Single source of truth for asset
production:
- kiosk binary matrix (aarch64, x86_64) in debian:trixie-slim
- flashable .img.xz via pi-gen using the aarch64 artifact (gated by
build-image input; master pushes default false to keep dev cycles
fast, tag pushes default true for a full release)
Both jobs attach to the release at tag_name=${{ inputs.tag }}.
Concurrency: master-branch runs cancel superseded peers; tag runs never
cancel. CI auto-import to a running BF server (BF_AUTOIMPORT_URL +
BF_AUTOIMPORT_API_KEY repo secrets) still wired.
New workflow .github/workflows/release-image.yml takes a tagged kiosk
release binary, layers it onto Raspberry Pi OS Trixie Lite via a custom
pi-gen stage, and publishes the resulting .img.xz back to the GitHub
Release.
Custom stage deploy/pi-gen/stage-betterframe-client/:
- 00-install-packages: cage, seatd, plymouth, gtk4 runtime, gstreamer,
libwebkitgtk-6.0, wlr-randr, ca-certificates
- 01-install-kiosk: drops the prebuilt kiosk binary, systemd unit,
cage PAM stack, firmware-rollback hook, plymouth theme. Creates
bfkiosk user, sets multi-user.target, masks all display managers,
purges piwiz, edits cmdline/config for the BF splash. Mirrors
setup-pi-kiosk.sh but baked into the image.
End state: rpi-imager → SD → boot → pairing screen on the HDMI display,
no operator setup steps. Kiosk auto-discovers server via discover_server()
(localhost → mDNS → frame-eu.betterportal.net).
Heavy build (~30-60 min on GH-hosted Ubuntu) so tag-push triggered, not
master. Workflow_dispatch also supports baking an existing release tag's
binary into a fresh image without re-tagging.
- WorkerMsg made pub + re-exported at crate root so local_server can send
through the UI channel.
- ed25519_dalek::pkcs8::DecodePublicKey trait import — needed for
VerifyingKey::from_public_key_pem call site.
- Workflow: pushes to master now auto-trigger a dev-channel build (in
addition to tag-pushes for stable/beta). Concurrency group cancels
superseded master builds; tag builds never cancel each other.
Pi OS Bookworm + Debian bookworm both ship libgtk-4 4.8.3. No code in
the kiosk uses 4.12+ APIs (compute_bounds, WidgetPaintable, Picture,
add_tick_callback, Fixed, set_content_fit are all <= 4.8). Swap
gtk4 feature v4_12 → v4_8 and drop the bookworm-backports juggling
in CI.
- server Dockerfile installs wget — bookworm-slim doesn't include it
by default, so the healthcheck CMD silently failed → Coolify marked
the container unhealthy.
- nodered healthcheck swapped to /nrdp/ (always 200 when runtime up)
via wget --spider; previous /nrdp/auth/login returned non-2xx when
adminAuth disabled.
- start_period bumped to 90s for nodered's flow load on smaller hosts.
- Kiosk discovery: cloud fallback now frame-eu.betterportal.net per
the managed-fleet endpoint.
Previous deploy left /data/settings.js as a DIRECTORY (Docker auto-mkdir
from a failed bind mount earlier). cp from non-root user then failed
'Permission denied' writing inside it.
Entrypoint now:
- Detects + rm -rf the stale directory
- Seeds /data/settings.js from /usr/src/bf-settings.js
- Chowns /data to node-red
- exec su-exec node-red:node-red to drop privileges before npm start
The /data named volume hides anything Dockerfile COPYs into /data, so
the previous CMD override pointing at /usr/src/bf-settings.js didn't
help — Node-RED's launch script still looks for /data/settings.js by
default, which doesn't exist after the volume overlays.
Solution: entrypoint wrapper copies /usr/src/bf-settings.js to
/data/settings.js on first boot when missing, then exec's npm start.
Subsequent boots keep the user-edited version in the volume.
Adds aggressive normalisation to tryParsePrivateKey:
- Strip UTF-8 BOM
- Replace smart quotes (" " ' ') with ASCII
- Strip multiple layers of wrapping quotes
- Combine escape-unfold with quote-strip (env vars that quote AND escape)
- Strip whitespace inside base64 candidate before decode
On parse failure, dumps length + head/tail samples + first-byte hex so
the operator can spot exactly what shape the env var arrived in.
Coolify / docker compose env injection routinely strips real newlines or
wraps in quotes, causing createPrivateKey to throw ERR_OSSL_UNSUPPORTED
and crashing the server before it can even start.
tryParsePrivateKey now attempts: literal, \n→LF, CRLF→LF, quote-stripped,
base64-decoded, and single-line PEM re-wrapped to 64-col. On total
failure, logs a clear warning and falls back to on-disk / generated key
instead of crashing.
Coolify deployments don't always carry the full source tree on disk
at the bind-mount source path. Mounting a missing file lets Docker
auto-create a directory at the target, which then fails to mount over
the file the image expects.
Fix: bake config files into the images themselves:
- Dockerfile.server COPYs deploy/docker/sec-config.yaml → /app/server/.
Env vars (BF_*) still override at runtime per env-overrides.ts.
- New Dockerfile.angie wraps nginx:alpine + baked betterframe.docker.conf.
- Dockerfile.nodered COPYs nodered-settings.js to /usr/src/bf-settings.js
(outside the /data volume) and uses --settings to point at it.
Compose drops the three bind mounts; volumes are now strictly
runtime state (DB + secrets, Node-RED flows). Users who want a
different sec-config still get full control via env overrides or
Coolify's Storage UI.
Coolify passes --project-directory <repo-root> so relative paths in
compose resolved from there, not from the compose file's directory.
context: ../.. then climbed to / and lstat /deploy failed.
Moving compose to repo root makes every relative path
project-dir-relative regardless of who's invoking compose. Local
'docker compose up' from repo root and Coolify's
--project-directory + -f both resolve identically.
Coolify users: update the resource's compose path to 'docker-compose.yml'
(was 'deploy/docker/docker-compose.yml'). Existing named volumes carry
over since the named: directive keeps them.
BF_DATA_VOLUME_NAME, NODERED_DATA_VOLUME_NAME, BF_HOST_PORT keep the
compose public while letting per-deployment specifics (host paths,
multiple staging/prod instances on one host, alternate edge ports)
land in Coolify's env tab. Defaults preserve current behaviour.
When the active layout switches, cells that exist in both old + new (same
camera, same URL, same HTML) now slide + scale from their old screen
position to the new one over 350ms (ease-out cubic). Fresh cells fade in;
removed cells fade out where they were.
Implementation:
- Each cell widget gets a stable widget_name (cam:<id>:<selector>,
web:<url>, html:<hash>) so old/new can be matched.
- Before swap, capture each cell's bounds + a WidgetPaintable snapshot.
- New grid wrapped in an Overlay; a Fixed ghost layer hosts the animated
Picture widgets driven by add_tick_callback + ease-out cubic.
- Once the window finishes the animation timer, the overlay is unwrapped
back to a plain grid so subsequent renders don't accumulate layers.
LICENSE.md states AGPL-3.0-only OR Commercial dual license (matches the
SPDX expression in every package.json + Cargo.toml). LICENSE-AGPL.txt
is the canonical FSF text. LICENSE-COMMERCIAL.md covers when a
commercial license is required and how to obtain one.
Kiosk now exposes :18090 with two surfaces:
- GET /local/layout/:id?key=<kiosk_local_key>
Bookmark-friendly layout switch on this kiosk. Auth = kiosk-generated
local key (32 random bytes, hex, stored at <state_dir>/local.key).
- ANY /proxy/* — forwards to BF server with the request's Authorization
header preserved. Lets LAN clients reach a cloud-hosted BF server via
the kiosk's local socket; kiosk adds no auth of its own.
Heartbeat reports {local_key, local_port}; kiosks table grows
local_key/local_port/local_last_ip columns. Admin kiosk edit page now
shows the local URLs as a copy-paste block.
Override port: BF_KIOSK_LOCAL_PORT. Disable: BF_KIOSK_LOCAL_DISABLE=1.
Adds an OS + dist upgrade step before the BetterFrame install logic so
re-running the script keeps the host current. Uses
--force-confdef --force-confold
so package maintainer scripts never block on prompts, and follows with
autoremove + autoclean. Kernel/libc updates set /var/run/reboot-required
which the existing REBOOT_NEEDED guard picks up → auto-reboot at end.
BF_SKIP_UPGRADE=1 bypasses the upgrade for fast iteration.
Web and HTML cells were rebuilt + reloaded on every layout switch,
losing JS state and incurring a full page load each time. Mirror the
camera pool: hold WebViews in WARM_WEBVIEWS keyed by URL (or hash of
inline HTML), reuse on switch-back, unparent + cool on switch-away,
drop after the same cooling timer. Identical content in two layouts
shares one WebView.
Three related fixes:
1. Idle reverts (and any other kiosk-initiated layout switch) now POST
layout.changed to /api/kiosk/event. Previously the server only emitted
on admin-initiated switches, so Node-RED never saw the idle revert.
2. Server's /api/kiosk/event splays the payload to the top level when
the topic has a dedicated trigger node (layout.changed, kiosk.changed,
kiosk.status, display.power.changed, camera.changed). The trigger
nodes expect flat shapes matching the admin emit; the old wrapped
shape left every field undefined.
3. Auto-provisioning of bf-server-config in Node-RED: extend retry
window to ~5 min, log per attempt, force v2 API + full-deploy header
so credentials inline get accepted, surface response body on failure.
Pool was keyed by camera_id, so a cell flipping M→S tore down the old
pipeline and started fresh. With (camera_id, badge) keys the main and
sub variants live alongside each other: switching badge promotes the
new one to Warm and leaves the previous one to cool down via the normal
state machine, so flipping back inside the cooldown is instant.
ensure_warm no longer touches sibling badge entries. recompute_global_
state computes warm/hot sets as (cam, badge) pairs by calling
pick_stream per cell with its area fraction, so the planner sees what
ensure_warm will actually create.
Two ergonomics fixes so one invocation does the right thing:
1. After git pull, re-exec the script if the installer itself changed
in the pull. Previously you'd need a second run to pick up new
logic. BF_REEXEC=1 guard prevents loops.
2. Track REBOOT_NEEDED when cmdline.txt / config.txt get edited or
/var/run/reboot-required appears (apt kernel/libc update). At end
of run, auto-reboot after a 10s cancellable window. Override with
BF_NO_REBOOT=1.
WebKitGTK launches bubblewrap for its web-content process; bwrap refuses
to run when the parent process still carries unexpected CAP_* bits ("but
not setuid, old file caps config?"). Setting CapabilityBoundingSet= +
AmbientCapabilities= empty and NoNewPrivileges=yes gives bwrap a clean
caps slate to drop from, so the sandbox initialises and web/dashboard
cells render instead of crashing the kiosk.