Host-side tonistiigi/binfmt registration doesn't propagate into the
pi-gen-action's nested Docker container's view of /proc/sys/fs/binfmt_misc.
usimd/pi-gen-action's extra-host-dependencies input runs apt-get inside
the pi-gen container before pi-gen launches — install qemu-user-static
+ binfmt-support there so the chroot's arm64 binaries can execute.
apt's qemu-user-static + update-binfmts produces a registration that
pi-gen's nested Docker container still couldn't see. Switch to the
canonical tonistiigi/binfmt approach: privileged container that
installs QEMU statically with the F (fix-binary) flag, so the kernel
opens the qemu-aarch64-static binary at registration time and uses it
for all subsequent arm64 execs — independent of which container the
exec happens in.
Plus diagnostic: ls /proc/sys/fs/binfmt_misc + cat qemu-aarch64
detail, so next run's log surfaces whether registration actually
landed.
docker/setup-qemu-action registers binfmt via a privileged side container;
pi-gen-action's own nested Docker container doesn't inherit the
registration. Result: arm64 ELFs in the pi-gen chroot still fail to
exec, exit 1 before any stage runs.
apt-installed qemu-user-static + binfmt-support writes persistent
binfmt_misc entries to the kernel that propagate to every container
share. Pair with update-binfmts --enable qemu-aarch64 and a sanity
ls -la /proc/sys/fs/binfmt_misc/qemu-aarch64.
Real cause of last pi-gen failure was surfaced by verbose-output:
WARNING: Only a native build environment is supported.
arm64: not supported on this machine/kernel
ubuntu-latest is x86_64; pi-gen builds an arm64 image and chroots into
it during stages, requiring binfmt_misc handlers for arm64. Add
docker/setup-qemu-action before the pi-gen step.
While here, audit + bump every action version (pinned to current
majors):
actions/checkout v4 → v6
actions/upload-artifact v4 → v7
actions/download-artifact v4 → v8
softprops/action-gh-release v2 → v3
docker/setup-qemu-action @v4 (new)
usimd/pi-gen-action @v1 (already current major)
dtolnay/rust-toolchain @stable (rolling channel — keep)
Reverts misdiagnosis. pi-gen defaults to trixie since the Debian 13
release, which has gtk4 4.14 + libwebkitgtk-6.0 stock — no backports
needed. Build container, kiosk gtk feature gate, and pi-gen target all
realigned to trixie.
Actual reason last image run failed: our custom stage was missing the
mandatory prerun.sh (pi-gen calls it to seed ROOTFS_DIR from the
previous stage) and the EXPORT_IMAGE marker file (signals 'bake an
image at the end of this stage'). Both added.
Asset upload now globs deploy/*.img.xz so any extra exports stage2
produces ship alongside our customised one.
Repo is public → unlimited Actions minutes, so the 30-60 min pi-gen
bake doesn't have a cost gate. Master pushes now produce the full
asset set (binaries + image), same as tag releases.
Replaces release-kiosk.yml + release-image.yml with two coupled workflows:
release.yml — entrypoint. Computes version/channel/tag:
- master push → semver patch bump from latest stable tag, append
-dev.<shortsha>, create lightweight tag + prerelease record
- v* tag push → use tag verbatim, channel from suffix (-beta./-dev. or
stable), create release if missing
Then invokes build.yml via uses: ./.github/workflows/build.yml.
build.yml — reusable (workflow_call). Single source of truth for asset
production:
- kiosk binary matrix (aarch64, x86_64) in debian:trixie-slim
- flashable .img.xz via pi-gen using the aarch64 artifact (gated by
build-image input; master pushes default false to keep dev cycles
fast, tag pushes default true for a full release)
Both jobs attach to the release at tag_name=${{ inputs.tag }}.
Concurrency: master-branch runs cancel superseded peers; tag runs never
cancel. CI auto-import to a running BF server (BF_AUTOIMPORT_URL +
BF_AUTOIMPORT_API_KEY repo secrets) still wired.
New workflow .github/workflows/release-image.yml takes a tagged kiosk
release binary, layers it onto Raspberry Pi OS Trixie Lite via a custom
pi-gen stage, and publishes the resulting .img.xz back to the GitHub
Release.
Custom stage deploy/pi-gen/stage-betterframe-client/:
- 00-install-packages: cage, seatd, plymouth, gtk4 runtime, gstreamer,
libwebkitgtk-6.0, wlr-randr, ca-certificates
- 01-install-kiosk: drops the prebuilt kiosk binary, systemd unit,
cage PAM stack, firmware-rollback hook, plymouth theme. Creates
bfkiosk user, sets multi-user.target, masks all display managers,
purges piwiz, edits cmdline/config for the BF splash. Mirrors
setup-pi-kiosk.sh but baked into the image.
End state: rpi-imager → SD → boot → pairing screen on the HDMI display,
no operator setup steps. Kiosk auto-discovers server via discover_server()
(localhost → mDNS → frame-eu.betterportal.net).
Heavy build (~30-60 min on GH-hosted Ubuntu) so tag-push triggered, not
master. Workflow_dispatch also supports baking an existing release tag's
binary into a fresh image without re-tagging.
- WorkerMsg made pub + re-exported at crate root so local_server can send
through the UI channel.
- ed25519_dalek::pkcs8::DecodePublicKey trait import — needed for
VerifyingKey::from_public_key_pem call site.
- Workflow: pushes to master now auto-trigger a dev-channel build (in
addition to tag-pushes for stable/beta). Concurrency group cancels
superseded master builds; tag builds never cancel each other.
Pi OS Bookworm + Debian bookworm both ship libgtk-4 4.8.3. No code in
the kiosk uses 4.12+ APIs (compute_bounds, WidgetPaintable, Picture,
add_tick_callback, Fixed, set_content_fit are all <= 4.8). Swap
gtk4 feature v4_12 → v4_8 and drop the bookworm-backports juggling
in CI.
- server Dockerfile installs wget — bookworm-slim doesn't include it
by default, so the healthcheck CMD silently failed → Coolify marked
the container unhealthy.
- nodered healthcheck swapped to /nrdp/ (always 200 when runtime up)
via wget --spider; previous /nrdp/auth/login returned non-2xx when
adminAuth disabled.
- start_period bumped to 90s for nodered's flow load on smaller hosts.
- Kiosk discovery: cloud fallback now frame-eu.betterportal.net per
the managed-fleet endpoint.
Previous deploy left /data/settings.js as a DIRECTORY (Docker auto-mkdir
from a failed bind mount earlier). cp from non-root user then failed
'Permission denied' writing inside it.
Entrypoint now:
- Detects + rm -rf the stale directory
- Seeds /data/settings.js from /usr/src/bf-settings.js
- Chowns /data to node-red
- exec su-exec node-red:node-red to drop privileges before npm start
The /data named volume hides anything Dockerfile COPYs into /data, so
the previous CMD override pointing at /usr/src/bf-settings.js didn't
help — Node-RED's launch script still looks for /data/settings.js by
default, which doesn't exist after the volume overlays.
Solution: entrypoint wrapper copies /usr/src/bf-settings.js to
/data/settings.js on first boot when missing, then exec's npm start.
Subsequent boots keep the user-edited version in the volume.
Adds aggressive normalisation to tryParsePrivateKey:
- Strip UTF-8 BOM
- Replace smart quotes (" " ' ') with ASCII
- Strip multiple layers of wrapping quotes
- Combine escape-unfold with quote-strip (env vars that quote AND escape)
- Strip whitespace inside base64 candidate before decode
On parse failure, dumps length + head/tail samples + first-byte hex so
the operator can spot exactly what shape the env var arrived in.
Coolify / docker compose env injection routinely strips real newlines or
wraps in quotes, causing createPrivateKey to throw ERR_OSSL_UNSUPPORTED
and crashing the server before it can even start.
tryParsePrivateKey now attempts: literal, \n→LF, CRLF→LF, quote-stripped,
base64-decoded, and single-line PEM re-wrapped to 64-col. On total
failure, logs a clear warning and falls back to on-disk / generated key
instead of crashing.
Coolify deployments don't always carry the full source tree on disk
at the bind-mount source path. Mounting a missing file lets Docker
auto-create a directory at the target, which then fails to mount over
the file the image expects.
Fix: bake config files into the images themselves:
- Dockerfile.server COPYs deploy/docker/sec-config.yaml → /app/server/.
Env vars (BF_*) still override at runtime per env-overrides.ts.
- New Dockerfile.angie wraps nginx:alpine + baked betterframe.docker.conf.
- Dockerfile.nodered COPYs nodered-settings.js to /usr/src/bf-settings.js
(outside the /data volume) and uses --settings to point at it.
Compose drops the three bind mounts; volumes are now strictly
runtime state (DB + secrets, Node-RED flows). Users who want a
different sec-config still get full control via env overrides or
Coolify's Storage UI.
Coolify passes --project-directory <repo-root> so relative paths in
compose resolved from there, not from the compose file's directory.
context: ../.. then climbed to / and lstat /deploy failed.
Moving compose to repo root makes every relative path
project-dir-relative regardless of who's invoking compose. Local
'docker compose up' from repo root and Coolify's
--project-directory + -f both resolve identically.
Coolify users: update the resource's compose path to 'docker-compose.yml'
(was 'deploy/docker/docker-compose.yml'). Existing named volumes carry
over since the named: directive keeps them.
BF_DATA_VOLUME_NAME, NODERED_DATA_VOLUME_NAME, BF_HOST_PORT keep the
compose public while letting per-deployment specifics (host paths,
multiple staging/prod instances on one host, alternate edge ports)
land in Coolify's env tab. Defaults preserve current behaviour.
When the active layout switches, cells that exist in both old + new (same
camera, same URL, same HTML) now slide + scale from their old screen
position to the new one over 350ms (ease-out cubic). Fresh cells fade in;
removed cells fade out where they were.
Implementation:
- Each cell widget gets a stable widget_name (cam:<id>:<selector>,
web:<url>, html:<hash>) so old/new can be matched.
- Before swap, capture each cell's bounds + a WidgetPaintable snapshot.
- New grid wrapped in an Overlay; a Fixed ghost layer hosts the animated
Picture widgets driven by add_tick_callback + ease-out cubic.
- Once the window finishes the animation timer, the overlay is unwrapped
back to a plain grid so subsequent renders don't accumulate layers.
LICENSE.md states AGPL-3.0-only OR Commercial dual license (matches the
SPDX expression in every package.json + Cargo.toml). LICENSE-AGPL.txt
is the canonical FSF text. LICENSE-COMMERCIAL.md covers when a
commercial license is required and how to obtain one.
Kiosk now exposes :18090 with two surfaces:
- GET /local/layout/:id?key=<kiosk_local_key>
Bookmark-friendly layout switch on this kiosk. Auth = kiosk-generated
local key (32 random bytes, hex, stored at <state_dir>/local.key).
- ANY /proxy/* — forwards to BF server with the request's Authorization
header preserved. Lets LAN clients reach a cloud-hosted BF server via
the kiosk's local socket; kiosk adds no auth of its own.
Heartbeat reports {local_key, local_port}; kiosks table grows
local_key/local_port/local_last_ip columns. Admin kiosk edit page now
shows the local URLs as a copy-paste block.
Override port: BF_KIOSK_LOCAL_PORT. Disable: BF_KIOSK_LOCAL_DISABLE=1.
Adds an OS + dist upgrade step before the BetterFrame install logic so
re-running the script keeps the host current. Uses
--force-confdef --force-confold
so package maintainer scripts never block on prompts, and follows with
autoremove + autoclean. Kernel/libc updates set /var/run/reboot-required
which the existing REBOOT_NEEDED guard picks up → auto-reboot at end.
BF_SKIP_UPGRADE=1 bypasses the upgrade for fast iteration.
Web and HTML cells were rebuilt + reloaded on every layout switch,
losing JS state and incurring a full page load each time. Mirror the
camera pool: hold WebViews in WARM_WEBVIEWS keyed by URL (or hash of
inline HTML), reuse on switch-back, unparent + cool on switch-away,
drop after the same cooling timer. Identical content in two layouts
shares one WebView.
Three related fixes:
1. Idle reverts (and any other kiosk-initiated layout switch) now POST
layout.changed to /api/kiosk/event. Previously the server only emitted
on admin-initiated switches, so Node-RED never saw the idle revert.
2. Server's /api/kiosk/event splays the payload to the top level when
the topic has a dedicated trigger node (layout.changed, kiosk.changed,
kiosk.status, display.power.changed, camera.changed). The trigger
nodes expect flat shapes matching the admin emit; the old wrapped
shape left every field undefined.
3. Auto-provisioning of bf-server-config in Node-RED: extend retry
window to ~5 min, log per attempt, force v2 API + full-deploy header
so credentials inline get accepted, surface response body on failure.
Pool was keyed by camera_id, so a cell flipping M→S tore down the old
pipeline and started fresh. With (camera_id, badge) keys the main and
sub variants live alongside each other: switching badge promotes the
new one to Warm and leaves the previous one to cool down via the normal
state machine, so flipping back inside the cooldown is instant.
ensure_warm no longer touches sibling badge entries. recompute_global_
state computes warm/hot sets as (cam, badge) pairs by calling
pick_stream per cell with its area fraction, so the planner sees what
ensure_warm will actually create.
Two ergonomics fixes so one invocation does the right thing:
1. After git pull, re-exec the script if the installer itself changed
in the pull. Previously you'd need a second run to pick up new
logic. BF_REEXEC=1 guard prevents loops.
2. Track REBOOT_NEEDED when cmdline.txt / config.txt get edited or
/var/run/reboot-required appears (apt kernel/libc update). At end
of run, auto-reboot after a 10s cancellable window. Override with
BF_NO_REBOOT=1.
WebKitGTK launches bubblewrap for its web-content process; bwrap refuses
to run when the parent process still carries unexpected CAP_* bits ("but
not setuid, old file caps config?"). Setting CapabilityBoundingSet= +
AmbientCapabilities= empty and NoNewPrivileges=yes gives bwrap a clean
caps slate to drop from, so the sandbox initialises and web/dashboard
cells render instead of crashing the kiosk.
Kiosk heartbeat reports local display positions so the server can sync physical outputs without consuming global display indices.
Migrate displays.index away from global uniqueness because display numbering is only meaningful within a kiosk.