The pi-gen username/password config triggers the first-boot wizard AFTER
custom stages, which reinstalls userconf and undoes our purge. Removed
those params from the pi-gen-action config. The bfadmin user is now
created directly in the chroot script, with its password expired so it
must be changed on first login.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Transparent cursor theme: 1x1 pixel Xcursor for every shape, set as
system default via XCURSOR_THEME=betterframe-empty. Nuclear fix for
Pi 5 GPU ignoring XCURSOR_SIZE.
2. Full VT lockdown: mask ALL gettys (tty1-6 + templates), logind
NAutoVTs=0 + ReserveVT=0, mask emergency/rescue targets. Ctrl+Alt+Fx
reaches nothing. No login screen ever.
3. Auto-reboot: FailureAction=reboot-force + StartLimitAction=reboot-force
on kiosk unit. If cage/app can't stay running → system reboots rather
than showing a blank screen or login prompt.
4. Purge ALL Pi setup wizards: piwiz, userconf-pi, rpi-first-boot-wizard,
initial-setup, pi-greeter, rpd-plym-splash. Nuke autostart files,
mask systemd units. "Configure your Raspberry" never shows.
losetup -fP partition scanning failed on CI runner ("failed to open
partition 1"). Rewrite to parse partition start/size from sfdisk -J
(JSON output) via jq, then dd with skip/seek at exact sector offsets.
Only uses losetup for individual file images (selector.vfat, rootfs,
bootfs) where partition scanning isn't needed.
Also: add jq to CI apt install, drop xz compression from -9 to -6
(faster, still ~85% ratio on rootfs), free source image earlier to
avoid disk exhaustion on runners with tight scratch.
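A minimal sketch of the offset arithmetic described above, in Python rather than the jq + shell the script actually uses. The function name, the sample table, and the output filename are illustrative assumptions; the JSON shape follows what `sfdisk -J` emits (a `partitiontable` object with `partitions[].start` and `partitions[].size` in sectors).

```python
import json


def dd_extract_args(sfdisk_json: str, image: str, index: int, out: str) -> list[str]:
    """Build a dd argv that copies partition `index` (1-based) out of `image`."""
    table = json.loads(sfdisk_json)["partitiontable"]
    # Older sfdisk releases omit "sectorsize"; 512 is assumed in that case.
    sector = table.get("sectorsize", 512)
    part = table["partitions"][index - 1]
    return [
        "dd", f"if={image}", f"of={out}",
        f"bs={sector}",             # one dd block == one sector
        f"skip={part['start']}",    # sectors to skip in the source image
        f"count={part['size']}",    # sectors to copy
        "status=none",
    ]


# Hypothetical two-partition table, shaped like `sfdisk -J pi.img` output.
SAMPLE = json.dumps({
    "partitiontable": {
        "label": "dos",
        "unit": "sectors",
        "sectorsize": 512,
        "partitions": [
            {"node": "pi.img1", "start": 8192, "size": 1048576, "type": "c"},
            {"node": "pi.img2", "start": 1056768, "size": 4194304, "type": "83"},
        ],
    }
})

if __name__ == "__main__":
    print(" ".join(dd_extract_args(SAMPLE, "pi.img", 2, "rootfs.ext4")))
```

Because every offset comes straight from the partition table, nothing here depends on the kernel scanning partitions, which is the step that failed on the runner.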
`secrets` context isn't available in step-level `if:` expressions inside
a reusable (workflow_call) workflow. Move the secret-presence check to
job-level env (HAS_RAUC_SECRETS, HAS_AUTOIMPORT) and reference those in
step if:.
Two image-side hardening pieces, both small enough to ship together.
deploy/nftables/nftables.conf — single ruleset installed at /etc/nftables.conf.
Default-drop input. Allowed: loopback, established/related, rate-limited
ICMP, kiosk local API :18090 from RFC1918 / RFC4193 / link-local sources
only. SSH stays gated by sshd being disabled (image build sets enable-ssh: 0
and 01-run-chroot masks it); the firewall rule for :22 stays in the file,
commented out, for triage scenarios. Forward dropped. Output left wide open: kiosk
needs to dial out to arbitrary RTSP cameras + the BF server (which may
live on the public internet) without explicit allowlisting.
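To make the source restriction on :18090 concrete, here is a rough Python model of which client addresses the rule is meant to admit. It is not the nftables ruleset itself. The exact prefixes are assumptions read off the RFC names above, and including IPv4 link-local (169.254.0.0/16) next to fe80::/10 is a guess.

```python
import ipaddress

# Assumed prefixes for "RFC1918 / RFC4193 / link-local"; the real set lives
# in deploy/nftables/nftables.conf and may differ.
ALLOWED_SOURCES = [
    ipaddress.ip_network(net)
    for net in (
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC 1918
        "fc00::/7",                                        # RFC 4193
        "169.254.0.0/16", "fe80::/10",                     # link-local
    )
]


def api_source_allowed(addr: str) -> bool:
    """Return True if `addr` falls inside one of the allowed source ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip.version == net.version and ip in net for net in ALLOWED_SOURCES)


if __name__ == "__main__":
    for candidate in ("192.168.1.20", "8.8.8.8", "fd12:3456::1"):
        print(candidate, api_source_allowed(candidate))
```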
deploy/systemd/betterframe-firstboot.{service,sh} — runs once per device
before betterframe-kiosk starts. Generates a 24-char unambiguous-glyph
password, applies via chpasswd, stores at /etc/betterframe/admin-password
(0400 root), and prints a banner to tty1 so an HDMI-attached operator
can transcribe it during the boot window before cage takes over the
screen. Marker at /var/lib/betterframe/.firstboot-complete prevents
re-run on subsequent boots. Without this, every kiosk built from the
same image shipped with bfadmin:betterframe — a single password leak
compromises the entire fleet.
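A sketch of the password generation step, written in Python for illustration (the real generator is the shell script named above). The alphabet is an assumption about what "unambiguous-glyph" means: letters and digits with 0/O, 1/l/I and lowercase o removed.

```python
import secrets

# Hypothetical alphabet: 24 lowercase + 24 uppercase + 8 digits = 56 glyphs,
# skipping characters that are easy to misread off an HDMI screen.
ALPHABET = "abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ23456789"


def generate_admin_password(length: int = 24) -> str:
    """Draw `length` characters uniformly from ALPHABET using the OS CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(generate_admin_password())
```

With 56 glyphs, 24 characters carry roughly 139 bits of entropy, so dropping the ambiguous characters costs very little strength.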
Future follow-up: post the rotated password (encrypted with cluster_key)
to the BF server via heartbeat so admin UI can surface it. Not in this
commit; the local file + tty banner are the only retrieval paths today.
Phase 2b. Bake the runtime side of RAUC into the curated image so a
freshly-flashed kiosk can accept .raucb bundles immediately:
- Add `rauc` + `dosfstools` to the apt package list.
- Drop deploy/rauc/system.conf to /etc/rauc/system.conf (already declares
the A/B slot layout that repartition-image.sh produces).
- Drop deploy/rauc/betterframe-rauc-boot.sh to
/usr/local/sbin/betterframe-rauc-boot.sh — the custom bootloader
backend that flips the BF_BOOTSEL autoboot.txt to point at the
freshly-installed slot via Pi 5 tryboot.
- Drop deploy/rauc/ca-cert.pem (operator-supplied, committed) to
/etc/rauc/keyring.pem so rauc can verify CMS signatures. If the cert
isn't committed yet, image build emits a workflow warning and the
kiosk image installs but refuses every bundle — image still flashes,
just no OS OTA until the cert is committed.
- Enable BF_ENABLE_OS_OTA=1 in /etc/default/betterframe-kiosk so the
kiosk Rust consumer actually polls for bundles. Set to 0 to pin OS
version for a specific kiosk.
mark-good was already wired (deploy/systemd/betterframe-rauc-mark-good.{service,sh}).
The kiosk's heartbeat loop also calls `rauc status mark-good` as a
belt+suspenders backup; both are idempotent.
Phase 2a of OS OTA: post-process pi-gen output into a RAUC-compatible
A/B layout. New deploy/rauc/repartition-image.sh:
- Decompresses the stock pi-gen 2-partition image
- Extracts bootfs (vfat) + rootfs (ext4) blobs
- Compacts rootfs with resize2fs -M and grows back with 25% headroom
- Patches /etc/fstab inside rootfs to use LABEL=BF_BOOT_A /
LABEL=BF_ROOT_A / LABEL=BF_DATA (slot-agnostic; RAUC re-labels per
slot on install)
- Stamps /etc/betterframe/{os-version,os-compatibility} for the kiosk's
os_update.rs to read at runtime
- Builds two bootfs copies, each with cmdline.txt root= rewritten to
the matching ROOT slot
- Lays out 6 GPT partitions: BF_BOOTSEL (autoboot.txt with
boot_partition=2 by default and boot_partition=3 under [tryboot]),
BF_BOOT_A, BF_BOOT_B, BF_ROOT_A (populated), BF_ROOT_B (empty, RAUC
fills on first install), BF_DATA
- Recompresses with xz -T0
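The "compact, then grow back with 25% headroom" step comes down to a little arithmetic, sketched here in Python. The function name and the 4096-byte default block size are illustrative, and rounding up to a whole MiB is an added assumption, not something the script is known to do.

```python
MIB = 1024 * 1024


def grown_size_bytes(min_blocks: int, block_size: int = 4096, headroom_pct: int = 25) -> int:
    """Size to grow the filesystem back to after `resize2fs -M`.

    `min_blocks` is the block count of the compacted filesystem. The result
    adds `headroom_pct` percent and rounds up to a whole MiB.
    """
    minimal = min_blocks * block_size
    # Integer ceiling division keeps the arithmetic exact for large images.
    target = -(-(minimal * (100 + headroom_pct)) // 100)
    return -(-target // MIB) * MIB


if __name__ == "__main__":
    size = grown_size_bytes(500_000)
    print(size, "bytes =", size // MIB, "MiB")
```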
build-bundle.sh now takes the already-extracted slot images so the
.raucb bundle re-uses the exact same blobs that ship inside the A/B
initial-flash image — no duplication, no drift.
CI wires the repartition step between pi-gen output and the GitHub
Release upload. Ships the A/B image (not the stock pi-gen one).
Also: bump Blacksmith binary builders from 2/4 vCPU to 8 vCPU each.
Image job stays on GitHub's ubuntu-24.04-arm (Blacksmith arm kernel
6.5 doesn't ship binfmt_misc as a loadable module, which pi-gen-action's
defensive modprobe step still requires).
What's still pending:
- In-image RAUC install (rauc package + drop system.conf + CA cert
at /etc/rauc/keyring.pem). Without this, the image boots with the A/B
layout but `rauc install` commands have no daemon to talk to.
- Admin UI for OS releases + rollouts (task #4).
Phase 1 of the OS OTA pipeline. Three pieces:
scripts/gen-rauc-signing-keys.sh — one-shot helper that issues an
Ed25519 X.509 CA + signing cert pair. Operator runs locally, commits
the CA cert (for embedding in kiosk image at /etc/rauc/keyring.pem),
stores the signing pair as GitHub Actions secrets
(BF_RAUC_SIGNING_CERT + BF_RAUC_SIGNING_KEY), keeps the CA private
key offline. RAUC verifies bundles against the keyring in the image.
deploy/rauc/build-bundle.sh — takes the pi-gen .img.xz, parses its
partition table with sfdisk, dd-extracts bootfs (vfat) + rootfs
(ext4) into a staging dir, renders manifest.raucm.in with version
+ git sha, runs `rauc bundle --cert= --key=` to produce a signed
.raucb. Verifies the bundle round-trips with `rauc info`.
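A sketch of the manifest rendering step, in Python. The `[update]` keys (`compatible`, `version`, `build`) and the `[image.<slot-class>]` sections follow the RAUC manifest format. The `$name` placeholders, the slot class names `bootfs`/`rootfs`, the image filenames, and putting the git sha in `build` are all assumptions, since manifest.raucm.in is not shown here.

```python
from string import Template

# Hypothetical stand-in for manifest.raucm.in.
MANIFEST_TEMPLATE = Template(
    "[update]\n"
    "compatible=$compatible\n"
    "version=$version\n"
    "build=$git_sha\n"
    "\n"
    "[image.bootfs]\n"
    "filename=bootfs.vfat\n"
    "\n"
    "[image.rootfs]\n"
    "filename=rootfs.ext4\n"
)


def render_manifest(version: str, git_sha: str,
                    compatible: str = "betterframe-rpi5-aarch64") -> str:
    """Fill the template; substitute() raises KeyError on a missing field."""
    return MANIFEST_TEMPLATE.substitute(
        compatible=compatible, version=version, git_sha=git_sha
    )


if __name__ == "__main__":
    print(render_manifest("1.4.3-dev.abc1234", "abc1234"))
```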
build.yml gains two gated steps:
- "Build RAUC bundle": runs only when both signing secrets are set,
uploads .raucb as a release asset alongside the .img.xz.
- "Auto-import OS bundle into BF server": POSTs the GH release asset
URL to ${BF_AUTOIMPORT_URL}/api/admin/os/import so the server
pulls + stores the bundle. Mirrors the kiosk-binary auto-import
flow that already worked.
Compatibility string is `betterframe-rpi5-aarch64` (matches the value
already declared in deploy/rauc/system.conf). Channel passed through
from inputs (dev for master pushes, stable/beta for tags).
What's NOT in this commit:
- Pi image A/B partition layout (custom genimage / pi-gen patch)
- rauc package install + keyring drop in pi-gen stage
- Kiosk-side os_update.rs Rust consumer that polls /api/kiosk/os/check
- Admin UI for releases + rollouts
A bundle built today reaches /api/admin/os/import on the server but
isn't installable yet — kiosks have no consumer and no A/B layout.
That's the next 3 phases. Bundle production needs to be solid first
so the kiosk side can be tested against real artifacts.
Kiosks running our pre-built image (managed_image=true at pairing) can
have their hostname, timezone, network (DHCP/static + VLAN), and Wi-Fi
configured from the admin UI. Pull-model: server stores desired-state
JSON, kiosk heartbeat returns pending_config when version exceeds
applied_version, kiosk echoes applied_version back. Wi-Fi PSK is encrypted
with the cluster key, so the at-rest ciphertext ships to the kiosk as-is,
without per-kiosk re-encryption.
Server side only — kiosk Rust applier (betterframe-apply-config helper
+ rollback timer) and pair-initiate marker file are next.
ci(pi-gen): use action's image-path output for asset upload
pi-gen writes the .img.xz into pi-gen-action's own working dir, not our
repo deploy/. Glob never matched. Use steps.pigen.outputs.image-path
directly — no glob needed.
Pi-gen doesn't auto-copy a sub-stage's files/ dir into the chroot. The
chroot script's install commands were reaching for /tmp/bf-files/... which
never existed. Add a host-side 00-run.sh that bulk-copies files/* into
ROOTFS_DIR/tmp/bf-files, then rename the chroot script to 01-run-chroot.sh
so it sorts AFTER the host copy ('-' < '.' bites you otherwise).
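The sort-order trap is easy to reproduce. This Python snippet assumes plain byte-order (C-locale style) sorting of filenames, where '-' (0x2d) sorts before '.' (0x2e).

```python
# Same numeric prefix: the chroot script sorts BEFORE the host-side copy,
# because "00-run-" compares lower than "00-run." at the seventh character.
same_prefix = sorted(["00-run.sh", "00-run-chroot.sh"])

# After the rename the numeric prefix decides, so the host copy goes first.
renamed = sorted(["00-run.sh", "01-run-chroot.sh"])

if __name__ == "__main__":
    print(same_prefix)
    print(renamed)
```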
usimd/pi-gen-action#179: Trixie + QEMU breaks on x86 runners (arch-test
"arm64: not supported"). Native arm64 runner means no qemu, no binfmt
registration dance — pi-gen runs the chroot directly. Faster too.
tonistiigi/binfmt registers /usr/bin/qemu-aarch64 (dynamic). Even with the
F flag preloading it, qemu still dlopens its libs at exec time, which fails
inside pi-gen's chroot. Debian's qemu-user-static ships
/usr/bin/qemu-aarch64-static and its post-install sets the F flag
automatically. Pi-gen's dependencies_check needs the static path.
Pi-gen container exits in 0.288s after image build with no logs printed.
Default action input verbose-output=false suppresses pi-gen output;
flipping to true should show what build.sh trips on inside the container.
Log diagnosis on run 26130391965:
##[error]The process '/usr/bin/sudo' failed with exit code 100
Failure was inside the action's 'Installing build dependencies on host'
step. extra-host-dependencies: qemu-user-static binfmt-support broke
apt — possibly conflicting locks or the action's input handling.
tonistiigi/binfmt --install arm64 already registered qemu-aarch64 with
'flags: POCF' (F = kernel-resident static binary). That's enough; no
need for the inside-container qemu packages.
Host-side tonistiigi/binfmt registration doesn't propagate into the
pi-gen-action's nested Docker container's view of /proc/sys/fs/binfmt_misc.
usimd/pi-gen-action's extra-host-dependencies input runs apt-get inside
the pi-gen container before pi-gen launches — install qemu-user-static
+ binfmt-support there so the chroot's arm64 binaries can execute.
apt's qemu-user-static + update-binfmts produces a registration that
pi-gen's nested Docker container still couldn't see. Switch to the
canonical tonistiigi/binfmt approach: privileged container that
installs QEMU statically with the F (fix-binary) flag, so the kernel
opens the qemu-aarch64-static binary at registration time and uses it
for all subsequent arm64 execs — independent of which container the
exec happens in.
Plus diagnostic: ls /proc/sys/fs/binfmt_misc + cat qemu-aarch64
detail, so next run's log surfaces whether registration actually
landed.
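The diagnostic output can also be checked mechanically. Below is a small Python sketch that parses the text of a binfmt_misc entry and reports whether it is enabled with the F flag. The function names and the sample are illustrative; the line layout (enabled / interpreter / flags / offset / magic / mask) matches what the kernel prints for an entry.

```python
def parse_binfmt_entry(text: str) -> dict:
    """Parse the contents of /proc/sys/fs/binfmt_misc/<name> into a dict."""
    lines = text.strip().splitlines() or [""]
    entry = {"enabled": lines[0].strip() == "enabled"}
    for line in lines[1:]:
        key, _, value = line.partition(" ")
        entry[key.rstrip(":")] = value.strip()  # "flags:" becomes "flags"
    return entry


def has_fix_binary(entry: dict) -> bool:
    """True when the handler is enabled and registered with the F flag."""
    return bool(entry["enabled"]) and "F" in entry.get("flags", "")


# Illustrative sample of an arm64 qemu registration.
SAMPLE_ENTRY = """enabled
interpreter /usr/bin/qemu-aarch64
flags: POCF
offset 0
magic 7f454c460201010000000000000000000200b700
mask ffffffffffffff00fffffffffffffffffeffffff
"""

if __name__ == "__main__":
    parsed = parse_binfmt_entry(SAMPLE_ENTRY)
    print(parsed["interpreter"], has_fix_binary(parsed))
```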
docker/setup-qemu-action registers binfmt via a privileged side container;
pi-gen-action's own nested Docker container doesn't inherit the
registration. Result: arm64 ELFs in the pi-gen chroot still fail to
exec, exit 1 before any stage runs.
apt-installed qemu-user-static + binfmt-support writes persistent
binfmt_misc entries to the kernel that propagate to every container
sharing that kernel. Pair with update-binfmts --enable qemu-aarch64 and a
sanity ls -la /proc/sys/fs/binfmt_misc/qemu-aarch64.
Real cause of last pi-gen failure was surfaced by verbose-output:
WARNING: Only a native build environment is supported.
arm64: not supported on this machine/kernel
ubuntu-latest is x86_64; pi-gen builds an arm64 image and chroots into
it during stages, requiring binfmt_misc handlers for arm64. Add
docker/setup-qemu-action before the pi-gen step.
While here, audit + bump every action version (pinned to current
majors):
  actions/checkout             v4 → v6
  actions/upload-artifact      v4 → v7
  actions/download-artifact    v4 → v8
  softprops/action-gh-release  v2 → v3
  docker/setup-qemu-action     @v4 (new)
  usimd/pi-gen-action          @v1 (already current major)
  dtolnay/rust-toolchain       @stable (rolling channel, keep)
Reverts misdiagnosis. pi-gen has defaulted to trixie since the Debian 13
release, and trixie ships gtk4 4.14 + libwebkitgtk-6.0 stock, so no
backports are needed. Build container, kiosk gtk feature gate, and pi-gen
target are all realigned to trixie.
Actual reason last image run failed: our custom stage was missing the
mandatory prerun.sh (pi-gen calls it to seed ROOTFS_DIR from the
previous stage) and the EXPORT_IMAGE marker file (signals 'bake an
image at the end of this stage'). Both added.
Asset upload now globs deploy/*.img.xz so any extra exports stage2
produces ship alongside our customised one.
Replaces release-kiosk.yml + release-image.yml with two coupled workflows:
release.yml — entrypoint. Computes version/channel/tag:
- master push → semver patch bump from latest stable tag, append
-dev.<shortsha>, create lightweight tag + prerelease record
- v* tag push → use tag verbatim, channel from suffix (-beta./-dev. or
stable), create release if missing
Then invokes build.yml via uses: ./.github/workflows/build.yml.
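The version/channel rules above, sketched in Python (the workflow itself presumably does this in shell inside release.yml). The function names are hypothetical, and keeping the leading "v" on the generated dev tag is an assumption.

```python
def next_dev_version(latest_stable_tag: str, short_sha: str) -> str:
    """master push: bump the patch of the latest stable tag, append -dev.<shortsha>."""
    major, minor, patch = (int(p) for p in latest_stable_tag.lstrip("v").split("."))
    return f"v{major}.{minor}.{patch + 1}-dev.{short_sha}"


def channel_for_tag(tag: str) -> str:
    """v* tag push: derive the channel from the tag suffix."""
    if "-beta." in tag:
        return "beta"
    if "-dev." in tag:
        return "dev"
    return "stable"


if __name__ == "__main__":
    dev = next_dev_version("v1.4.2", "abc1234")
    print(dev, channel_for_tag(dev))
    print(channel_for_tag("v1.5.0-beta.1"), channel_for_tag("v1.5.0"))
```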
build.yml — reusable (workflow_call). Single source of truth for asset
production:
- kiosk binary matrix (aarch64, x86_64) in debian:trixie-slim
- flashable .img.xz via pi-gen using the aarch64 artifact (gated by
build-image input; master pushes default false to keep dev cycles
fast, tag pushes default true for a full release)
Both jobs attach to the release at tag_name=${{ inputs.tag }}.
Concurrency: master-branch runs cancel superseded peers; tag runs never
cancel. CI auto-import to a running BF server (BF_AUTOIMPORT_URL +
BF_AUTOIMPORT_API_KEY repo secrets) still wired.