Server bridge was forwarding to raw topic paths that no Node-RED node
listens on. Now forwards to fixed routes: camera.event, onvif.event,
onvif.motion, onvif.anpr — matching what trigger nodes register.
ONVIF XML parser now extracts Key section SimpleItems (PlateNumber,
etc.) into the data map alongside Data section items. Previously only
parsed Source and Data, missing Key-section fields like plate numbers.
Node-RED trigger nodes: camera_id filter changed from Number() to
String() comparison for UUIDv7 compatibility.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Track subscribed_at timestamp per camera in SubStatus
- Fix mark_event_received to use epoch seconds (was OS version string)
- needs_refresh() returns true when any sub is failed/stopped or >24h old
- Heartbeat loop calls maybe_refresh_onvif() every 60s — reloads
cameras from cached bundle and restarts onvif_events::start() which
kills old generation threads and creates fresh PullPoint subscriptions
- mark_event_received called on each successful event forward
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Event insert: if source_camera_id FK fails (stale kiosk sending old
integer IDs), retry with camera_id=NULL. Event still logs, just
without camera association. Stops 500 spam until kiosk updates.
- Kiosk cleanup on first healthy boot: remove stale OS update staging
files (>24h old) from /var/lib/betterframe/tmp/, and old firmware
.prev binaries (>7 days) from /opt/betterframe/kiosk/.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- All bundle struct ID fields (kiosk_id, display_id, layout_id,
camera_id, stream_id, gpio_id) now String with de_flexible_id
deserializer accepting both JSON numbers and strings.
- PoolKey, DisplayState hashmap, WorkerMsg, ServerMsg all use String
IDs throughout. Zero u32 ID references remain.
- ONVIF event image proxy: kiosk detects PictureUri in event data,
downloads image from camera (basic/digest auth), base64 encodes,
attaches to event payload before forwarding to server.
- Add md5 crate for HTTP Digest auth on camera image fetch.
- ws_client: flexible_id_from_value helper for WS message ID parsing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add betterframe-expand-data systemd service: growpart + resize2fs on
BF_DATA (last partition) so it fills the full SD card on first boot.
Solves the "No space left on device" issue with OS update downloads.
- Change OS update staging dir from /var/tmp/betterframe to
/var/lib/betterframe/tmp (on BF_DATA partition, not rootfs).
- Wire firmware and OS update progress callbacks into the GTK overlay
banner — shows "OS Update v1.2.3: Downloading — 45%" etc.
- Add per-partition disk reporting in heartbeat (/, /boot/firmware,
/var/lib/betterframe) with total/used/free/percent.
- Display partition table on kiosk detail page in admin UI.
- PG + SQLite migrations for partitions_json column on kiosks.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added WorkerMsg::UpdateProgress(Option<(label, percent)>) for
showing firmware/OS update progress as an overlay banner on the
display. Handler + label management in place. Actual progress
reporting from firmware.rs/os_update.rs to be wired next.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Version label was only on pairing screen. Now also shown on the
idle/awaiting-layout logo screen (bottom-left overlay).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SOAP errors now extract fault Reason/Text/Code from XML instead of
dumping raw envelope. Logs whether ONVIF password was decrypted
(has_pass=true/false). Added NTP config to pi-gen (pool.ntp.org +
Google/Cloudflare fallback) — WSSE PasswordDigest fails with clock
skew.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ONVIF: only subscribe to cameras actually in layout cells, not all
bundle cameras. Purge warm camera pool entries for cameras removed
from the bundle entirely — immediate stop, no cooling period.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Credentials embedded in RTSP URL can skip digest negotiation on
some cameras. Now extract user:pass from URL, set as user-id/user-pw
properties on rtspsrc, pass clean URL as location.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Version label at bottom-left of pairing screen shows firmware
version (compile-time) and OS version (from /etc/betterframe/
os-version). Spinner removed from pairing screen per request.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After _check confirms key still valid, code fell through to parse
the bf_kiosk_deleted JSON as a KioskBundle causing parse error.
Now returns None to skip bundle processing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
HTTP 400 from camera gave no detail. Now includes first 500 chars
of response body in error message so Axiom shows the actual SOAP
fault reason.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
kiosk_id only set during claim. After restart, loaded from disk,
kiosk_id stayed empty. Now set from bundle.kiosk_id after fetch.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Kiosk checks for stable firmware update before pairing. If available,
downloads + verifies + swaps binary and restarts. No auth needed.
Server: GET /api/firmware/public/check (stable channel, no auth)
GET /api/firmware/public/download/:id (rate-limited, no auth)
Kiosk: check_public() + apply_public() in firmware.rs. Called from
ui.rs worker thread before entering pairing loop. kiosk_app_version
made pub for access from ui.rs.
Also includes kiosk_id deserialization fix (Value instead of String).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Server returns kiosk_id as integer (not yet migrated to UUIDv7).
ClaimResp.kiosk_id changed from Option<String> to Option<Value>
to handle both integer and string. This was causing a panic on
deserialization after successful pairing.
Also simplified Coolify version arg.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Server returns {bf_kiosk_deleted: true} (200) instead of 401 when
kiosk key not found on bundle/heartbeat. Kiosk then confirms via
GET /api/kiosk/_check — only wipes config if _check also returns
401. Prevents proxy glitches from nuking valid kiosks.
Flow: bf_kiosk_deleted signal → confirm via _check → 401 = wipe,
200 = ignore (false alarm).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After successful claim, kiosk_id from server response is stored
globally and included in all subsequent Axiom log entries for
kiosk identification.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Kiosk binary now forwards all tracing logs to Axiom when
BF_AXIOM_KEY + BF_AXIOM_DATASET are set at compile time via
option_env!(). Batches up to 50 entries or flushes every 10s.
No-op when keys not baked in (local dev builds).
CI build.yml passes secrets as env vars for cargo build.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Smart URL actions: multi-step browser automation for web cells behind
login pages. Steps: navigate, fill (form fields), click, wait, wait_for
(element selector), javascript (raw eval). Passwords in fill steps
encrypted with per-kiosk key for transport.
Schema: server/src/schemas/wire/smart-url.ts defines step types.
Stored in layout_cells.options.smart_url (no migration needed).
Bundle: includes smart_url config per cell. Fill step values encrypted
at bundle generation time with per-kiosk key (or cluster key fallback).
Kiosk: execute_smart_url_steps() builds an async JS sequence from the
steps and injects via WebKit evaluate_javascript on LoadEvent::Finished.
Supports session expiry detection via login_detect_url.
Admin UI: step builder TODO (currently configure via cell options JSON).
Data model + kiosk execution + bundle transport are complete.
OS bundle download was buffering 1.2GB in RAM then writing → network
timeout or memory pressure killed it. Now:
Kiosk side:
- Streams directly to /var/tmp/betterframe/ in 256KB chunks
- On network error: resumes from last byte written (Range header)
- Up to 5 retries with 10s backoff between attempts
- Progress logged every ~50MB
- sha256 verified on the complete file on disk (not in memory)
Server side:
- /api/kiosk/os/download/:id supports Range: bytes=N- header
- Returns 206 Partial Content with Content-Range for resume
- streamBundle accepts start/end for partial reads via createReadStream
- Advertises Accept-Ranges: bytes on all responses
Replaces shared cluster_key for bundle encryption. Each kiosk gets a
unique 32-byte AES key generated at pairing time:
Server:
- confirmPairing generates randomBytes(32), stores encrypted with
server secret on kiosks.encrypt_key_encrypted column
- Delivers plaintext encrypt_key to kiosk in claim response (one-time)
- generateBundle prefers per-kiosk key over cluster_key for
encryptForCluster (same AES-256-GCM format, different key per kiosk)
Kiosk:
- ClaimResp gains encrypt_key field, stored encrypted at rest
- onvif_events prefers encrypt_key over cluster_key for decryption
- Backward compatible: old kiosks without encrypt_key still use
cluster_key (both delivered at pairing)
Security improvement: compromised SD card only exposes camera passwords
encrypted for THAT specific kiosk, not the entire fleet. Rotate by
deleting + re-pairing the compromised kiosk.
WebView "URL can't be shown" — Authorization header only applies to
the initial page load. CSS/JS/XHR/WebSocket sub-resources from the
loaded page don't inherit it → Angie auth_request rejects → page breaks.
Kiosk side: set_kiosk_cookie() injects betterframe_kiosk_key cookie
into WebKit's cookie jar via JS bridge before loading the URL. Cookie
persists across all sub-resource requests automatically.
Server side: extractBearerToken() now checks betterframe_kiosk_key
cookie as fallback when no Authorization header present. Same
verifyKioskKey path, just different transport.
Three bugs:
1. std::mem::forget(generation) leaked the Arc → old threads never
stopped on bundle reload. Now stored in a static Mutex; new start()
replaces it → old Arc drops → old Weak::upgrade() returns None.
2. CreatePullPoint Address uses namespace prefix (wsa5:Address,
a:Address, etc.). Parser only matched plain <Address>. New
extract_tag_ns tries common prefixes + fallback regex scan.
Also validates address starts with "http" and logs response
preview on failure for debugging.
3. Pull failure → immediate resubscribe with no delay → hammers camera.
Added 15s backoff after pull failure before resubscribe.
Full ONVIF event management overhaul:
DB: cameras gain event_source (auto|server|kiosk:<id>), event_sink
(auto|server|kiosk:<id>), and supported_event_topics (JSON array).
Server:
- GetEventProperties SOAP call in onvif.ts — queries camera for all
supported event topics (motion, ANPR, line crossing, etc.)
- POST /admin/cameras/:id/refresh-events route — runs GetEventProperties
via designated event source (kiosk WS relay or server direct)
- Camera edit form: event_source + event_sink dropdowns
- Camera detail: supported event topics table with refresh button
- Bundle includes event_source + event_sink so kiosk knows its role
Kiosk:
- onvif_events.rs respects event_source: only subscribes when "auto"
or "kiosk:<this_id>", skips when "server"
- Subscription status tracking: state (subscribing/active/failed),
last_event_at, error — reported in heartbeat for admin visibility
- BundleCamera gains event_source + event_sink fields
Auto logic for source: camera in kiosk's bundle → kiosk subscribes.
Auto logic for sink: TODO — same-subnet detection for WSBaseNotification.
Currently PullPoint only; push model is the next step.
The kiosk runs under NoNewPrivileges=yes (WebKit bwrap needs it). sudo
and nsenter both fail because they need privilege escalation which the
flag blocks. systemd-run --pipe spawns a SEPARATE service unit as root
in its own process tree, connected via stdin/stdout pipe. Not a child
of the kiosk process → NoNewPrivileges doesn't apply.
Also: enable rauc.service in pi-gen chroot (was never enabled → RAUC
daemon not running → rauc install fails → OS update silently broken).
Terminal spawns bash as bfkiosk (unprivileged) → can't read journal,
can't run rauc/systemctl, can't fix anything useful. Now runs
sudo bash --login (with fallback to plain bash if sudo unavailable).
Journal streaming: sudo journalctl instead of plain journalctl so
bfkiosk can read system journal without systemd-journal group.
Pi-gen image: drops /etc/sudoers.d/betterframe-kiosk granting bfkiosk
passwordless sudo. Gated by the on-screen code + lockout ladder, so
root access still requires physical presence.
Root cause: kiosk never stored cluster_key from pairing response.
Bundle ships onvif_password_encrypted (AES-256-GCM with cluster key).
decrypt_cluster was a stub returning None → empty password → WSSE auth
fails → CreatePullPoint rejected → no events ever.
Fix:
1. ClaimResp now includes cluster_key field
2. Stored encrypted at rest alongside kiosk_key (at_rest.rs)
3. Loaded at bundle render, passed to onvif_events::start()
4. decrypt_cluster implements full AES-256-GCM: parse v1.<iv>.<tag>.<ct>
format, base64url decode, decrypt with cluster key
Also: removed BF_ENABLE_ONVIF_EVENTS env gate — if camera is type=onvif
with onvif_host, subscribe. Gate was redundant with the type filter.
Also: bump Angie proxy_read_timeout to 600s on /api/admin/ for OS
bundle import (downloads ~1GB from GitHub, was timing out at 60s).
NOTE: existing paired kiosks won't have cluster_key stored. They need
to re-pair (delete + re-add) to receive it. New pairings get it
automatically.
Terminal: idle_add_local_once from non-GTK thread silently fails.
Forward ShowTerminalCode/DismissTerminalCode through WorkerMsg channel
which IS polled on the GTK main thread via timeout_add_local.
Journal: try --user-unit first, fall back to unfiltered journal if
permission denied (bfkiosk user may not be in systemd-journal group on
non-reflashed images). Send error line back to admin UI on spawn failure
instead of silent drop.
Three fixes:
1. Terminal code overlay replaces the main display window's child instead
of creating a new gtk::Window (cage compositor only shows one window).
Saves the previous child and restores on dismiss.
2. Code auto-expires after 60s — timeout does NOT increment lockout.
GTK overlay dismissed + pending_code cleared.
3. Journal-start handler already logs but relay might fail silently if
kiosk WS reconnected after admin debug WS connected.
Kiosk side (remote_debug.rs + ws_client.rs refactor):
- Journal streaming: server sends journal-start → kiosk spawns
journalctl -f, pipes lines back as journal-line messages via WS.
journal-stop kills the process. On-demand, not always-on.
- Terminal: server sends terminal-request → kiosk checks lockout +
firmware_channel == "dev" → generates 8-char code displayed on
screen as fullscreen overlay (NOT logged) → server relays admin's
code via terminal-auth → kiosk validates with constant-time compare
→ on success spawns bash, relays I/O as base64 terminal-data.
- Lockout: 3 failed codes per boot → lockout_count++. 3 lockouts
(9 total failures) → permanent (reflash only). Reboot resets
attempt counter, not lockout counter. Successful pairing resets all.
- ws_client.rs rewritten with split reader/writer + tokio::select!
for multiplexing incoming WS messages with outbound journal/terminal
data from sync threads.
Server side (coordinator-ws + routes-admin):
- New admin debug WS endpoint: /ws/admin/debug/:kioskId. Authenticated
via admin API key (query param) or session cookie. Relays messages
bidirectionally between admin browser ↔ kiosk.
- Admin pages: /admin/kiosks/:id/logs (journal viewer with start/
stop/clear) and /admin/kiosks/:id/terminal (code entry + terminal
area). Both open in new tabs from the kiosk detail page.
- Angie proxy config updated with /ws/admin/debug/ location block.
Security:
- Terminal only on dev channel
- Code displayed physically on screen, never logged or stored server-side
- Lockout: 3/boot, 3 lockouts = permanent, pairing resets
- Kiosk responds "locked" without specifying which lockout triggered
/opt/betterframe/kiosk/ now owned bfkiosk:bfkiosk so OTA can write
.new/.prev files. Marker path in Rust code aligned with rollback
script expectation (/var/lib/betterframe/kiosk/firmware-applying.json).
New kiosk/src/onvif_events.rs: for each ONVIF camera in the bundle,
creates a PullPoint subscription, polls every 3s, parses
NotificationMessage XML into structured JSON (topic + source key/values
+ data key/values + timestamp), and POSTs to /api/kiosk/event with
source_type=onvif + camera_id.
Forwards ALL event topics: motion, ANPR (LicensePlateRecognition),
line crossing, intrusion, digital input, analytics, tamper — everything
the camera exposes. Node-RED sorts what matters.
Subscription lifecycle:
- CreatePullPointSubscription with 60s InitialTerminationTime
- Renew every 55s before timeout
- Unsubscribe on bundle change / shutdown
- Auto-resubscribe on pull/renew failure (30s backoff)
- Generation tracking via Weak<()> so old workers self-terminate
when start() is called with a new bundle
WSSE PasswordDigest auth for SOAP calls — same scheme the server's
onvif.ts uses. sha1 crate added.
BundleCamera extended with onvif_host/port/username/password_encrypted
fields (server already ships them; kiosk just wasn't deserializing).
Gated by BF_ENABLE_ONVIF_EVENTS=1. Enabled by default in the pi-gen
image env file.
TODO: cluster-key-based decryption of onvif_password_encrypted. For
now relies on the RTSP URI having plaintext credentials embedded (which
the ONVIF import path already ensures via rtspWithCredentials).
New module kiosk/src/at_rest.rs. Derives an AES-256-GCM key via HKDF
from a Pi-bound value:
1. /proc/device-tree/serial-number (Pi 5 firmware exposes it)
2. /proc/cpuinfo Serial line (older kernels)
3. /etc/machine-id (non-Pi dev fallback)
File format: "BFE1" magic || 12-byte random nonce || ciphertext+tag.
Atomic write via tempfile + rename so a crash mid-write can't leave a
half-encrypted file.
Wired into kiosk/src/server.rs at every file I/O touching sensitive
state:
- kiosk.key (bearer token to BF server)
- local.key (LAN-side API auth key)
- bundle.json (cached bundle with RTSP credentials in URL form)
Migration: read paths tolerate legacy plaintext (kiosks upgraded from a
pre-at_rest build) AND re-store as ciphertext on the first read. One-
shot upgrade — subsequent boots skip the migration write.
Threat model defended: SD card extraction. Attacker who pulls the card
can't decrypt without also having the same physical Pi (CPU serial is
hardware-bound). Doesn't defeat an attacker who has both — at that
point they ARE the kiosk. Bar is raised from "trivially extract every
camera password" to "must steal the device intact."
Not defended: TPM-style attestation, remote attestation, sealed boot.
Pi 5 has no TPM and we don't ship a secure-boot config.
Tests in-module: round-trip short bytes, round-trip JSON, legacy
plaintext passthrough.
Phase 3 of the OS OTA pipeline. New module kiosk/src/os_update.rs polls
/api/kiosk/os/check with the kiosk's compatibility string and current OS
version (read from /etc/betterframe/os-compatibility +
/etc/betterframe/os-version, both written by the image build), downloads
the bundle, sha256-verifies the transport, and hands off to
`rauc install`. RAUC takes it from there: CMS signature verify against
/etc/rauc/keyring.pem, copy into inactive A/B slot, arm tryboot via the
custom bootloader backend, return. We then post /api/kiosk/os/applied
and `systemctl reboot` into the new slot.
Wired into the existing 60s heartbeat loop in ui.rs, gated by
BF_ENABLE_OS_OTA=1 (default OFF so dev kiosks on non-A/B images don't
keep trying + failing). Runs BEFORE the kiosk-binary check on each tick
so an OS bundle that ships an updated kiosk binary doesn't race the
firmware path.
On clean-boot heartbeat success we now also call `rauc status
mark-good` so the boot-attempts counter resets — three bad boots in a
row will auto-roll back without us needing a separate rollback path.
What's NOT in this commit:
- A/B partition layout in the pi-gen image (task #6, blocks actual
deployment — bundles can be served + accepted but `rauc install`
will refuse without two valid slots).
- Admin UI for managing releases + rollouts (task #4).