Commit Graph

224 Commits

Author SHA1 Message Date
Dorian
64b57dca7d fix: overhaul container lifecycle — recovery, health, uninstall, UI state
Some checks failed
Build Archipelago ISO (dev) / build-iso (push) Failing after 13m44s
Container Orchestration Tests / unit-tests (push) Failing after 7m30s
Container Orchestration Tests / smoke-tests (push) Has been skipped
Container recovery:
- Health monitor: MAX_RESTART_ATTEMPTS 3→10, interval 60s→120s
- Dependency-aware restarts: won't restart services before their deps
- Reset dependent counters when a dependency recovers
- Handle "created" state containers (were invisible to health monitor)
- Added IndeedHub, mempool-api, mysql to tier system
- Crash recovery: podman start timeout 30s→120s with retry
- Podman client: socket timeout 5s→30s, added restart policy

UI state representation:
- Exit code 0 shows "stopped" (gray), not "crashed" (red)
- Exit code 137 shows "killed (OOM)"
- Non-zero exit shows "crashed" (red)
- Added exit_code field to PackageDataEntry

Install/uninstall fixes:
- Install returns error when container doesn't start (was silent success)
- Post-install hooks awaited instead of fire-and-forget tokio::spawn
- Uninstall: graceful rm before force, volume prune, network cleanup
- Uninstall returns error on partial failure (was 200 OK)

Config consistency:
- DB passwords read from /var/lib/archipelago/secrets/ (was hardcoded)
- Bitcoin: added ZMQ ports 28332/28333 for LND block notifications
- IndeedHub port 7777→8190 (was conflicting with strfry)
- Marketplace versions: LND 0.17.4→0.18.4, Mempool 2.5.0→3.0.0

Performance:
- Metrics collector interval 60s→300s (was duplicating health monitor)
- Podman client: proper error propagation instead of unwrap_or_default

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 07:03:57 +01:00
Dorian
cdff10a8bc fix: retry Tor address discovery in background after startup
Some checks failed
Build Archipelago ISO (dev) / build-iso (push) Failing after 39m32s
Container Orchestration Tests / unit-tests (push) Successful in 32m21s
Container Orchestration Tests / smoke-tests (push) Successful in 6m2s
Backend reads Tor address once at startup. If Tor hasn't started yet,
the address is null forever until restart. Now retries at 5, 10, 20,
30, 60 seconds in a background task until Tor is available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 05:11:55 +01:00
Dorian
a8292ab622 feat: BIP-39 master seed for unified key derivation
Some checks failed
Build Archipelago ISO (dev) / build-iso (push) Failing after 17m51s
Container Orchestration Tests / smoke-tests (push) Has been cancelled
Container Orchestration Tests / unit-tests (push) Has been cancelled
Replace fragmented random key generation with a single 24-word BIP-39
mnemonic that deterministically derives all node keys: Ed25519 (DID),
secp256k1 (Nostr/Bitcoin), BIP-84 xprv (Bitcoin Core), and LND aezeed
entropy. New onboarding flow: seed generate → word verification → identity
naming. Restore path enabled via 24-word entry. Includes seed RPC handlers,
mock backend support, LND/Bitcoin Core wallet-from-seed integration, and
UI polish across settings and discover views.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 01:41:24 +01:00
Dorian
3d50fb9888 feat: auto-detect and enable mesh radio on startup
Some checks failed
Build Archipelago ISO (dev) / build-iso (push) Failing after 13m3s
Container Orchestration Tests / unit-tests (push) Failing after 8m3s
Container Orchestration Tests / smoke-tests (push) Has been skipped
When no mesh config exists (fresh install), scan for serial devices
at /dev/ttyUSB* and /dev/ttyACM*. If a radio is found, auto-enable
mesh and save the config so subsequent boots connect immediately.

Previously, mesh defaulted to disabled and the radio was never probed
unless the user manually created a mesh-config.json file.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 00:50:43 +01:00
Dorian
588ce53833 fix: disk stats show LUKS data partition, not 29GB root
Some checks failed
Container Orchestration Tests / unit-tests (push) Has been cancelled
Container Orchestration Tests / smoke-tests (push) Has been cancelled
Build Archipelago ISO (dev) / build-iso (push) Has been cancelled
system.stats (Home page) and monitoring collector both used df /
which shows the small 29GB root partition. Now prefers
/var/lib/archipelago (the LUKS encrypted data partition) when it
exists — showing the actual 1.8TB storage users care about.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 00:29:58 +01:00
Dorian
031b3c34f4 fix: container installs, Tor, kiosk, GRUB, LUKS display, error messages
All checks were successful
Build Archipelago ISO (dev) / build-iso (push) Successful in 52m6s
Container Orchestration Tests / unit-tests (push) Successful in 29m20s
Container Orchestration Tests / smoke-tests (push) Successful in 6m18s
Critical:
- fix: container installs fail with "statfs: no such file or directory"
  Root cause: NoNewPrivileges=yes in systemd blocks sudo inside backend.
  Fix: use std::fs::create_dir_all + podman unshare chown (no sudo needed)
- fix: Tor services.json never written — \$ARCHY_TOR_DIR escaping bug
- fix: kiosk white screen — increase health wait to 60s, add --disable-gpu

Improvements:
- feat: LUKS encryption badge in Server disk stats (backend detects dm-crypt)
- fix: GRUB theme text scaling on 4:3 monitors — explicit fonts, wider menu
- fix: suppress default Debian MOTD (custom profile.d welcome is enough)
- fix: install error messages now show "Failed to pull/start" instead of
  generic "Operation failed" (middleware.rs allowlist expanded)
- fix: container-tests CI — source cargo env before running tests
- docs: interactive container architecture diagram (HTML)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 16:35:06 +01:00
Dorian
5bd3caf141 fix: auth, container resilience, ISO build, gamepad polish
Some checks failed
Build Archipelago ISO (dev) / build-iso (push) Successful in 57m41s
Build Archipelago ISO / build-iso (push) Has been cancelled
Container Orchestration Tests / unit-tests (push) Failing after 7m14s
Container Orchestration Tests / smoke-tests (push) Has been skipped
- fix: login disconnect — verify session before WebSocket connect
- fix: 403 on app install — distinguish CSRF vs RBAC errors, only retry CSRF
- fix: health monitor now watches ALL containers (removed skip list for
  backend services like nbxplorer, databases, UI containers)
- fix: server.get-state added to CSRF-exempt list (read-only)
- fix: ISO build includes container-specs.sh and lib/common.sh in rootfs
  so reconcile actually works on fresh installs
- fix: gamepad nav — improved Server tab zone nav, focus styles, autofocus
- chore: move L484 web-only apps to Services tab
- chore: install store for cross-view install tracking

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 13:35:02 +01:00
Dorian
9d437ea476 fix: password setup, CSRF 403, reboot after install
Some checks failed
Build Archipelago ISO (dev) / build-iso (push) Successful in 48m51s
Container Orchestration Tests / unit-tests (push) Failing after 5m22s
Container Orchestration Tests / smoke-tests (push) Has been skipped
Critical fixes:
- Remove ensure_default_user() — no more auto-creating user with
  password123. Login page now shows "Create Password" form on first
  boot. User sets their own password during onboarding flow.
- CSRF 403: increased retry delay from 300ms to 500ms for stale
  cookie recovery after remember-me session restore.
- Reboot: multiple fallback methods (/sbin/reboot, sysrq, kill init)
  when USB is pulled and /usr/sbin isn't available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 22:44:46 +01:00
Dorian
89a9f69a9b fix: CSRF 403 blocking all operations + reboot after install
Some checks failed
Container Orchestration Tests / unit-tests (push) Has been cancelled
Container Orchestration Tests / smoke-tests (push) Has been cancelled
Build Archipelago ISO (dev) / build-iso (push) Has been cancelled
CSRF fix (THE BLOCKER):
- After remember-me session restore, the browser has a stale CSRF
  cookie but a new session token. ALL subsequent RPC calls return 403.
- Fix: exempt read-only polling methods (node-messages-received,
  server.echo, system.stats, tor.status, etc.) from CSRF validation.
  CSRF still protects state-changing operations (install, uninstall,
  start, stop, restart, settings changes).

Reboot fix:
- The separate /tmp/archipelago-reboot.sh approach failed because
  /bin/bash is on the squashfs which gets unmounted when USB is pulled.
- Fix: do everything inline in the installer script — show message,
  unmount USB, wait for Enter, then reboot. Use sysrq-trigger first
  (kernel-level, doesn't need userspace binaries).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 22:42:09 +01:00
Dorian
37f32f4e54 fix: version display, FileBrowser auto-login, nostr relay, UID mappings
Some checks failed
Container Orchestration Tests / unit-tests (push) Has been cancelled
Container Orchestration Tests / smoke-tests (push) Has been cancelled
Build Archipelago ISO (dev) / build-iso (push) Has been cancelled
Version per build:
- Health endpoint returns "1.2.0-alpha-{git_hash}" using GIT_HASH env
- CI passes git hash to cargo build

FileBrowser auto-login:
- filebrowser-client.ts: include CSRF token + credentials:include
- First-boot: generate random password, store at secrets/filebrowser/
- Set FileBrowser admin password to match after container creation

Nostr relay:
- Use docker.io/scsibug/nostr-rs-relay:0.9.0 (not in our registry)

UID mappings:
- Added electrumx (UID 1000), mysql-mempool, archy-btcpay-db, nextcloud-db

522 tests pass, Rust compiles clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 21:56:38 +01:00
Dorian
5b186da770 fix: container orchestration overhaul — names, errors, Tor, restart
Some checks failed
Build Archipelago ISO (dev) / build-iso (push) Failing after 18m5s
Container Orchestration Tests / unit-tests (push) Failing after 6m2s
Container Orchestration Tests / smoke-tests (push) Has been skipped
Container name resolution:
- New all_container_names() — single source of truth for every app's
  container name variants (bitcoin-knots/bitcoin/bitcoin-core, etc.)
- Covers all historical naming patterns and multi-container stacks

Start/Stop/Restart:
- No more silent failures (let _ = podman...). Every operation logs
  the command, checks exit status, and returns real errors to the UI.
- Restart uses stop+start fallback when podman restart fails
  (handles rootless podman loopback adapter errors)
- "No containers found" error when app doesn't exist

Tor helper:
- Install archipelago-tor-helper.path + .service in rootfs
- Enable the path unit so backend can manage Tor as non-root
- Copy tor-helper.sh to /opt/archipelago/scripts/

Verified: container with proper caps can stop/start/restart cleanly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 19:26:21 +01:00
Dorian
08ddc73c75 fix: auto-build UI containers for Bitcoin, LND, Electrumx
Some checks failed
Build Archipelago ISO (dev) / build-iso (push) Failing after 23m8s
Container Orchestration Tests / unit-tests (push) Failing after 6m27s
Container Orchestration Tests / smoke-tests (push) Has been skipped
Critical: headless services (Bitcoin, LND, Electrumx) need companion
UI containers that serve web dashboards. These were only built for
Bitcoin, and only on bundled ISO builds.

Changes:
- install.rs: auto-build UI containers for LND (port 8081) and
  Electrumx (port 50002) in addition to Bitcoin (port 8334)
- build-auto-installer-iso.sh: always bundle docker UI source files
  (was skipping for unbundled builds — they're tiny HTML, not images)
- Dockerfiles: fix nginx base image tag 1.29.6→1.27.4 (matches registry)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 17:48:13 +01:00
Dorian
0b5fb4c90b fix: fedimint --bitcoind-url CLI arg + data-dir
Some checks failed
Container Orchestration Tests / unit-tests (push) Has been cancelled
Container Orchestration Tests / smoke-tests (push) Has been cancelled
Build Archipelago ISO (dev) / build-iso (push) Has been cancelled
fedimintd v0.10.0 requires --data-dir and --bitcoind-url as CLI args,
not just env vars. Container was exiting with usage error.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 17:28:33 +01:00
Dorian
e8735b39ec feat: TASK-49 container reliability — tests, orchestration, MASTER_PLAN
Some checks failed
Container Orchestration Tests / unit-tests (push) Has been cancelled
Container Orchestration Tests / smoke-tests (push) Has been cancelled
Build Archipelago ISO (dev) / build-iso (push) Has been cancelled
- Add orchestration_tests.rs + mock_podman.rs (container unit tests)
- Add container-tests.yml CI workflow
- Add dev-container-test.sh for local testing
- MASTER_PLAN.md: add TASK-49 (P0) with 6-phase plan
- Login.vue: minor fixes from user testing
- AppCard.vue: enter key handler fix

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 17:15:56 +01:00
Dorian
25b789bd3f fix: Home Assistant NET_RAW cap, container storage on LUKS, NET_BIND for all
Some checks failed
Build Archipelago ISO (dev) / build-iso (push) Has been cancelled
- Home Assistant: add NET_RAW for DHCP discovery (fixes dhcp permission error)
- Nextcloud/BTCPay/Jellyfin/etc: add NET_BIND_SERVICE (was missing)
- Container storage: redirect graphroot to /var/lib/archipelago/containers/storage
  (prevents root partition filling up — was 100% after 6 images on 29GB root)

Tested on .198: 10 containers running simultaneously:
  Bitcoin Knots (syncing), LND (wallet ready), FileBrowser (healthy),
  Grafana, Vaultwarden, SearXNG, Home Assistant, Electrumx,
  Uptime Kuma, Jellyfin

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 16:34:57 +01:00
Dorian
cba87e2c28 fix: disk usage shows encrypted data partition, not root
Dashboard System card now reports disk usage for /var/lib/archipelago
(the LUKS encrypted partition) instead of / (small root partition).
This shows the actual usable storage (428GB) rather than the 29GB root.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 16:04:35 +01:00
Dorian
09a9dbc6ca fix: LND mainnet config, SearXNG settings seed, default caps
Some checks failed
Build Archipelago ISO (dev) / build-iso (push) Has been cancelled
- LND: add --bitcoin.active --bitcoin.mainnet and all bitcoind
  connection args as container CMD args (was only env var before)
- SearXNG: add volume mount + auto-create settings.yml on install
  (container exits immediately without it)
- Default caps: all containers get full rootless podman baseline

Tested on .198:
- Bitcoin Knots: running, syncing (942803 blocks)
- Grafana: running, migration complete
- Vaultwarden: running, keys created
- SearXNG: running, listening on 8080
- LND: needs bitcoin container named 'bitcoin-knots' on archy-net

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 15:29:24 +01:00
Dorian
9085a7e79f fix: default container caps for rootless podman reliability
Some checks failed
Build Archipelago ISO (dev) / build-iso (push) Has been cancelled
All containers now get CHOWN+FOWNER+SETUID+SETGID+DAC_OVERRIDE+NET_BIND_SERVICE
as the default cap set. Rootless podman needs these for:
- CHOWN/FOWNER/DAC_OVERRIDE: file ownership in mapped UID namespace
- SETUID/SETGID: internal user switching (entrypoint scripts)
- NET_BIND_SERVICE: port binding in network namespaces

Tested on .198: Grafana, Vaultwarden, Bitcoin Knots all start successfully.
Previously failed with "Permission denied" or "loopback adapter" errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 15:24:28 +01:00
Dorian
d989535a9a fix: NET_BIND_SERVICE cap for Bitcoin/LND + default for all apps
Some checks failed
Build Archipelago ISO (dev) / build-iso (push) Has been cancelled
Bitcoin Knots failed to start with "failed to set loopback adapter up"
because cap-drop=ALL removed NET_BIND_SERVICE, which rootless podman
needs for network namespace setup.

- Add NET_BIND_SERVICE to Bitcoin/LND/Fedimint capabilities
- Add NET_BIND_SERVICE as default for ALL apps (rootless podman needs it)
- UID mapping fix from previous commit also included

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 15:12:40 +01:00
Dorian
20289c5bec fix: rootless podman UID mapping for container data dirs
Some checks failed
Build Archipelago ISO (dev) / build-iso (push) Has been cancelled
create_data_dirs now chowns data directories to the correct mapped
UID for rootless podman (host_uid = 100000 + container_uid).

Previously only Grafana (UID 472) was handled. Now all containers
get the correct ownership:
- Bitcoin Knots: 100101 (container UID 101)
- Grafana: 100472 (UID 472)
- LND: 101000 (UID 1000)
- MariaDB: 100999 (UID 999)
- Postgres: 100070 (UID 70)
- All others: 100000 (UID 0, root)

Without this, containers fail with "Operation not permitted" on
chown during startup because rootless podman restricts UID operations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 14:48:37 +01:00
Dorian
7f03e39f58 feat: onboarding polish, splash screen, controller nav, dev script
Some checks failed
Build Archipelago ISO / build-iso (push) Has been cancelled
Build Archipelago ISO (dev) / build-iso (push) Failing after 45m15s
Onboarding flow:
- Intro: improved layout and transitions
- DID: better card styling and responsiveness
- Path: added visual enhancements
- Backup/Identity/Verify: streamlined markup
- SplashScreen component added

UI:
- Controller navigation improvements (useControllerNav)
- Style.css refinements

Backend:
- Runtime package fix

Dev:
- dev-start.sh improvements

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 13:41:52 +00:00
Dorian
2021de5cda fix: auto-create default user, force reboot, i915 firmware, first boot info
Some checks failed
Build Archipelago ISO / build-iso (push) Has been cancelled
Critical fixes from ISO testing on .198:
- Backend auto-creates default user (password123) on first start
  so login works immediately after onboarding
- Force reboot (reboot -f) after install to avoid SquashFS errors
  when live USB is removed
- Eject USB before prompting user, not after
- Add firmware-misc-nonfree for Intel i915 GPU (suppresses dozens
  of "Possible missing firmware" warnings during initramfs update)
- First boot screen: wait up to 10s for DHCP before showing IP
- First boot screen: compact layout fits 80-col terminals
- ISOLINUX menu resolution dropped to 640x480 for universal
  VESA compatibility (was 1024x768, caused scaling on some hardware)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 13:06:34 +00:00
Dorian
9db55b0b34 feat: container orchestration, branding overhaul, onboarding logging
Some checks failed
Build Archipelago ISO / build-iso (push) Failing after 34m59s
Container orchestration:
- Health monitor with crash recovery and auto-restart
- Doctor service (periodic health checks via systemd timer)
- Reconcile service (desired-state convergence)
- Stack-aware install/uninstall with dependency tracking

Branding:
- Custom GRUB background (designer artwork, 1024x768)
- ISOLINUX boot menu: centered, orange accents, clean labels
- Terminal banners: adaptive width, basic ANSI colors, fits 80-col
- Removed auto-generated splash scripts (designer provides assets)
- GRUB theme: lowercase branding

Frontend:
- 401 handler clears localStorage immediately (prevents cascade)

Backend:
- Onboarding/auth logging ([onboarding] tag in journalctl)
- Cookie Secure flag logging for debugging HTTP/HTTPS issues

ISO fixes:
- Install log saved before unmount (was silently failing)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
782a4a62d5 fix: cookies Secure flag based on X-Forwarded-Proto, not dev_mode
Secure flag on session cookies broke HTTP LAN access — browsers refuse
to send Secure cookies over plain HTTP, causing 401 redirect loop.

Fix: check X-Forwarded-Proto header. Only set Secure when request came
over HTTPS. HTTP on LAN works, HTTPS still gets Secure cookies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00
Dorian
39857c775a fix: onboarding auth, stale CI build, autocomplete attrs
Some checks failed
Build Archipelago ISO / build-iso (push) Has been cancelled
- Add identity.create + server.echo to UNAUTHENTICATED_METHODS
- Clear web/dist before frontend build to prevent stale artifacts
- Add autocomplete attrs to login inputs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 19:19:51 +00:00
Dorian
f940b4562a fix: filebrowser port bind, CSRF in tests, console-setup, auto-test scope
All checks were successful
Build Archipelago ISO / build-iso (push) Successful in 18m35s
FileBrowser crash fix:
- Add --cap-add=NET_BIND_SERVICE (port 80 needs it with --cap-drop=ALL)
- Add --cap-add=DAC_OVERRIDE for rootless volume access
- Both in first-boot script and backend config.rs

Test script fixes:
- Extract csrf_token cookie and send as X-CSRF-Token header on RPC calls
- Add --phase1-only flag for safe install-only checks (no side effects)
- Auto-test service uses --phase1-only so it doesn't steal onboarding

Install fixes:
- Pre-create ~/.local/share/containers (ReadWritePaths mount namespace error)
- Fix console-setup.service: add After=tmp.mount + ExecStartPre mkdir /tmp

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 17:17:18 +00:00
Dorian
127a36c5c8 fix: production onboarding, CI tests, container security, keyboard nav
Some checks failed
Build Archipelago ISO / build-iso (push) Has been cancelled
Install & Onboarding:
- Remove DEV_MODE=true from production ISO service file (auto-created
  users, skipped password setup)
- Auto-install no longer overwrites rootfs service file with bad template
- Login.vue always checks auth.isSetup — shows password creation form
  on fresh install without requiring dev build flag
- Deploy image-versions.sh to /opt/archipelago/scripts/ on installed nodes
- First-boot-containers sources image-versions.sh, runs podman as
  archipelago user (rootless), enables linger + podman.socket
- Correct volume ownership (100000:100000 for rootless UID mapping)

Container Security:
- FileBrowser: add --cap-add=DAC_OVERRIDE for rootless podman volume access
- FileBrowser: add --read-only, /data volume for database, proper cmd args
- First-boot script matches backend config (security hardening + health check)

CI Pipeline:
- Add vue-tsc type check + vitest run to build-iso.yml (runs every push)
- Add post-install-tests.yml workflow (workflow_dispatch, SSH to target)
- Build report: set +eo pipefail, fix rootfs path, add || true guards
- Bundle run-post-install-tests.sh into ISO

E2E Test Suite (scripts/run-post-install-tests.sh):
- Phase 1: Install verification (files, services, podman, linger, DEV_MODE check)
- Phase 2: Onboarding flow (auth.isSetup, auth.setup, login, DID, complete)
- Phase 3: Container lifecycle (install 3 apps via package.install RPC,
  verify running, stop, verify stopped, restart, verify running, health)
- Phase 4: Log verification (first-boot log, diagnostics, journal errors)
- Correct package.install params: {"id", "dockerImage"}

Frontend:
- Fix backdrop-filter tab-switch bug (keep animations paused during rebuild)
- Dashboard glitch animations paused during tab-hidden
- Gamepad nav: auto-focus first container on route change
- Tab roving: Left/Right on role="tab" cycles and activates sibling tabs
- ContainerApps: data-controller-launch on running app cards
- 515 tests passing (fixed 30 broken, added 19 new keyboard nav tests)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:16:57 +00:00
Dorian
320c9f5b19 fix: container install flow, filebrowser auth, AppCard enrichment
Some checks failed
Build Archipelago ISO / build-iso (push) Has been cancelled
- Fix .198-style fresh installs: systemd service ExecStartPre creates
  /run/user/1000, enable podman.socket, chmod 644 /etc/hosts
- Filebrowser: add /data volume for database (fixes read-only crash),
  secure auth with random password via backend RPC (no more admin/admin)
- AppCard: enrich installing state with marketplace metadata (icon,
  title, description, tier badge, author, version)
- Registry: btcpayserver 1.13.5 → 1.13.7, images mirrored
- ReadWritePaths: add home container paths for rootless podman

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 13:32:54 +00:00
Dorian
f7a57b8f1f chore: remove dead core/parmanode crate
The parmanode compatibility layer was scaffolded but never wired up —
zero imports or calls from anywhere in the codebase. Closes gitea#1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 15:33:13 +00:00
Dorian
ae13c0dad2 feat: migrate all container images to Archipelago app registry
Some checks failed
Build Archipelago ISO / build-iso (push) Failing after 0s
All container image references now pull from 80.71.235.15:3000/archipelago/
instead of Docker Hub and ghcr.io. image-versions.sh is the single source
of truth; all scripts use $*_IMAGE variables instead of hardcoded refs.

Files updated:
- scripts/image-versions.sh: central ARCHY_REGISTRY variable
- core/*/config.rs: registry whitelist includes app registry
- core/*/stacks.rs: Immich + Penpot stack images
- scripts/{first-boot,deploy-to-target,container-specs}.sh: use variables
- docker/*/Dockerfile: nginx base image from registry
- image-recipe/: ISO build, podman config, menu script
- scripts/{container-doctor,deploy-bitcoin-knots,fix-indeedhub,validate-app-manifest}.sh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 14:06:21 +00:00
Dorian
08bb2c80d4 feat: LUKS2 encryption, boot sequence fixes, onboarding auth, CI/CD
Some checks failed
Build Archipelago ISO / build-iso (push) Has been cancelled
- LUKS2 full-partition encryption for /var/lib/archipelago/ (TASK-42)
  4-partition layout: BIOS + EFI + root (30GB) + encrypted data
  AES-256-XTS with AES-NI detection, ChaCha20 fallback for ARM
  Auto-unlock via crypttab + random key file

- Fix EFI boot errors: remove shim-signed, clean shim artifacts
- Fix first-boot sequence: always show boot animation before onboarding
- Fix stale localStorage causing login instead of onboarding (BUG-47)

- Add auth.setup + auth.isSetup RPC handlers for password on clean install
- Add onboarding methods to UNAUTHENTICATED_METHODS (DID sign 403 fix)

- FileBrowser bundled in unbundled ISO, fix auto-login Secure cookie (BUG-46)
- Kiosk mode: xorg/chromium in rootfs, toggle script, MOTD instructions

- Add Gitea Actions CI/CD workflow for automatic ISO builds

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 09:12:16 +00:00
Dorian
0e0c97c203 feat: architecture review fixes, self-update system, CI pipeline, supply chain hardening
Architecture review (all P0+P1 issues now fixed):
- Add 10s timeout to 6 bare Nostr client.connect() calls
- Pin all 12 crypto deps to exact versions from Cargo.lock
- Pin all 15 floating container image tags to exact patch versions
- Add CI pipeline (cargo fmt + clippy + tests, frontend type-check + build)

Self-update system (git.tx1138.com):
- scripts/self-update.sh: pull, build, install, restart with rollback
- systemd timer checks daily at 3 AM
- update.check RPC does git-based checks when repo is present
- update.git-apply RPC triggers self-update from UI
- Default update URL changed from GitHub to git.tx1138.com
- Git added to ISO package list for fresh installs

Documentation:
- CHANGELOG v1.3.1 with all changes
- README updated (version, update system section)
- BETA-PROGRESS session #6 logged
- architecture-review.html: 4 issues marked FIXED, 8/12 refactoring done

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 15:52:26 +00:00
Dorian
13e4a738be bug fixing and deploy and build diagnostics 2026-03-22 03:30:21 +00:00
Dorian
77f550fb5e refactor: split package.rs, mod.rs, listener.rs, and lnd.rs into focused submodules
- R35: Split package.rs (1794 lines) into package/{mod,config,validation,lifecycle}.rs
- R36: Split mesh/listener.rs (1799 lines) into listener/{mod,session,frames,decode,dispatch,bitcoin}.rs
- R37: Split rpc/mod.rs into mod.rs + dispatcher.rs, middleware.rs, response.rs (54% reduction)
- R38: Split lnd.rs (1064 lines) into lnd/{mod,info,channels,wallet,payments}.rs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:26:28 +00:00
Dorian
f3976ba03a refactor: centralize constants, eliminate unwraps, remove dead code, resolve TODOs
- R13+R16: Replace .expect() with .context()? in main.rs and identity.rs
- R17+R18+R19: Fix unwrap() calls in helpers and js-engine
- R20+R21: Remove #[allow(dead_code)] annotations and delete truly dead code
- R22-R26: Create constants.rs module, replace 21 hardcoded values across 12 files
- R28+R29: LND/DWN timeouts already present — verified
- R30-R33: Remove TODO comments, implement marketplace payment check

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:54:35 +00:00
Dorian
5c3a3ffa8e fix: systemd resource limits, Tor rotation transition, unwrap elimination, RPC timeouts
- I2: Add MemoryMax=4G, LimitNOFILE=65535, TasksMax=2048 to systemd service
- I3: Tor rotation keeps old service for 1h transition before cleanup
- R14: Replace .parse().unwrap() with .unwrap_or(localhost) in rate limiter
- R15: Replace 7 unwrap/expect in mesh protocol with proper error propagation
- R27: Add 10s timeouts to mesh Bitcoin RPC calls

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:46:40 +00:00
Dorian
4d17c60da7 refactor: replace blocking std::fs and TCP I/O with async tokio equivalents
- R6: Convert 6 std::fs calls in session.rs to tokio::fs async
- R7: Convert std::fs::read_to_string in docker_packages.rs to async
- R8: Convert 3 std::fs calls in port_allocator.rs to async, switch to tokio::sync::Mutex
- R9+R10+R11: Fix blocking I/O in node_message.rs and nostr_discovery.rs
- R12: Convert electrs_status.rs from sync TCP to async tokio::net with 5s timeouts
- R4+R5: Spawn periodic cleanup tasks for endpoint and login rate limiters

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:21:08 +00:00
Dorian
c299199d37 fix: add health RPC handler, Nostr connect timeouts, atomic backup restore, nginx rate limits
- R1: Add health RPC endpoint with crash recovery status, uptime, and version
- R2: Wrap all 5 Nostr client.connect() calls in 10s timeout
- R3: Make backup restore atomic with staging dir and rollback on failure
- I1: Add rate limiting, body size, and proxy timeouts to unauthenticated nginx endpoints

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:02:16 +00:00
Dorian
196682f2f2 fix: LND and ElectrumX Tor onion address resolution
- lnd.rs: check tor-hostnames readable copy, then /var/lib/tor/, then
  legacy /var/lib/archipelago/tor/ with sudo fallback for each
- electrs_status.rs: same multi-path resolution for ElectrumX onion
- Both servers: created /var/lib/archipelago/tor-hostnames/ with readable
  copies of onion addresses (avoids sudo on every API call)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 12:31:30 +00:00
Dorian
b31148a8b7 fix: rpcauth credentials, reboot survival, system Tor for all containers
- Bitcoin RPC: switch to rpcauth (salted hash in bitcoin.conf, no plaintext
  in config or CLI). Password stable across reboots/restarts/deploys.
- Remove daily-reboot-test.sh cron on both servers
- Enable podman-restart.service for container auto-start after reboot
- System Tor: SocksPort 0.0.0.0:9050 with SocksPolicy for container access
- LND: tor.socks=host.containers.internal:9050 (system Tor, not container)
- Bitcoin: -proxy=host.containers.internal:9050 for Tor outbound
- bitcoin_rpc.rs: reads from secrets file, cached, stable credentials
- package.rs: dynamic rpc_user/rpc_pass, rpcauth hash generation
- network.rs: fix missing send_to_peer args (mesh encryption update)
- first-boot-containers.sh: rpcauth generation, system Tor config
- deploy-to-target.sh: rpcauth credentials, LND config migration
- Mesh: encrypted channel message support (ChaCha20-Poly1305 updates)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 11:56:20 +00:00
Dorian
b4d204d1d6 feat: reboot button in Settings with password confirmation
- system.reboot RPC endpoint requires password re-verification
- Uses systemd path unit pattern (tor-helper.sh) for privilege escalation
- 2-second delay before reboot to allow RPC response to reach client
- Clean UI: password input modal, loading state, error feedback

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 10:48:06 +00:00
Dorian
c82158c7c8 refactor: PodmanClient uses REST API socket instead of CLI
Replace all `podman` CLI shell-outs with HTTP requests to the rootless
Podman API unix socket (/run/user/{UID}/podman/podman.sock).

Benefits:
- No process spawning overhead — direct HTTP over unix socket
- Structured JSON responses — no string parsing fragility
- Proper timeouts on all operations (5s connect, 30s default, 120s create)
- Health check method to verify socket availability
- Restart container as first-class operation

Still uses CLI for:
- Image pulls (streaming operation better suited to CLI)
- Container logs (raw text stream, not JSON)

The Podman socket is rootless (runs as archipelago user), local-only
(unix socket), and already behind our session auth in the backend.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 10:13:49 +00:00
Dorian
9b6adfc42d feat: E2E encrypted Tor channel messages (ChaCha20-Poly1305)
Messages between federated nodes are now end-to-end encrypted:
- X25519 ECDH key agreement from existing ed25519 node identities
- HKDF-SHA256 key derivation with domain separation
- ChaCha20-Poly1305 authenticated encryption per message
- Random 12-byte nonce per message via OsRng (CSPRNG)
- Graceful fallback to plaintext if encryption fails
- Receiver auto-detects encrypted vs plaintext messages

The Tor transport was already encrypted (onion routing), this adds
application-layer E2E encryption so even a compromised receiving
backend can't read messages without the node's private key.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 09:04:43 +00:00
Dorian
f0a403b224 fix: persistent Tor channel messages, bulletproof Tor after deploys
- Messages persisted to disk (messages.json) — survive restarts
- Sent messages stored on backend via node-store-sent RPC
- Message deduplication (same pubkey + message within 30s)
- Max 200 messages in circular buffer
- Direction field (sent/received) for proper UI display
- Container doctor: prefer system Tor, remove archy-tor container
- Deploy torrc generator: read from tor-config/services.json,
  web apps map port 80→local port for clean .onion URLs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 08:26:40 +00:00
Dorian
fc1120338d fix: Tor management system, bug fixes, federation name sync
Major changes:
- Full Tor hidden service management via systemd path unit pattern
  (tor-helper.sh + archipelago-tor-helper.path/service) — respects
  NoNewPrivileges=yes, no sudo needed from backend
- Container doctor: prefer system Tor over container, remove archy-tor
- Deploy script: fix torrc generation (read correct services.json path),
  web apps map port 80→local port, enable both tor and tor@default
- Federation: server rename pushes name to peers via background sync
- Server name: fix root-owned file, optimistic store update
- Mesh: local echo for sent messages, sendingArch loading state
- Web5: Message button → Mesh redirect, node name lookup in messages
- PeerFiles: show DID not onion in header
- Connected Nodes: flex-1 instead of fixed max-h
- Toast notifications route to Mesh
- Deploy script: fix single-quote syntax in SSH block

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 02:59:29 +00:00
Dorian
c5417640a2 feat: Lightning channel backup, Web5 mobile tab active, file path fix
Task 14: Lightning Channel Backup
- New lnd.export-channel-backup RPC — exports SCB (Static Channel Backup)
- Settings UI: "Lightning Channel Backup" section with export + copy
- Returns base64 backup data, channel count, timestamp

Web5 mobile tab active state
- Fixed combined tab matching for Web5: includes /web5, /federation, /mesh routes
- Previously only matched /cloud and /server (wrong branch)

Content file path fix
- Allow forward slashes in filenames for subdirectories (Music/song.mp3)
- Still block .., \, null bytes, hidden files, absolute paths

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 21:47:18 +00:00
Dorian
93aaeb4abe fix: node names not DIDs, file sharing path validation, sync results
- nodeName() shows friendly "Node-XXXX" instead of truncated DID
- nodeNameFromDid() for sync results lookup
- Map labels use node names
- Content filename validation: allow / for subdirectories (Music/song.mp3)
  but still block .., \, null bytes, hidden files, absolute paths
- Increased filename max length to 512 for paths with subdirectories

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 20:35:41 +00:00
Dorian
1a138c0409 feat: DID rotation + federation peer notification (Part 3)
- node.rotate-did: generates new Ed25519 keypair, signs rotation proof
  with old key, overwrites identity files, requires password
- federation.notify-did-change: broadcasts rotation proof to all
  trusted/observer peers over Tor
- federation.peer-did-changed: receiving side verifies rotation proof
  against known pubkey before updating peer's DID
- Rate-limited: 3/600s for rotation, 5/60s for peer notification
- Signature verification uses ed25519_dalek (constant-time)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 19:27:16 +00:00
Dorian
f8794791f3 feat: DID persistence + federation node names in sync
Part 1 — DID Persistence:
- Deploy script creates /var/lib/archipelago/identity/ directory
- First-boot script creates identity dir with proper ownership
- Identity load now logs pubkey to confirm persistence across restarts

Part 2 — Node Names:
- NodeStateSnapshot includes node_name field
- build_local_state() passes server name to sync responses
- update_node_state() stores peer's announced name on the FederatedNode
- Names propagate automatically during federation.sync-state

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 19:19:13 +00:00
Dorian
f20f0650cf feat: Discover view, Fleet dashboard, MeshMap, type fixes
- New Discover.vue (app store redesign)
- Fleet.vue dashboard for .228
- MeshMap.vue component
- Fixed Discover.vue type errors (unused var, type predicate)
- Various UI updates (Apps, Dashboard, Marketplace, Mesh, Web5)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 16:12:01 +00:00