Commit Graph

543 Commits

Author SHA1 Message Date
Dorian
152281b6bb fix: CI cache Debian Live ISO to avoid 1.4GB re-download
Copy the Debian Live ISO from the server's existing build cache
into the CI workspace before running the ISO build. Saves ~10 min.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 16:03:49 +00:00
Dorian
6504a13bb8 feat: ignore lid close on laptops so server keeps running
Adds logind.conf.d drop-in to HandleLidSwitch=ignore for all
lid close scenarios (battery, external power, docked). Archipelago
nodes installed on laptops won't suspend when the lid is closed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 15:58:16 +00:00
Dorian
de336c472d chore: remove dead core/parmanode crate
The parmanode compatibility layer was scaffolded but never wired up —
zero imports or calls from anywhere in the codebase. Closes gitea#1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 15:33:13 +00:00
Dorian
9e39614ecb fix: CI don't replace live binary, pass build path to ISO script
Remove the cp to /usr/local/bin that caused 'Text file busy'.
The ISO build script now accepts ARCHIPELAGO_BIN env var to find
the freshly built binary instead of requiring it installed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 15:28:43 +00:00
Dorian
b2168751db fix: CI rm binary before cp to avoid 'Text file busy'
On Linux, rm on a running binary works (process keeps its fd).
Then cp creates a new inode. Restart service after.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 15:26:18 +00:00
Dorian
f632c4acd8 fix: CI add debug output for frontend build step
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 15:04:22 +00:00
Dorian
e1bda755f2 fix: CI stop archipelago service before replacing binary
The running binary locks the file, causing 'Text file busy' on cp.
Stop the service, copy, then restart.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 14:44:32 +00:00
Dorian
56938eb139 fix: CI increase timeout, cleared stale git lock on runner
Stale shallow.lock was blocking checkout. Removed it on the runner.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 14:30:21 +00:00
Dorian
f2f08fba12 fix: CI use actions/checkout@v4 (Gitea proxies to GitHub)
The full URL form was 404. The short form lets Gitea resolve from
its configured action sources (GitHub proxy). This worked for build #7.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 14:26:57 +00:00
Dorian
525779f4aa fix: CI checkout cd to home before cleanup to avoid cwd error
The runner cwd is the workspace itself, so deleting it removes the
shell's cwd. cd to home first, then clean workspace before clone.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 14:24:24 +00:00
Dorian
30cc2378d2 fix: CI checkout with token auth for private repo
Manual git clone needs GITHUB_TOKEN injected for private repos.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 14:21:48 +00:00
Dorian
5a850275b7 fix: CI checkout uses manual git clone instead of missing action
The actions/checkout@v4 action was 404 on git.tx1138.com causing
instant build failures. Use manual git clone for reliability with
host-mode runner.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 14:16:13 +00:00
Dorian
35c8420095 feat: migrate all container images to Archipelago app registry
All container image references now pull from 80.71.235.15:3000/archipelago/
instead of Docker Hub and ghcr.io. image-versions.sh is the single source
of truth; all scripts use $*_IMAGE variables instead of hardcoded refs.

Files updated:
- scripts/image-versions.sh: central ARCHY_REGISTRY variable
- core/*/config.rs: registry whitelist includes app registry
- core/*/stacks.rs: Immich + Penpot stack images
- scripts/{first-boot,deploy-to-target,container-specs}.sh: use variables
- docker/*/Dockerfile: nginx base image from registry
- image-recipe/: ISO build, podman config, menu script
- scripts/{container-doctor,deploy-bitcoin-knots,fix-indeedhub,validate-app-manifest}.sh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 14:06:21 +00:00
Dorian
18cd0cf387 feat: switch marketplace to Archipelago app registry
All app images now pull from 80.71.235.15:3000/archipelago/
instead of Docker Hub / ghcr.io. Insecure registry config
baked into ISO for fresh installs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:46:26 +00:00
Dorian
2e0b78ef42 fix: CI workflow use Gitea checkout action, unbundled only
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 09:34:11 +00:00
Dorian
81a38d6824 chore: CI builds unbundled ISO only (with FileBrowser)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 09:26:40 +00:00
Dorian
ed4a5470f9 chore: remove disabled workflows, keep only build-iso
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 09:20:12 +00:00
Dorian
b781136c34 feat: CI/CD builds both bundled and unbundled ISOs
Workflow builds both variants on push to main. Manual trigger
lets you choose bundled, unbundled, or both. ISOs auto-copied
to FileBrowser /Builds/ folder for easy download.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 09:13:31 +00:00
Dorian
b7e60af823 feat: LUKS2 encryption, boot sequence fixes, onboarding auth, CI/CD
- LUKS2 full-partition encryption for /var/lib/archipelago/ (TASK-42)
  4-partition layout: BIOS + EFI + root (30GB) + encrypted data
  AES-256-XTS with AES-NI detection, ChaCha20 fallback for ARM
  Auto-unlock via crypttab + random key file

- Fix EFI boot errors: remove shim-signed, clean shim artifacts
- Fix first-boot sequence: always show boot animation before onboarding
- Fix stale localStorage causing login instead of onboarding (BUG-47)

- Add auth.setup + auth.isSetup RPC handlers for password on clean install
- Add onboarding methods to UNAUTHENTICATED_METHODS (DID sign 403 fix)

- FileBrowser bundled in unbundled ISO, fix auto-login Secure cookie (BUG-46)
- Kiosk mode: xorg/chromium in rootfs, toggle script, MOTD instructions

- Add Gitea Actions CI/CD workflow for automatic ISO builds

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 09:12:16 +00:00
Dorian
f90b407054 fix: add --no-cache to rootfs Docker build to prevent stale layer caching
Podman was caching the rootfs Docker layers, meaning firmware packages
and sources.list changes were never picked up on rebuild. Force fresh
build every time since the rootfs tar is the real cache.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 21:31:51 +00:00
Dorian
6813e6d506 fix: replace DEB822 sources with traditional sources.list for non-free-firmware
The sed commands to modify debian.sources DEB822 format were silently
failing — firmware packages never got installed. Replace the entire
sources config with traditional sources.list that explicitly includes
non-free-firmware component.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 21:21:27 +00:00
Dorian
57f97b9351 fix: remove Secure Boot shim chain — causes EFI boot failure on most hardware
The shim (shimx64.efi.signed) was being installed as BOOTX64.EFI but it
tries to load a second-stage binary with a garbled name, causing
"Failed to open \EFI\BOOT\" errors on machines with Secure Boot disabled.

Fix: use grub-install --removable directly (unsigned GRUB as BOOTX64.EFI).
This works on all UEFI hardware. Users with Secure Boot must disable it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 20:47:14 +00:00
Dorian
17924c73d7 fix: EFI Secure Boot chain with grub.cfg, fix non-free-firmware repo
EFI boot fix:
- Shim needs grub.cfg in same directory to find the root partition
- Create minimal grub.cfg in /EFI/BOOT/ that chains to /boot/grub/grub.cfg
- Preserve unsigned GRUB as fallback for non-Secure-Boot systems
- Copy full chain to both /EFI/BOOT/ and /EFI/archipelago/ paths
- Log EFI directory contents for debugging

Firmware fix:
- DEB822 format sed was wrong — fix Components line replacement
- Add fallback sources.list entry to guarantee non-free-firmware repo
- Ensures firmware-realtek, intel-microcode actually get installed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 19:25:55 +00:00
Dorian
ec32b336a6 fix: zero BIOS boot partition to prevent FAT-fs errors, add CPU microcode
- dd zero the 1MB BIOS boot partition before formatting to prevent
  kernel FAT-fs bread() errors during boot (sda1 had stale data)
- Add intel-microcode and amd64-microcode packages to suppress
  TSC_DEADLINE and similar CPU firmware bug warnings on boot

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 18:25:01 +00:00
Dorian
273de288c3 fix: move mobile nav outside main for fixed positioning, add container scripts
- Dashboard.vue: move DashboardMobileNav outside <main> so position:fixed
  isn't broken by will-change:transform on the perspective container
- Add container-specs.sh and reconcile-containers.sh utility scripts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 18:13:22 +00:00
Dorian
b0ac506899 fix: robust ISO download detection, fix color escape codes in installer
- Use find instead of hardcoded filename for downloaded ISO detection
  (wget may save with redirect filename or partial name)
- Fix color escape codes: use $'\033' syntax instead of '\033' for
  reliable ANSI color rendering in installer output

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 17:03:21 +00:00
Dorian
e88b759b52 fix: add hardware firmware, suppress GRUB warning, eject USB after install
- Add firmware-realtek, firmware-iwlwifi, firmware-misc-nonfree to rootfs
  (fixes missing r8169 NIC firmware on Dell and other common hardware)
- Enable non-free-firmware repo in rootfs Dockerfile
- Suppress os-prober GRUB warning (GRUB_DISABLE_OS_PROBER=true)
- Auto-eject USB boot media before reboot to prevent re-entering installer

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 16:56:02 +00:00
Dorian
5918185a3a fix: use Debian 12 (Bookworm) live ISO base, remove squashfs boot artifacts
The ISO build was using Debian 13 (Trixie) as the live installer base
while the rootfs was built from Debian 12 (Bookworm). This caused:
- Debian 13 kernel/hostname/user in the live environment
- Squashfs errors on reboot from live-boot initramfs hooks

Fixes:
- Pin live ISO to Debian 12.10.0 (archive URL)
- Remove live-boot/live-config packages before initramfs regeneration
- Clean out any live-boot initramfs hooks/scripts from installed rootfs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 16:51:14 +00:00
Dorian
207e53144c feat: architecture review fixes, self-update system, CI pipeline, supply chain hardening
Architecture review (all P0+P1 issues now fixed):
- Add 10s timeout to 6 bare Nostr client.connect() calls
- Pin all 12 crypto deps to exact versions from Cargo.lock
- Pin all 15 floating container image tags to exact patch versions
- Add CI pipeline (cargo fmt + clippy + tests, frontend type-check + build)

Self-update system (git.tx1138.com):
- scripts/self-update.sh: pull, build, install, restart with rollback
- systemd timer checks daily at 3 AM
- update.check RPC does git-based checks when repo is present
- update.git-apply RPC triggers self-update from UI
- Default update URL changed from GitHub to git.tx1138.com
- Git added to ISO package list for fresh installs

Documentation:
- CHANGELOG v1.3.1 with all changes
- README updated (version, update system section)
- BETA-PROGRESS session #6 logged
- architecture-review.html: 4 issues marked FIXED, 8/12 refactoring done

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 15:52:26 +00:00
Dorian
4d1df4a319 docs: update deploy session memory with session 3 fixes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 18:06:57 +00:00
Dorian
a802b2e478 fix: correct health check endpoints for fedimint, nextcloud, filebrowser
- Fedimint: check port 8175 (UI) not 8174 (websocket API)
- Nextcloud: check / not /status.php (returns 302 during setup)
- FileBrowser: check / not /health (endpoint doesn't exist)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 00:47:49 +00:00
Dorian
de07b48876 fix: health check escaping for SSH heredoc context
- Remove || exit 1 from health-cmd (redundant, breaks SSH heredoc)
- Use --health-cmd 'cmd' format (space, not equals) for proper quoting
- Simplify bitcoin health check to bitcoin-cli getnetworkinfo (no creds needed)
- Fix MariaDB health check nested quote issue

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 23:45:32 +00:00
Dorian
47c42d4d1b fix: LND config escaping in SSH heredoc, Tailscale fallback for build source
- Fix shell escaping in LND config sync block (single-quoted SSH context
  doesn't need backslash-escaped dollars)
- deploy-tailscale.sh BUILD_SOURCE auto-detects Tailscale IP when LAN
  unreachable (fixes "No binary on .228" error)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 17:01:02 +00:00
Dorian
ed9b63fa72 fix: suppress tar xattr spam in AIUI deploy
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 16:54:54 +00:00
Dorian
4441473f2f fix: fleet deploy falls back to Tailscale when LAN unreachable
- Add --all as alias for --fleet
- Fleet deploy auto-detects Tailscale IP when LAN SSH fails
- Skip .198 gracefully when unreachable instead of failing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 16:51:49 +00:00
Dorian
c2ac572d9a fix: deploy credential sync, health checks, rootless port binding
- LND config always synced with secrets/bitcoin-rpc-password before
  starting (both deploy scripts) — fixes 401 auth errors on all nodes
- Replace eval "$DB_PASSWORDS" with safe individual SSH reads in
  deploy-tailscale.sh (eliminates command injection risk)
- Add MariaDB password sync step after container start (ALTER USER)
- Add --health-cmd to all 25 containers in deploy-tailscale.sh
- FileBrowser uses --user 0:0 for rootless port 80 binding (both scripts)
- Fedimint env var fixed: FM_REL_NOTES_ACK=0_4_xyz

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 14:16:11 +00:00
Dorian
e4e0ef4f11 bug fixing and deploy and build diagnostics 2026-03-22 03:30:21 +00:00
Dorian
1f8287c4c3 docs: mark all overnight plan tasks complete
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 03:08:52 +00:00
Dorian
b2bb5f319e feat: add E2E smoke test script and CI/CD pipeline plan
- Create scripts/smoke-test.sh for live server verification (7 checks)
- Document planned GitHub Actions CI/CD pipeline in docs/ci-cd-plan.md
- Integration tests deferred to future task (require test harness setup)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 03:08:00 +00:00
Dorian
23d67c0672 refactor: create shared script library, fix ISO image pinning, document planned splits
- S21: Create scripts/lib/common.sh with shared logging, SSH, health check, mem_limit functions
- S18: Source common.sh from deploy-to-target.sh, deploy-tailscale.sh, first-boot-containers.sh
- S16: Fix 2 hardcoded images in ISO build, add missing image variables
- S19: Document planned 7-module split of build-auto-installer-iso.sh
- S20: Document planned 8-module split of first-boot-containers.sh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 03:06:29 +00:00
Dorian
69e25410b0 refactor: split Marketplace, Server, Home, AppDetails views; minor frontend quality fixes
- F29-F32: Split 4 views (Marketplace 1293→505, Server 1132→486, Home 1059→394, AppDetails 1036→386)
- F20: Add aria-current="page" to Dashboard nav links
- F21: Add 150ms search debounce in Marketplace and Apps views
- F22: Reduce backdrop-filter blur to 8px on mobile for GPU performance
- F23: Track and clear WebSocket connect check interval in all paths

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 03:01:38 +00:00
Dorian
afd7405b1a refactor: split Web5.vue, Settings.vue, and Mesh.vue into focused subcomponents
- F25: Split Web5.vue (3940 lines) into 14 files under views/web5/
- F26: Split Mesh.vue (2106→840 lines) extracting Bitcoin and Deadman panels
- F27: Dashboard.vue assessed — layout shell, no split needed
- F28: Split Settings.vue (1792 lines) into AccountSection + SystemSection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:43:28 +00:00
Dorian
618244eab0 refactor: split package.rs, mod.rs, listener.rs, and lnd.rs into focused submodules
- R35: Split package.rs (1794 lines) into package/{mod,config,validation,lifecycle}.rs
- R36: Split mesh/listener.rs (1799 lines) into listener/{mod,session,frames,decode,dispatch,bitcoin}.rs
- R37: Split rpc/mod.rs into mod.rs + dispatcher.rs, middleware.rs, response.rs (54% reduction)
- R38: Split lnd.rs (1064 lines) into lnd/{mod,info,channels,wallet,payments}.rs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:26:28 +00:00
Dorian
d8b753e1e4 fix: deploy error visibility, trap cleanup, variable quoting, frontend resilience
- S10: Add warnings to silent health check failures in deploy scripts
- S11: Add trap cleanup for temp dirs in deploy and tailscale scripts
- S12: Quote 20+ critical unquoted variables across deploy scripts
- S13: Extract hardcoded IPs to deploy-config-defaults.sh
- S15: Add --memory=256m to UI container runs
- F16: Remove in-memory JWT, use cookie-only auth in filebrowser client
- F17: Add meta tag fallback for CSRF token in RPC client
- F19: Track and clear setTimeout in AppSession on unmount

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 02:06:08 +00:00
Dorian
8e38342d53 fix: WebSocket reconnect race, parse error tracking, RPC timeout reduction, vendor chunk split
- F8: Add isReconnecting flag to prevent parallel reconnection attempts
- F9: Track JSON parse errors, force reconnect after 3 consecutive failures
- F11: Reduce RPC timeout to 15s, add jitter to retry backoff
- F12: Add vendor chunk splitting for vue/router/pinia
- F13: DOMPurify already applied to QR SVGs — verified
- F14: Replace O(n) goals alias lookup with Map-based O(1)
- F15: Wrap 7 localStorage.setItem calls in try/catch across 5 stores

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:57:05 +00:00
Dorian
94f2de4a64 refactor: centralize constants, eliminate unwraps, remove dead code, resolve TODOs
- R13+R16: Replace .expect() with .context()? in main.rs and identity.rs
- R17+R18+R19: Fix unwrap() calls in helpers and js-engine
- R20+R21: Remove #[allow(dead_code)] annotations and delete truly dead code
- R22-R26: Create constants.rs module, replace 21 hardcoded values across 12 files
- R28+R29: LND/DWN timeouts already present — verified
- R30-R33: Remove TODO comments, implement marketplace payment check

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:54:35 +00:00
Dorian
c3d4a7063b fix: systemd resource limits, Tor rotation transition, unwrap elimination, RPC timeouts
- I2: Add MemoryMax=4G, LimitNOFILE=65535, TasksMax=2048 to systemd service
- I3: Tor rotation keeps old service for 1h transition before cleanup
- R14: Replace .parse().unwrap() with .unwrap_or(localhost) in rate limiter
- R15: Replace 7 unwrap/expect in mesh protocol with proper error propagation
- R27: Add 10s timeouts to mesh Bitcoin RPC calls

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:46:40 +00:00
Dorian
95fbb094b0 fix: deploy locking, safe eval replacement, first-boot error handling, script hardening
- S4: Add Bitcoin readiness gate and container tracking with final summary
- S5: Replace eval "$DB_PASSWORDS" with safe case-based variable parsing
- S6: Add deploy locking with stale lock detection (30min timeout)
- S7: Deploy rollback already implemented — verified existing mechanism
- S8: Switch trust-archipelago-cert.sh to SSH key auth, sshpass as fallback
- S9: Pipe MariaDB SQL via stdin to avoid password in ps output
- S17: Add disk space pre-flight check (abort if >85% full)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:39:22 +00:00
Dorian
0cca539a0f fix: WebSocket reconnect state refresh, listener leak fixes, pin container images
- F4: Fetch fresh server state after WebSocket reconnect
- F5: Guard message polling timer with auth check, stop on logout
- F6: Remove NIP-07 listener in appLauncher close()
- F7: Initialize audio player once to prevent listener stacking
- S3: Pin all container images to specific versions, create image-versions.sh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:32:28 +00:00
Dorian
2443ae6bba refactor: replace blocking std::fs and TCP I/O with async tokio equivalents
- R6: Convert 6 std::fs calls in session.rs to tokio::fs async
- R7: Convert std::fs::read_to_string in docker_packages.rs to async
- R8: Convert 3 std::fs calls in port_allocator.rs to async, switch to tokio::sync::Mutex
- R9+R10+R11: Fix blocking I/O in node_message.rs and nostr_discovery.rs
- R12: Convert electrs_status.rs from sync TCP to async tokio::net with 5s timeouts
- R4+R5: Spawn periodic cleanup tasks for endpoint and login rate limiters

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:21:08 +00:00