docs: split Step 8 into 8a/8b/8c

Discovered during Step 8 execution that first-boot-containers.sh
creates 30+ containers with per-container logic (wallet loads, DB
init, rpcauth derivations, post-create health waits) and does
substantial non-container setup (secret gen, rootless-podman subuid
chowns, Tor hostnames, WireGuard, firewall, nostr-relay). Only 3 of
the 30+ containers have manifests today (the UIs from Step 7).

Deleting the bash script in a single step would brick first-boot on
fresh installs. Split into:

- 8a: delete reconcile-containers.sh + container-specs.sh + reconcile
  systemd unit + timer. BootReconciler fully covers these. Safe,
  atomic, no manifest porting required.
- 8b: port remaining ~25 containers into apps/<id>/manifest.yml. One
  manifest per commit, validated against current bash behavior.
  Multi-day scope.
- 8c: rename first-boot-containers.sh -> first-boot-setup.sh, strip
  container ops, keep secret/dir/Tor/WG/firewall setup. Final
  one-way door, requires 8b complete.
archipelago
2026-04-23 02:34:43 -04:00
parent 758d3e47d8
commit 236a2dee85
2 changed files with 29 additions and 16 deletions


@@ -15,7 +15,9 @@ Working through the 11-step plan in [`rust-orchestrator-migration.md`](./rust-or
- [x] **Step 5** `fc39b04b` BootReconciler with Arc<Notify> shutdown, 4 paused-time tests pass
- [x] **Step 6** — main.rs wire-up: construct orchestrator once, load_manifests + adopt_existing + spawn BootReconciler, thread through Server::new / ApiHandler::new / RpcHandler::new, wire shutdown Notify to SIGTERM/SIGINT. Clean `cargo check -p archipelago` (6 pre-existing warnings), container tests 43/44 pass (the one failing `test_parse_image_versions` is pre-existing and unrelated — asserts `!contains_key("NOT_AN_IMAGE")` but the retain on line 106 keeps anything ending in `_IMAGE`).
- [x] **Step 7** `069bc4a5` bitcoin-ui pre-start hook renders nginx.conf from embedded template. New `container::bitcoin_ui` module (render fn, atomic tmp+rename, idempotent byte-compare, 8 unit tests). `ProdContainerOrchestrator::run_pre_start_hooks` fires in `install_fresh` before `create_container` and in `ensure_running` (Running+Rewritten → restart; Stopped → re-render+start). bitcoin-ui Dockerfile no longer COPYs nginx conf; arrives via runtime bind-mount (safe-failure → 404 if missing, never stale auth). `apps/{bitcoin,electrs,lnd}-ui/manifest.yml` land. Integration test asserts `install("bitcoin-ui")` writes substituted config to disk. 39/39 container:: tests pass (same 1 pre-existing failure).
- [ ] **Step 8** — Delete bash scripts + systemd units + ISO builder lines, **plus** add ISO builder lines to copy `apps/*/manifest.yml` → `/opt/archipelago/apps/` on install — next up
- [ ] **Step 8a** — Delete `reconcile-containers.sh` + `container-specs.sh` + `archipelago-reconcile.{service,timer}` + ISO builder touchpoints. Safe, `BootReconciler` fully replaces. Next up.
- [ ] **Step 8b** — Port remaining ~25 container creations from `first-boot-containers.sh` into `apps/<id>/manifest.yml` (deferred, multi-day work)
- [ ] **Step 8c** — Rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip container ops, keep setup. Add ISO lines to copy `apps/` (final one-way door, requires 8b complete)
- [ ] **Step 9** — Hot-swap + verify on .228
- [ ] **Step 10** — Hot-swap + verify on .116
- [ ] **Step 11** — Chaos matrix on both nodes
@@ -53,25 +55,33 @@ Both are development alpha nodes — **full destructive latitude**, no need to a
## Next action
**Step 8 — Delete bash scripts + systemd units, and teach the ISO builder to install manifests.**
**Step 8a — Delete the reconcile bash path.** Safe, isolated, atomic.
Files to delete (the scripts the Rust orchestrator has now replaced):
Files to delete:
1. `scripts/reconcile-containers.sh` (531 LOC — `BootReconciler` fully replaces)
2. `scripts/container-specs.sh` (602 LOC — manifest-driven now)
3. `image-recipe/configs/archipelago-reconcile.service`
4. `image-recipe/configs/archipelago-reconcile.timer`
1. `scripts/first-boot-containers.sh` (1392 lines — the sed/envsubst path, now covered by bitcoin-ui pre-start hook + `install_fresh`)
2. `scripts/reconcile-containers.sh` (now covered by `BootReconciler` in-process loop)
3. `scripts/container-specs.sh` (manifests live in `apps/*/manifest.yml` instead)
4. `image-recipe/configs/archipelago-first-boot-containers.service` (systemd unit)
5. `image-recipe/configs/archipelago-reconcile.service` (systemd unit)
ISO builder edits in `image-recipe/build-auto-installer-iso.sh`:
- L412-413: drop `COPY archipelago-reconcile.{service,timer}`
- L429-430: drop `COPY reconcile-containers.sh` + `container-specs.sh`
- L453: drop `systemctl enable archipelago-reconcile.timer`
- L547-548: drop the `cp archipelago-reconcile.{service,timer}` block
- L550: drop `reconcile-containers.sh container-specs.sh` from the loop
Enablement lines to remove in `image-recipe/build-auto-installer-iso.sh`:
- line ~2773 (first-boot unit enable)
- line ~2896 (reconcile unit enable)
- line ~2961 (any remaining enablement hooks — verify)
No Rust changes. Atomic single commit. Full ISO build test on .116 before commit per user ask.
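A quick way to confirm the 8a deletions left nothing behind is to search the builder and scripts trees for the removed names. The sketch below is illustrative only: `find_stale_refs` is a hypothetical helper, not something in the repo, and the search pattern is assumed from the file names listed above.

```shell
#!/usr/bin/env bash
# Sketch: list remaining references to the reconcile bash path.
# find_stale_refs is a hypothetical helper name, not part of the repo.
find_stale_refs() {
  local root="$1"
  # -r recurses, -n prints line numbers, -E enables the alternation.
  # grep exits non-zero when nothing matches, which here means "clean".
  grep -rnE 'reconcile-containers\.sh|container-specs\.sh|archipelago-reconcile\.(service|timer)' "$root"
}

# Usage, from the repo root (paths assumed from the lists above):
#   find_stale_refs image-recipe && echo "stale references remain"
```

Pointing it at `image-recipe/` and `scripts/` rather than the whole repo avoids matching the docs, which mention these names on purpose.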
**New requirement** (discovered this session, not in original spec):
- Add ISO builder lines to copy `apps/*/manifest.yml` → `/opt/archipelago/apps/` on install. Without this, `load_manifests()` finds no manifests on fresh nodes and the orchestrator has nothing to reconcile. Reference: `image-recipe/build-auto-installer-iso.sh:2350-2351` (existing `cp -r /docker/* /mnt/target/opt/archipelago/docker/` pattern — mirror it for `/apps/`).
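The manifest copy could look roughly like the following. This is a sketch under assumptions: `install_manifests` is a hypothetical helper, and the target layout `<dst>/<id>/manifest.yml` is assumed to mirror the source tree; the layout `load_manifests()` actually expects is not shown here.

```shell
#!/usr/bin/env bash
# Sketch: copy each apps/<id>/manifest.yml into the install target.
# install_manifests is a hypothetical helper; the <dst>/<id>/manifest.yml
# layout is an assumption that mirrors the source tree.
install_manifests() {
  local src="$1" dst="$2" manifest id
  for manifest in "$src"/*/manifest.yml; do
    # Skip the literal pattern when the glob matches nothing.
    [ -e "$manifest" ] || continue
    id="$(basename "$(dirname "$manifest")")"
    mkdir -p "$dst/$id"
    cp "$manifest" "$dst/$id/manifest.yml"
  done
}

# Usage inside the installer, assuming the same /mnt/target prefix
# as the existing docker copy:
#   install_manifests apps /mnt/target/opt/archipelago/apps
```

Copying only `manifest.yml` instead of the whole `apps/` directory keeps Dockerfiles and build context off the installed node; whether that is desirable is a design choice left open here.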
**Step 8b/8c come later** — they require porting 25+ container creations from `first-boot-containers.sh` into `apps/*/manifest.yml`, which is a multi-day scope. Not tonight.
No Rust code changes in this step. Atomic commit: deletions + ISO builder edits together.
---
### Why Step 8 got split (discovered 2026-04-23)
Original plan was one commit "delete bash + edit ISO builder". But on investigation:
- `first-boot-containers.sh` creates **30+ containers** with per-container logic (wallets, DB init, rpcauth derivations, post-create health waits). The repo only has manifests for 3 (bitcoin-ui, electrs-ui, lnd-ui from Step 7). Deleting bash now = brick first-boot on fresh installs.
- Script also does non-container setup: secret generation (RPC pw, DB pw, FileBrowser admin pw), UID-mapping chowns for rootless podman subuid, Tor hostnames dir, WireGuard, firewall rules, nostr-relay dir. None of this lives in the Rust orchestrator.
- Design doc §505 updated to split 8 → 8a/8b/8c. Only 8a is safe to execute before we port manifests.
---


@@ -502,7 +502,10 @@ Chaos matrix (bash + Playwright, the original goal):
5. **BootReconciler**: task spawner with loop + cancellation. ~80 LOC + unit tests.
6. **main.rs wire-up**: adopt + spawn reconciler. ~20 LOC.
7. **3 UI manifests + Dockerfile BITCOIN_RPC_AUTH refactor** (use ARG + template file, not sed). ~60 lines of YAML + ~20 lines of Dockerfile.
8. **Remove bash scripts + services**: `git rm` + ISO-builder edits + changelog.
8. **Remove bash scripts + services**: split into sub-steps because `first-boot-containers.sh` creates 25+ containers (only 3 ported in Step 7) AND does non-container setup (secret gen, UID-mapping chowns, Tor hostnames, WireGuard, firewall, nostr-relay dir):
- **8a** (cheap, safe): delete `scripts/reconcile-containers.sh` + `scripts/container-specs.sh` + `image-recipe/configs/archipelago-reconcile.{service,timer}` + their ISO-builder touchpoints. `BootReconciler` fully replaces these — no manifest porting required. Atomic commit, low risk.
- **8b** (large, deferred): port the remaining ~25 container creations from `first-boot-containers.sh` into `apps/<id>/manifest.yml` files. One manifest per commit, validated against current bash behavior (ports, volumes, env, deps, health checks, post-create wallet/db bootstrap). Probably 1-2 days of careful porting. Includes `apps/filebrowser/manifest.yml`.
- **8c** (final, one-way door): rename `first-boot-containers.sh` → `first-boot-setup.sh`, strip out all `$DOCKER run/pull/exec` calls, keep only secret generation + dir prep + Tor/WG/firewall/nostr setup. Rename `archipelago-first-boot-containers.service` → `archipelago-first-boot-setup.service`. Add ISO builder lines to copy `apps/*/manifest.yml` → `/opt/archipelago/apps/`. Full ISO build test on .116 required before commit.
9. **Live test on .228**: hot-swap binary, expect 3 UIs to come up within 60s of service restart.
10. **Live test on .116**: hot-swap binary, expect zero container recreation + adoption-confirmed log lines.
11. **Chaos matrix** on both nodes.
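Since 8c is gated on 8b being complete, a small coverage check can make that gate explicit. The sketch below assumes the list of container ids is supplied by hand; `missing_manifests` is a hypothetical helper, and extracting the ids from `first-boot-containers.sh` automatically is not attempted.

```shell
#!/usr/bin/env bash
# Sketch: print every container id that has no apps/<id>/manifest.yml yet.
# missing_manifests is a hypothetical helper; the id list is passed in by
# the caller because how to derive it from the bash script is not known.
missing_manifests() {
  local apps_dir="$1" id
  shift
  for id in "$@"; do
    [ -f "$apps_dir/$id/manifest.yml" ] || printf '%s\n' "$id"
  done
}

# Usage from the repo root. The three UI ids landed in Step 7 and
# filebrowser is named as an 8b target, so it is printed until ported:
#   missing_manifests apps bitcoin-ui electrs-ui lnd-ui filebrowser
```

Empty output for the full id list would be one reasonable signal that 8b is done and 8c can proceed.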