Same shape as bitcoin-knots.bats and lnd.bats so the 20× release-gate
exercises electrumx through the same state matrix it uses for the other
two core apps. electrumx previously had a single TCP-port check inside
required-stack.bats; this adds destructive + cascade-destructive tiers.
10 @test cases:
* read-only: presence, valid state, TCP port (50001) reachable, no
orphan containers beyond {electrumx, archy-electrs-ui}
* destructive: stop, start, restart, TCP port recovers within 120s of
cold restart (longer than bitcoind because electrumx replays its
index against bitcoind on start)
* cascade: uninstall, reinstall (240s timeout for index rebuild)
With this suite, the three single-container core apps (bitcoin-knots,
lnd, electrumx) now have parity coverage. Multi-container stacks
(btcpay, mempool, fedimint) come next.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors bitcoin-knots.bats so the 20× release-gate run exercises lnd
through the same state matrix. lnd previously had only a single
read-only check inside required-stack.bats; this adds the destructive
and cascade-destructive tiers that match what we already test for
bitcoin-knots.
10 @test cases:
* read-only: presence, valid state, lncli getinfo, no orphan containers
* destructive (ARCHY_ALLOW_DESTRUCTIVE=1): stop, start, restart,
RPC recovers within 90s of cold restart (longer than bitcoind
because the wallet has to unlock first)
* cascade (ARCHY_ALLOW_CASCADE_DESTRUCTIVE=1): uninstall, reinstall
Reuses the same lncli invocation as required-stack.bats so divergence
shows up clearly if either test breaks.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 4 of the v1.7.52 container excellence plan: a release-gate harness
that loops the bats suite N times in a row, with teardown between
iterations, and reports a pass/fail tally.
* setup-teardown.sh — clears /tmp/archy-rpc-session-* between runs so
iteration N+1 doesn't reuse a logged-out cookie from iteration N.
Idempotent; safe to run anytime. Designed to grow as we add suites
that leave other transient state.
* run-20x.sh — wraps run.sh in a loop of ARCHY_ITERATIONS (default 20).
Tracks per-iteration pass/fail with wall-clock timing, prints a
results block, exits non-zero on any failure. Honors ARCHY_FAIL_FAST
for short-circuit during dev.
Suggested release-gate command:
ARCHY_PASSWORD=password123 ARCHY_ALLOW_DESTRUCTIVE=1 \
tests/lifecycle/run-20x.sh
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Companion UI containers (archy-bitcoin-ui, archy-lnd-ui,
archy-electrs-ui) used to be launched as fire-and-forget tokio::spawn
blocks from install.rs. If archipelago crashed mid-spawn or the
container's cgroup was reaped, companions vanished from podman ps -a
and only a manual rm/run could bring them back (the .228 incident).
Now each companion is rendered as a Quadlet .container unit under
~/.config/containers/systemd/, daemon-reloaded, and started via
systemctl --user. systemd owns supervision from that point on:
- archipelago can crash, restart, or be uninstalled without touching
any companion.
- Quadlet's Restart=always + RestartSec=10 handles container exits.
- A 30s reconcile tick in boot_reconciler enumerates expected
companion units and re-installs any whose unit file or service
vanished — defense-in-depth against external tampering.
New module layout:
- container/quadlet.rs: pure unit renderer + atomic write_if_changed
+ systemctl helpers (daemon_reload_user / enable_now / disable_remove
/ is_active). 6 unit tests, no I/O in the renderer.
- container/companion.rs: per-app companion specs, install/remove/
reconcile, image presence (build local first, fall back to insecure
registry only via image_uses_insecure_registry whitelist). 2 tests.
install.rs handle_package_install now ends with a single call to
companion::install_for(package_id), replacing 287 lines of spawn-and-
hope shellouts plus a ~120-line nginx auth-injector helper that worked
around per-node RPC password baking. The helper is gone too — the
pre-start hook renders the per-node nginx.conf to /var/lib/archipelago/
bitcoin-ui/nginx.conf and the Quadlet unit bind-mounts it read-only.
runtime.rs handle_package_uninstall now disables companions before
the container rm loop. Otherwise systemd's Restart=always would
respawn each companion within ~10s of removal.
Tests: 53 container tests pass, including 6 quadlet renderer tests
(host network, bridge network, capability set, atomic write idempotence)
and 2 companion specs (per-app companion lookup, build_unit shape).
boot_reconciler tests gain a #[cfg(test)] without_companion_stage()
flag so the paused-clock fixtures don't race the real systemctl I/O.
A bats regression test (companion-survives-archipelago-restart.bats,
gated on ARCHY_ALLOW_DESTRUCTIVE=1) asserts the .228 failure mode
cannot recur: every installed companion has a unit file, services
stay active across systemctl --user restart archipelago, and a
deleted unit file is recreated within one reconcile tick.
Net delta: +941 / -363, but the +941 is mostly tests (~440 lines)
and the new declarative layer; the imperative tokio::spawn block and
its nginx-auth helper are gone, removing two failure classes
(orphan companions on archipelago crash, and post-start exec races
under tightly-confined cgroups) that previously needed manual SSH
recovery.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>