archy/loop/prompt.md at main

lfg2025/archy

Dorian 64b57dca7d

fix: overhaul container lifecycle — recovery, health, uninstall, UI state

Container recovery:
- Health monitor: MAX_RESTART_ATTEMPTS 3→10, interval 60s→120s
- Dependency-aware restarts: won't restart services before their deps
- Reset dependent counters when a dependency recovers
- Handle "created" state containers (were invisible to health monitor)
- Added IndeedHub, mempool-api, mysql to tier system
- Crash recovery: podman start timeout 30s→120s with retry
- Podman client: socket timeout 5s→30s, added restart policy

UI state representation:
- Exit code 0 shows "stopped" (gray), not "crashed" (red)
- Exit code 137 shows "killed (OOM)"
- Non-zero exit shows "crashed" (red)
- Added exit_code field to PackageDataEntry

Install/uninstall fixes:
- Install returns error when container doesn't start (was silent success)
- Post-install hooks awaited instead of fire-and-forget tokio::spawn
- Uninstall: graceful rm before force, volume prune, network cleanup
- Uninstall returns error on partial failure (was 200 OK)

Config consistency:
- DB passwords read from /var/lib/archipelago/secrets/ (was hardcoded)
- Bitcoin: added ZMQ ports 28332/28333 for LND block notifications
- IndeedHub port 7777→8190 (was conflicting with strfry)
- Marketplace versions: LND 0.17.4→0.18.4, Mempool 2.5.0→3.0.0

Performance:
- Metrics collector interval 60s→300s (was duplicating health monitor)
- Podman client: proper error propagation instead of unwrap_or_default

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-31 07:03:57 +01:00

2.4 KiB

You are working through an overnight automation plan for the Archipelago (archy) project. Read these files first:

loop/plan.md -- Your task checklist (mark items - [x] as you complete them)
CLAUDE.md -- Project conventions, architecture, and coding standards

Working Process

For each task in loop/plan.md:

1. Find the first unchecked - [ ] item
2. Read the task description carefully
3. Read the relevant source files before making changes
4. Implement following CLAUDE.md conventions
5. Run any test/build commands specified in the task
6. Fix all errors before continuing
7. Commit with conventional format: type: description
8. Mark it done - [x] in loop/plan.md
9. Move to the next unchecked task immediately
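
The checklist steps above (find the first unchecked item, then mark it done) can be sketched as two small shell helpers. This is a minimal sketch, not part of the repository: the function names `next_task` and `mark_done` and the `PLAN` variable are hypothetical, and it assumes each task sits on a single line starting with `- [ ]`.

```shell
# Hypothetical helpers for the checklist in loop/plan.md (names are illustrative).
PLAN="${PLAN:-loop/plan.md}"

# Print "<line-number>:<text>" for the first unchecked "- [ ]" item.
# Returns non-zero when no unchecked item remains.
next_task() {
  grep -n -m1 -e '^[[:space:]]*- \[ ]' "$PLAN"
}

# Flip the checkbox on the given line number from "- [ ]" to "- [x]".
# Writes through a temp file so the plan is never left half-written.
mark_done() {
  local line="$1" tmp
  tmp="$(mktemp)"
  sed "${line}s/- \[ ]/- [x]/" "$PLAN" > "$tmp" && mv "$tmp" "$PLAN"
}
```

A typical use would be `next_task` to read the line number, then `mark_done <line>` once every test for that task has passed.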

Critical Rules

Deploy-test-fix LOOPS: Many tasks require you to deploy, test, find failures, fix them, redeploy, and retest. Do NOT mark a task complete until ALL tests in that task pass. If a fix introduces a new failure, fix that too. Keep looping.
Read logs obsessively: After every deploy, read journalctl, podman logs, and curl output. The logs tell you what's broken.
Fix the root cause: Don't patch symptoms. If a container won't restart, find out WHY (wrong restart policy? health check failing? missing dependency?) and fix the actual cause.
Never skip a testing gate -- if tests fail, fix before moving on
If a task is proving difficult, make at least 10 genuine attempts before moving on
Always read source files before editing them
Do not stop until all tasks are checked or you are rate limited
Commit after each completed fix (multiple commits per task is fine)
DO NOT PUSH -- a CI build is in progress, we will push manually later
Deploy to .228 -- ssh -i ~/.ssh/archipelago-deploy archipelago@192.168.1.228
Run Rust builds/checks on .228, NOT macOS
Production-quality code only -- no shortcuts, no TODO comments, no unwrap()
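
The last rule (no TODO comments, no unwrap()) could be checked mechanically before each commit. The following is a rough sketch under stated assumptions: `check_rust_sources` is a hypothetical name, the default directory `core` is taken from the build command in this file, and the pattern is a plain text match, so it will also flag `unwrap()` inside test code or string literals.

```shell
# Hypothetical pre-commit guard; the function name and default path are illustrative.
# check_rust_sources [dir]: print every unwrap() call or TODO comment found in
# *.rs files under dir. Returns 1 if any were found, 2 if dir is missing, else 0.
check_rust_sources() {
  local dir="${1:-core}"
  if [ ! -d "$dir" ]; then
    echo "check_rust_sources: no such directory: $dir" >&2
    return 2
  fi
  if grep -rnE --include='*.rs' '\.unwrap\(\)|//.*TODO' "$dir"; then
    echo "found unwrap() or TODO; fix before committing" >&2
    return 1
  fi
  return 0
}
```

Clippy's `clippy::unwrap_used` lint is a more precise alternative for the `unwrap()` half of the rule, if the project chooses to enable it.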

SSH Quick Reference

SSH="ssh -i ~/.ssh/archipelago-deploy archipelago@192.168.1.228"
# Deploy from macOS:
./scripts/deploy-to-target.sh --target 192.168.1.228
# Build Rust on .228:
$SSH "cd ~/archy/core && cargo clippy --all-targets --all-features && cargo test --all-features"
# Check containers:
$SSH "podman ps -a --format '{{.Names}} {{.State}} {{.Status}}' | sort"
# Read container logs:
$SSH "podman logs bitcoin-knots --tail 30"
# Check backend:
$SSH "journalctl -u archipelago --no-pager -n 50"
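
The `$SSH "..."` form above relies on the shell splitting the unquoted variable into separate words. Bash does this, but zsh (the default shell on recent macOS) does not split unquoted parameters by default, so there the whole string would be treated as a single command name. A function avoids the issue in both shells. This is a sketch only: `archy_ssh`, `archy_containers`, `archy_backend_log`, `ARCHY_KEY`, and `ARCHY_HOST` are hypothetical names, not something the repository defines.

```shell
# Hypothetical wrapper around the SSH command from the quick reference.
# Key path and host default to the values used in this file.
ARCHY_KEY="${ARCHY_KEY:-$HOME/.ssh/archipelago-deploy}"
ARCHY_HOST="${ARCHY_HOST:-archipelago@192.168.1.228}"

# archy_ssh <remote command>: run one command on the deploy target.
archy_ssh() {
  ssh -i "$ARCHY_KEY" "$ARCHY_HOST" "$@"
}

# List all containers with their state, sorted by name.
archy_containers() {
  archy_ssh "podman ps -a --format '{{.Names}} {{.State}} {{.Status}}' | sort"
}

# Show the last N lines of the backend journal (default 50).
archy_backend_log() {
  archy_ssh "journalctl -u archipelago --no-pager -n ${1:-50}"
}
```

For example, `archy_ssh "podman logs bitcoin-knots --tail 30"` is the function form of the log command above.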
