From d37165ca525a6b00901eaa1cf8121b3d9353f298 Mon Sep 17 00:00:00 2001
From: Dorian
Date: Sun, 22 Mar 2026 14:16:11 +0000
Subject: [PATCH] fix: deploy credential sync, health checks, rootless port binding
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- LND config always synced with secrets/bitcoin-rpc-password before starting (both deploy scripts) — fixes 401 auth errors on all nodes
- Replace eval "$DB_PASSWORDS" with safe individual SSH reads in deploy-tailscale.sh (eliminates command injection risk)
- Add MariaDB password sync step after container start (ALTER USER)
- Add --health-cmd to all 25 containers in deploy-tailscale.sh
- FileBrowser uses --user 0:0 for rootless port 80 binding (both scripts)
- Fedimint env var fixed: FM_REL_NOTES_ACK=0_4_xyz

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 .claude/memory/MEMORY.md                      |  3 +
 .../project_deploy_session_2026_03_22.md      | 64 +++++++++++++++++
 scripts/deploy-tailscale.sh                   | 69 ++++++++++++++-----
 scripts/deploy-to-target.sh                   | 38 +++++-----
 4 files changed, 140 insertions(+), 34 deletions(-)
 create mode 100644 .claude/memory/project_deploy_session_2026_03_22.md

diff --git a/.claude/memory/MEMORY.md b/.claude/memory/MEMORY.md
index 3596e7a2..f5f6a901 100644
--- a/.claude/memory/MEMORY.md
+++ b/.claude/memory/MEMORY.md
@@ -31,6 +31,9 @@
 ## Infrastructure
 - [project_bitcoin_rpc_auth.md](project_bitcoin_rpc_auth.md) — Bitcoin rpcauth, system Tor, reboot survival, container resilience
 
+## Deploy & Container Fixes
+- [project_deploy_session_2026_03_22.md](project_deploy_session_2026_03_22.md) — Fleet deploy fixes: credential mismatches, restart storms, rootless port 80, deploy script hardening
+
 ## Completed Work
 - [project_mesh_198_issue.md](project_mesh_198_issue.md) — Mesh .198: 3 bugs fixed and deployed
 - [project_indeedhub_arch3_fix.md](project_indeedhub_arch3_fix.md) — IndeedHub Arch 3: corrupted combined tarball fixed
diff --git a/.claude/memory/project_deploy_session_2026_03_22.md b/.claude/memory/project_deploy_session_2026_03_22.md
new file mode 100644
index 00000000..c5d48d21
--- /dev/null
+++ b/.claude/memory/project_deploy_session_2026_03_22.md
@@ -0,0 +1,64 @@
+---
+name: Deploy session 2026-03-22 findings
+description: Comprehensive deploy/build fixes made overnight — container issues, image tags, script improvements, remaining work
+type: project
+---
+
+## Session Summary (2026-03-22 overnight)
+
+Massive deploy infrastructure overhaul across all 5 nodes (.228, .198, Arch 1/2/3).
+
+### Fixed in deploy-tailscale.sh
+- **Image tags**: Bitcoin Knots `28.1` (not `v28.1`), BTCPay `1.13.7` (not `1.14.5`), SearXNG `2026.3.20-6c7e9c197`
+- **Removed Immich** (3 containers) and **Penpot** (5 containers) from deploy + build
+- **Fedimint**: `FM_REL_NOTES_ACK=0_4_xyz` env var (NOT `FM_SKIP_REL_NOTES_ACK` or `FM_REQ_RELEASE_NOTES_ACK_V0_4`)
+- **Fedimint-gateway**: `--password` instead of `--bcrypt-password-hash` (v0.5.1 CLI change)
+- **FileBrowser**: added `--cap-add NET_BIND_SERVICE` for port 80 binding
+- **SearXNG**: added `/var/lib/archipelago/searxng:/etc/searxng` volume mount + caps
+- **Postgres**: pinned to `postgres:15` (data initialized with 15, incompatible with 16)
+- **Migration**: one-time flag file `/var/lib/archipelago/.rootless-migrated`
+- **Recreate-if-broken pattern**: containers that exist but are stopped get deleted and recreated
+- **Arch 2 hostname**: fixed from hardcoded hostname to `$TAILSCALE_ARCH2`
+- **Custom UI images**: graceful skip if not available, source extracted to repo (`docker/bitcoin-ui/`, `docker/electrs-ui/`)
+- **AIUI tar xattr**: silenced with `--no-xattrs` (only in deploy-tailscale.sh, NOT deploy-to-target.sh yet)
+- **Nginx MIME warning**: removed `text/html` from `sub_filter_types`
+
+### Added
+- `--fleet` flag in deploy-to-target.sh: deploys .228 → .198 → Arch 1/2/3
+- `--both` lock fix: releases lock before recursive `--live` call
+- Container verification step (Step 26b): restarts exited containers, fixes permissions, checks Tor
+- IndeedHub backend stack rebuilt on .228 (7 containers)
+- IndeedHub nginx patched with direct IPs (podman DNS doesn't work with nginx resolver)
+
+### Frontend changes
+- Replaced Immich with FileBrowser on Setup homescreen (`goals.ts`, `EasyHome.vue`)
+- `MEMPOOL_API_IMAGE` renamed to `MEMPOOL_BACKEND_IMAGE` in image-versions.sh
+- Nextcloud downgraded from 30 to 29 (one major version upgrade at a time)
+
+### Session 2 fixes (same day)
+
+**Critical pattern found: Container credential mismatches**
+- Deploy generates random passwords stored in `secrets/`. MariaDB/Postgres only use env vars on FIRST init — subsequent restarts ignore them. Container recreation with new passwords → auth failures → crash loops.
+- 50,000+ cumulative container restarts across fleet from this single root cause.
+
+**Fixes applied to all nodes:**
+1. LND: `lnd.conf` rpcpass synced from `secrets/bitcoin-rpc-password` (was hardcoded `archipelago123`)
+2. MariaDB mempool: data dirs wiped + reinitialized (password mismatch unrecoverable)
+3. BTCPay Postgres: `ALTER USER` to sync password with secrets
+4. FileBrowser: `--user 0:0` instead of `--cap-add NET_BIND_SERVICE` (rootless port 80 fix)
+5. Nextcloud: same `--user 0:0` fix
+6. Tailscale container on .228: removed (2,685 restarts — unauthenticated, host already has TS)
+
+**Deploy script fixes:**
+- `deploy-tailscale.sh`: LND config always synced before start, `eval "$DB_PASSWORDS"` → safe individual reads, MariaDB password sync step, filebrowser `--user 0:0`
+- `deploy-to-target.sh`: LND stale config check now compares passwords (not just cookie/localhost), filebrowser `--user 0:0`
+
+**Rootless port 80 rule**: Containers binding port 80 MUST use `--user 0:0`. `NET_BIND_SERVICE` cap doesn't work in rootless (UID 0 → host 100000, unprivileged).
+ +### Remaining issues for next session +- **Vaultwarden exit 101** on Arch 2 — likely corrupted SQLite DB +- **PhotoPrism storage permission** on Arch 1 — file creation fails despite correct ownership +- **Arch 3 resource contention** — 7.3GB RAM, load 14, 28 containers. May need to reduce container count. +- **Health checks missing** on most containers (only filebrowser/jellyfin have them) +- **Tar xattr spam** in deploy-to-target.sh (fixed in deploy-tailscale.sh only) +- **IndeedHub nginx IPs are ephemeral** — need re-patch after container restart diff --git a/scripts/deploy-tailscale.sh b/scripts/deploy-tailscale.sh index b904548a..b7ad2601 100755 --- a/scripts/deploy-tailscale.sh +++ b/scripts/deploy-tailscale.sh @@ -428,7 +428,8 @@ deploy_node() { ' 2>/dev/null) BITCOIN_RPC_USER="archipelago" - DB_PASSWORDS=$(ssh $SSH_OPTS "$TARGET" ' + # Read DB passwords from secrets (safe parsing — no eval) + ssh $SSH_OPTS "$TARGET" ' SECRETS_DIR="/var/lib/archipelago/secrets" for svc in mempool btcpay mysql-root; do if [ ! -f "$SECRETS_DIR/${svc}-db-password" ]; then @@ -436,9 +437,6 @@ deploy_node() { sudo chmod 600 "$SECRETS_DIR/${svc}-db-password" fi done - echo "MEMPOOL_DB_PASS=$(sudo cat "$SECRETS_DIR/mempool-db-password")" - echo "BTCPAY_DB_PASS=$(sudo cat "$SECRETS_DIR/btcpay-db-password")" - echo "MYSQL_ROOT_PASS=$(sudo cat "$SECRETS_DIR/mysql-root-db-password")" # Fedimint gateway if [ ! 
-f "$SECRETS_DIR/fedimint-gateway-password" ]; then FEDI_PASS=$(openssl rand -base64 16) @@ -449,11 +447,12 @@ deploy_node() { sudo chmod 600 "$SECRETS_DIR/fedimint-gateway-hash" fi fi - if [ -f "$SECRETS_DIR/fedimint-gateway-hash" ]; then - echo "FEDI_HASH=$(sudo cat "$SECRETS_DIR/fedimint-gateway-hash")" - fi - ' 2>/dev/null) - eval "$DB_PASSWORDS" + ' 2>/dev/null + # Read each password individually (avoids eval on SSH output) + MEMPOOL_DB_PASS=$(ssh $SSH_OPTS "$TARGET" 'sudo cat /var/lib/archipelago/secrets/mempool-db-password 2>/dev/null' 2>/dev/null) + BTCPAY_DB_PASS=$(ssh $SSH_OPTS "$TARGET" 'sudo cat /var/lib/archipelago/secrets/btcpay-db-password 2>/dev/null' 2>/dev/null) + MYSQL_ROOT_PASS=$(ssh $SSH_OPTS "$TARGET" 'sudo cat /var/lib/archipelago/secrets/mysql-root-db-password 2>/dev/null' 2>/dev/null) + FEDI_HASH=$(ssh $SSH_OPTS "$TARGET" 'sudo cat /var/lib/archipelago/secrets/fedimint-gateway-hash 2>/dev/null' 2>/dev/null) [ -z "${FEDI_HASH:-}" ] && FEDI_HASH='$2y$10$t9YjjxkiktrlYvjajB/zgOMDnSNVg4HqrbDqh47u7Jf42whNdxNqC' if [ -z "$BITCOIN_RPC_PASS" ]; then @@ -496,6 +495,7 @@ deploy_node() { BTC_DBCACHE=4096 fi \$DOCKER run -d --name bitcoin-knots --restart unless-stopped \$NET_OPT \ + --health-cmd="bitcoin-cli -rpcuser=archipelago -rpcpassword=$BITCOIN_RPC_PASS getblockchaininfo || exit 1" --health-interval=60s --health-timeout=10s --health-retries=3 \ --cap-drop ALL --cap-add CHOWN --cap-add FOWNER --cap-add SETUID --cap-add SETGID --cap-add DAC_OVERRIDE \ --security-opt no-new-privileges:true \ -p 8332:8332 -p 8333:8333 \ @@ -515,6 +515,7 @@ deploy_node() { echo ' Creating mysql-mempool...' 
sudo mkdir -p /var/lib/archipelago/mysql-mempool \$DOCKER run -d --name archy-mempool-db --restart unless-stopped \$NET_OPT \ + --health-cmd="mariadb -uroot -p'$MYSQL_ROOT_PASS' -e 'SELECT 1' || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ -v /var/lib/archipelago/mysql-mempool:/var/lib/mysql \ -e MYSQL_DATABASE=mempool -e MYSQL_USER=mempool \ -e MYSQL_PASSWORD=$MEMPOOL_DB_PASS -e MYSQL_ROOT_PASSWORD=$MYSQL_ROOT_PASS \ @@ -523,7 +524,13 @@ deploy_node() { fi MYSQL_CNT=\$(\$DOCKER ps -a --format '{{.Names}}' 2>/dev/null | grep -E 'mysql-mempool|archy-mempool-db' | head -1) MYSQL_CNT=\${MYSQL_CNT:-archy-mempool-db} + \$DOCKER start \$MYSQL_CNT 2>/dev/null || true \$DOCKER network connect archy-net \$MYSQL_CNT 2>/dev/null || true + # Sync MariaDB user password with secrets (data dir may have stale password) + sleep 3 + \$DOCKER exec \$MYSQL_CNT mariadb -uroot -p"$MYSQL_ROOT_PASS" -e "ALTER USER 'mempool'@'%' IDENTIFIED BY '$MEMPOOL_DB_PASS';" 2>/dev/null \ + && echo " MariaDB mempool password synced" \ + || echo " MariaDB password sync skipped - may need data reinit" if ! \$DOCKER ps --format '{{.Names}}' 2>/dev/null | grep -q electrumx; then if \$DOCKER ps -a --format '{{.Names}}' 2>/dev/null | grep -q electrumx; then @@ -532,6 +539,7 @@ deploy_node() { echo ' Creating electrumx...' sudo mkdir -p /var/lib/archipelago/electrumx \$DOCKER run -d --name electrumx --restart unless-stopped \$NET_OPT \ + --health-cmd="curl -sf http://localhost:8000/ || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ -p 50001:50001 -v /var/lib/archipelago/electrumx:/data \ -e DAEMON_URL=http://$BITCOIN_RPC_USER:$BITCOIN_RPC_PASS@bitcoin-knots:8332/ \ -e COIN=Bitcoin -e DB_DIRECTORY=/data \ @@ -544,6 +552,7 @@ deploy_node() { echo ' Creating mempool-api...' 
sudo mkdir -p /var/lib/archipelago/mempool \$DOCKER run -d --name mempool-api --restart unless-stopped \$NET_OPT \ + --health-cmd="curl -sf http://localhost:8999/api/v1/backend-info || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ -p 8999:8999 -v /var/lib/archipelago/mempool:/data \ -e MEMPOOL_BACKEND=electrum -e ELECTRUM_HOST=electrumx -e ELECTRUM_PORT=50001 \ -e ELECTRUM_TLS_ENABLED=false -e CORE_RPC_HOST=\$TARGET_IP -e CORE_RPC_PORT=8332 \ @@ -556,6 +565,7 @@ deploy_node() { if ! \$DOCKER ps --format '{{.Names}}' 2>/dev/null | grep -q archy-mempool-web; then echo ' Creating mempool frontend...' \$DOCKER run -d --name archy-mempool-web --restart unless-stopped \$NET_OPT \ + --health-cmd="curl -sf http://localhost:8080/ || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ -p 4080:8080 -e FRONTEND_HTTP_PORT=8080 -e BACKEND_MAINNET_HTTP_HOST=mempool-api \ $MEMPOOL_WEB_IMAGE fi @@ -573,6 +583,7 @@ deploy_node() { echo ' Creating archy-btcpay-db...' sudo mkdir -p /var/lib/archipelago/postgres-btcpay \$DOCKER run -d --name archy-btcpay-db --restart unless-stopped \$NET_OPT \ + --health-cmd="pg_isready -U postgres || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ -v /var/lib/archipelago/postgres-btcpay:/var/lib/postgresql/data \ -e POSTGRES_DB=btcpay -e POSTGRES_USER=btcpay -e POSTGRES_PASSWORD=$BTCPAY_DB_PASS \ $BTCPAY_POSTGRES_IMAGE @@ -588,6 +599,7 @@ deploy_node() { echo ' Creating archy-nbxplorer...' 
sudo mkdir -p /var/lib/archipelago/nbxplorer \$DOCKER run -d --name archy-nbxplorer --restart unless-stopped \$NET_OPT \ + --health-cmd="curl -sf http://localhost:32838/ || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ -p 32838:32838 -v /var/lib/archipelago/nbxplorer:/data \ -e NBXPLORER_DATADIR=/data -e NBXPLORER_NETWORK=mainnet -e NBXPLORER_CHAINS=btc \ -e NBXPLORER_BIND=0.0.0.0:32838 -e NBXPLORER_BTCRPCURL=http://bitcoin-knots:8332 \ @@ -602,6 +614,7 @@ deploy_node() { echo ' Creating btcpay-server...' sudo mkdir -p /var/lib/archipelago/btcpay \$DOCKER run -d --name btcpay-server --restart unless-stopped \$NET_OPT \ + --health-cmd="curl -sf http://localhost:49392/ || exit 1" --health-interval=30s --health-timeout=10s --health-retries=3 \ --cap-drop ALL --cap-add CHOWN --cap-add SETUID --cap-add SETGID --cap-add DAC_OVERRIDE \ --security-opt no-new-privileges:true \ -p 23000:49392 -v /var/lib/archipelago/btcpay:/datadir \ @@ -615,15 +628,22 @@ deploy_node() { fi echo ' === LND ===' - # Always update LND config with current RPC credentials + # Always sync LND config with current RPC credentials before starting sudo mkdir -p /var/lib/archipelago/lnd + RPC_PASS=\$(sudo cat /var/lib/archipelago/secrets/bitcoin-rpc-password 2>/dev/null) + if [ -f /var/lib/archipelago/lnd/lnd.conf ]; then + CURRENT_LND_PASS=\$(sudo grep "bitcoind.rpcpass=" /var/lib/archipelago/lnd/lnd.conf 2>/dev/null | cut -d= -f2) + if [ "\$CURRENT_LND_PASS" != "\$RPC_PASS" ] && [ -n "\$RPC_PASS" ]; then + echo " Syncing LND rpcpass with current secrets..." + sudo sed -i "s|bitcoind.rpcpass=.*|bitcoind.rpcpass=\$RPC_PASS|" /var/lib/archipelago/lnd/lnd.conf + sudo chown 100000:100000 /var/lib/archipelago/lnd/lnd.conf 2>/dev/null + fi + fi if ! \$DOCKER ps --format '{{.Names}}' 2>/dev/null | grep -qx lnd; then if \$DOCKER ps -a --format '{{.Names}}' 2>/dev/null | grep -qx lnd; then \$DOCKER start lnd 2>/dev/null || true else echo ' Creating LND...' 
- # Always write/update lnd.conf with current RPC credentials - RPC_PASS=\$(sudo cat /var/lib/archipelago/secrets/bitcoin-rpc-password 2>/dev/null) cat > /tmp/lnd.conf </dev/null rm -f /tmp/lnd.conf \$DOCKER run -d --name lnd --restart unless-stopped --network archy-net \ + --health-cmd="curl -sf --insecure https://localhost:8080/v1/getinfo || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ --cap-drop ALL --cap-add CHOWN --cap-add FOWNER --cap-add SETUID --cap-add SETGID --cap-add DAC_OVERRIDE \ --security-opt no-new-privileges:true \ -p 9735:9735 -p 10009:10009 -p 8080:8080 \ @@ -673,6 +694,7 @@ LNDCONF echo ' Creating Fedimint...' sudo mkdir -p /var/lib/archipelago/fedimint \$DOCKER run -d --name fedimint --restart unless-stopped \$NET_OPT \ + --health-cmd="curl -sf http://localhost:8174/ || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ --cap-drop ALL --cap-add CHOWN --cap-add FOWNER --cap-add SETUID --cap-add SETGID --cap-add DAC_OVERRIDE \ --security-opt no-new-privileges:true \ -p 8173:8173 -p 8174:8174 -p 8175:8175 \ @@ -682,7 +704,7 @@ LNDCONF -e FM_BIND_API=0.0.0.0:8174 -e FM_BIND_UI=0.0.0.0:8175 \ -e FM_P2P_URL=fedimint://\$TARGET_IP:8173 -e FM_API_URL=ws://\$TARGET_IP:8174 \ -e FM_BITCOIND_URL=http://\$TARGET_IP:8332 \ - -e FM_REQ_RELEASE_NOTES_ACK_V0_4=true \ + -e FM_REL_NOTES_ACK=0_4_xyz \ $FEDIMINT_IMAGE fi @@ -701,6 +723,7 @@ LNDCONF LND_MACAROON=/var/lib/archipelago/lnd/data/chain/bitcoin/mainnet/admin.macaroon if \$DOCKER ps --format '{{.Names}}' | grep -q '^lnd\$' && sudo test -f \$LND_CERT && sudo test -f \$LND_MACAROON; then \$DOCKER run -d --name fedimint-gateway --restart unless-stopped \$NET_OPT \ + --health-cmd="curl -sf http://localhost:8176/ || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ --cap-drop ALL --cap-add CHOWN --cap-add FOWNER --cap-add SETUID --cap-add SETGID --cap-add DAC_OVERRIDE \ --security-opt no-new-privileges:true \ -p 8176:8176 -v 
/var/lib/archipelago/fedimint-gateway:/data \ @@ -714,6 +737,7 @@ LNDCONF lnd --lnd-rpc-host \$TARGET_IP:10009 --lnd-tls-cert /lnd/tls.cert --lnd-macaroon /lnd/admin.macaroon else \$DOCKER run -d --name fedimint-gateway --restart unless-stopped \$NET_OPT \ + --health-cmd="curl -sf http://localhost:8176/ || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ --cap-drop ALL --cap-add CHOWN --cap-add FOWNER --cap-add SETUID --cap-add SETGID --cap-add DAC_OVERRIDE \ --security-opt no-new-privileges:true \ -p 8176:8176 -p 9737:9737 -v /var/lib/archipelago/fedimint-gateway:/data \ @@ -734,6 +758,7 @@ LNDCONF else sudo mkdir -p /var/lib/archipelago/home-assistant \$DOCKER run -d --name homeassistant --restart unless-stopped \ + --health-cmd="curl -sf http://localhost:8123/ || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ --cap-drop ALL --cap-add CHOWN --cap-add SETUID --cap-add SETGID --cap-add DAC_OVERRIDE \ --security-opt no-new-privileges:true \ -p 8123:8123 -v /var/lib/archipelago/home-assistant:/config -e TZ=UTC \ @@ -754,6 +779,7 @@ LNDCONF sudo chown -R 1000:1000 /var/lib/archipelago/grafana fi \$DOCKER run -d --name grafana --restart unless-stopped \ + --health-cmd="curl -sf http://localhost:3000/api/health || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ --user 0:0 \ -p 3000:3000 -v /var/lib/archipelago/grafana:/var/lib/grafana \ -e GF_PATHS_DATA=/var/lib/grafana -e GF_USERS_ALLOW_SIGN_UP=false \ @@ -767,6 +793,7 @@ LNDCONF else sudo mkdir -p /var/lib/archipelago/jellyfin/config /var/lib/archipelago/jellyfin/cache \$DOCKER run -d --name jellyfin --restart unless-stopped \ + --health-cmd="curl -sf http://localhost:8096/health || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ --cap-drop ALL --security-opt no-new-privileges:true \ -p 8096:8096 \ -v /var/lib/archipelago/jellyfin/config:/config \ @@ -784,6 +811,7 @@ LNDCONF sudo mkdir -p /var/lib/archipelago/vaultwarden sudo 
chown -R 100000:100000 /var/lib/archipelago/vaultwarden 2>/dev/null \$DOCKER run -d --name vaultwarden --restart unless-stopped \ + --health-cmd="curl -sf http://localhost:80/ || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ --cap-drop ALL --cap-add CHOWN --cap-add SETUID --cap-add SETGID --cap-add NET_BIND_SERVICE \ --security-opt no-new-privileges:true \ -p 8082:80 -v /var/lib/archipelago/vaultwarden:/data \ @@ -798,6 +826,7 @@ LNDCONF if ! \$DOCKER ps -a --format '{{.Names}}' 2>/dev/null | grep -qx searxng; then sudo mkdir -p /var/lib/archipelago/searxng \$DOCKER run -d --name searxng --restart unless-stopped \ + --health-cmd="curl -sf http://localhost:8080/ || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ --cap-drop ALL --cap-add CHOWN --cap-add SETUID --cap-add SETGID --cap-add DAC_OVERRIDE \ --security-opt no-new-privileges:true \ -v /var/lib/archipelago/searxng:/etc/searxng \ @@ -811,8 +840,9 @@ LNDCONF fi if ! \$DOCKER ps -a --format '{{.Names}}' 2>/dev/null | grep -qx filebrowser; then sudo mkdir -p /var/lib/archipelago/filebrowser - \$DOCKER run -d --name filebrowser --restart=always \ - --cap-add NET_BIND_SERVICE \ + \$DOCKER run -d --name filebrowser --restart=unless-stopped \ + --health-cmd="curl -sf http://localhost:80/health || exit 0" --health-interval=30s --health-timeout=5s --health-retries=3 \ + --user 0:0 \ -p 8083:80 -v /var/lib/archipelago/filebrowser:/srv \ $FILEBROWSER_IMAGE fi @@ -828,6 +858,7 @@ LNDCONF if ! 
\$DOCKER ps -a --format '{{.Names}}' 2>/dev/null | grep -qx nextcloud; then sudo mkdir -p /var/lib/archipelago/nextcloud \$DOCKER run -d --name nextcloud --restart unless-stopped \ + --health-cmd="curl -sf http://localhost:80/status.php || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ --cap-drop ALL --cap-add CHOWN --cap-add SETUID --cap-add SETGID --cap-add DAC_OVERRIDE \ --security-opt no-new-privileges:true \ -p 8085:80 -v /var/lib/archipelago/nextcloud:/var/www/html \ @@ -843,6 +874,7 @@ LNDCONF sudo mkdir -p /var/lib/archipelago/photoprism sudo chown -R 100000:100000 /var/lib/archipelago/photoprism 2>/dev/null \$DOCKER run -d --name photoprism --restart unless-stopped \ + --health-cmd="curl -sf http://localhost:2342/ || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ --cap-drop ALL --cap-add CHOWN --cap-add SETUID --cap-add SETGID \ --security-opt no-new-privileges:true \ -p 2342:2342 -v /var/lib/archipelago/photoprism:/photoprism/storage \ @@ -855,6 +887,7 @@ LNDCONF \$DOCKER start onlyoffice 2>/dev/null || true else \$DOCKER run -d --name onlyoffice --restart unless-stopped \ + --health-cmd="curl -sf http://localhost:80/ || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ --cap-drop ALL --cap-add CHOWN --cap-add SETUID --cap-add SETGID --cap-add DAC_OVERRIDE \ --security-opt no-new-privileges:true \ -p 9980:80 $ONLYOFFICE_IMAGE @@ -867,6 +900,7 @@ LNDCONF else sudo mkdir -p /var/lib/archipelago/nginx-proxy-manager/data /var/lib/archipelago/nginx-proxy-manager/letsencrypt \$DOCKER run -d --name nginx-proxy-manager --restart unless-stopped \ + --health-cmd="curl -sf http://localhost:81/ || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ --cap-drop ALL --cap-add CHOWN --cap-add SETUID --cap-add SETGID --cap-add NET_BIND_SERVICE \ --security-opt no-new-privileges:true \ -p 81:81 -p 8084:80 -p 8443:443 \ @@ -882,6 +916,7 @@ LNDCONF else sudo mkdir -p 
/var/lib/archipelago/portainer \$DOCKER run -d --name portainer --restart unless-stopped \ + --health-cmd="curl -sf http://localhost:9000/ || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \ --cap-drop ALL --cap-add CHOWN --cap-add SETUID --cap-add SETGID --cap-add DAC_OVERRIDE \ --security-opt no-new-privileges:true \ -p 9000:9000 -v /var/lib/archipelago/portainer:/data \ @@ -905,12 +940,12 @@ LNDCONF echo \" Building \$ui...\" if \$DOCKER build --no-cache -t \"\$ui:local\" \"$TARGET_DIR/docker/\$ui\" 2>/dev/null; then \$DOCKER stop \"\$CONTAINER_NAME\" 2>/dev/null; \$DOCKER rm -f \"\$CONTAINER_NAME\" 2>/dev/null - \$DOCKER run -d --name \"\$CONTAINER_NAME\" \$PORT_ARG --restart unless-stopped \$NET_ARG \"\$ui:local\" + \$DOCKER run -d --name \"\$CONTAINER_NAME\" \$PORT_ARG --restart unless-stopped --health-cmd="curl -sf http://localhost:80/ || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \$NET_ARG \"\$ui:local\" echo \" \$ui created\" fi elif \$DOCKER images --format '{{.Repository}}:{{.Tag}}' 2>/dev/null | grep -q \"\$ui\"; then IMG=\$(\$DOCKER images --format '{{.Repository}}:{{.Tag}}' 2>/dev/null | grep \"\$ui\" | head -1) - \$DOCKER run -d --name \"\$CONTAINER_NAME\" \$PORT_ARG --restart unless-stopped \$NET_ARG \"\$IMG\" + \$DOCKER run -d --name \"\$CONTAINER_NAME\" \$PORT_ARG --restart unless-stopped --health-cmd="curl -sf http://localhost:80/ || exit 1" --health-interval=30s --health-timeout=5s --health-retries=3 \$NET_ARG \"\$IMG\" fi done diff --git a/scripts/deploy-to-target.sh b/scripts/deploy-to-target.sh index c192ae10..cc8b02d3 100755 --- a/scripts/deploy-to-target.sh +++ b/scripts/deploy-to-target.sh @@ -432,7 +432,7 @@ if [ "$BOTH" = true ]; then if [ "$RO" = "true" ]; then $DOCKER stop filebrowser 2>/dev/null; $DOCKER rm filebrowser 2>/dev/null sudo mkdir -p /var/lib/archipelago/filebrowser - $DOCKER run -d --name filebrowser --restart=always -p 8083:80 -v /var/lib/archipelago/filebrowser:/srv 
docker.io/filebrowser/filebrowser:v2.27.0 2>/dev/null + $DOCKER run -d --name filebrowser --restart=unless-stopped --user 0:0 -p 8083:80 -v /var/lib/archipelago/filebrowser:/srv docker.io/filebrowser/filebrowser:v2.27.0 2>/dev/null fi fi ' 2>/dev/null || true @@ -834,7 +834,7 @@ PYEOF $DOCKER stop filebrowser 2>/dev/null $DOCKER rm filebrowser 2>/dev/null sudo mkdir -p /var/lib/archipelago/filebrowser - $DOCKER run -d --name filebrowser --restart=always -p 8083:80 -v /var/lib/archipelago/filebrowser:/srv docker.io/filebrowser/filebrowser:v2.27.0 2>&1 | tail -1 + $DOCKER run -d --name filebrowser --restart=unless-stopped --user 0:0 -p 8083:80 -v /var/lib/archipelago/filebrowser:/srv docker.io/filebrowser/filebrowser:v2.27.0 2>&1 | tail -1 echo " FileBrowser recreated" else echo " FileBrowser OK" @@ -842,7 +842,7 @@ PYEOF else echo " Creating FileBrowser..." sudo mkdir -p /var/lib/archipelago/filebrowser - $DOCKER run -d --name filebrowser --restart=always -p 8083:80 -v /var/lib/archipelago/filebrowser:/srv docker.io/filebrowser/filebrowser:v2.27.0 2>&1 | tail -1 + $DOCKER run -d --name filebrowser --restart=unless-stopped --user 0:0 -p 8083:80 -v /var/lib/archipelago/filebrowser:/srv docker.io/filebrowser/filebrowser:v2.27.0 2>&1 | tail -1 echo " FileBrowser created" fi ' 2>/dev/null || true @@ -1510,23 +1510,27 @@ autopilot.active=false LNDCONF sudo cp /tmp/lnd.conf /var/lib/archipelago/lnd/lnd.conf else - # Fix stale LND configs (cookie mode, localhost, wrong password) + # Always ensure LND config has correct RPC credentials from secrets LND_CONF=/var/lib/archipelago/lnd/lnd.conf + CURRENT_PASS=\$(sudo grep "bitcoind.rpcpass=" "\$LND_CONF" 2>/dev/null | cut -d= -f2) NEEDS_FIX=0 - grep -q "rpccookie" "$LND_CONF" 2>/dev/null && NEEDS_FIX=1 - grep -q "rpchost=127.0.0.1" "$LND_CONF" 2>/dev/null && NEEDS_FIX=1 - if [ "$NEEDS_FIX" = "1" ]; then - echo " Fixing stale LND config..." 
- cp "$LND_CONF" /tmp/lnd.conf.fix - sed -i "/bitcoind.rpccookie/d" /tmp/lnd.conf.fix - sed -i "/bitcoind.rpcuser/d" /tmp/lnd.conf.fix - sed -i "/bitcoind.rpcpass/d" /tmp/lnd.conf.fix - sed -i "s|bitcoind.rpchost=127.0.0.1:8332|bitcoind.rpchost=bitcoin-knots:8332|" /tmp/lnd.conf.fix - sed -i "/bitcoind.rpchost=/a bitcoind.rpcuser=$BITCOIN_RPC_USER" /tmp/lnd.conf.fix - sed -i "/bitcoind.rpcuser=/a bitcoind.rpcpass=$BITCOIN_RPC_PASS" /tmp/lnd.conf.fix - sudo cp /tmp/lnd.conf.fix "$LND_CONF" - sudo chown 100000:100000 "$LND_CONF" + grep -q "rpccookie" "\$LND_CONF" 2>/dev/null && NEEDS_FIX=1 + grep -q "rpchost=127.0.0.1" "\$LND_CONF" 2>/dev/null && NEEDS_FIX=1 + [ "\$CURRENT_PASS" != "$BITCOIN_RPC_PASS" ] && NEEDS_FIX=1 + if [ "\$NEEDS_FIX" = "1" ]; then + echo " Syncing LND config with current RPC credentials..." + sudo sed -i "/bitcoind.rpccookie/d" "\$LND_CONF" + sudo sed -i "s|bitcoind.rpchost=127.0.0.1:8332|bitcoind.rpchost=bitcoin-knots:8332|" "\$LND_CONF" + sudo sed -i "s|bitcoind.rpcpass=.*|bitcoind.rpcpass=$BITCOIN_RPC_PASS|" "\$LND_CONF" + if ! sudo grep -q "bitcoind.rpcuser=" "\$LND_CONF" 2>/dev/null; then + sudo sed -i "/bitcoind.rpchost=/a bitcoind.rpcuser=$BITCOIN_RPC_USER" "\$LND_CONF" + fi + if ! sudo grep -q "bitcoind.rpcpass=" "\$LND_CONF" 2>/dev/null; then + sudo sed -i "/bitcoind.rpcuser=/a bitcoind.rpcpass=$BITCOIN_RPC_PASS" "\$LND_CONF" + fi + sudo chown 100000:100000 "\$LND_CONF" RESTART_LND=1 + echo " LND config updated" fi fi $DOCKER run -d --name lnd --restart unless-stopped --network archy-net \