Compare commits

...

3 Commits

Author SHA1 Message Date
Dorian
3682855668 fix: rootless UID mapping corrections + credential injection
- Correct off-by-one in UID mapping: container UID N → host UID
  (100000 + N - 1), not (100000 + N)
- Deploy script auto-fixes UID ownership on every deploy
- Bitcoin UI nginx uses __BITCOIN_RPC_AUTH__ placeholder injected
  from secrets at deploy time
- container rules updated for rootless podman architecture

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 15:57:16 +00:00
Dorian
93c2c3ee67 fix: deploy script credential injection + container state mapping
- Bitcoin UI nginx: use __BITCOIN_RPC_AUTH__ placeholder, injected at
  deploy time from secrets file (fixes auth prompt regression)
- Deploy script: sed-replaces placeholder with real base64 RPC creds
  before building bitcoin-ui Docker image
- Container state: "created" → "stopped" (not "starting") so ollama/
  tailscale show correctly
- Comprehensive INSTALLED_ALIASES for marketplace

All container credentials now flow from secrets files through the
deploy script. Manual container recreation is no longer needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 15:31:17 +00:00
Dorian
cc8a6fd4d8 fix: container state mapping + marketplace install aliases
- Created containers now show as "stopped" not "starting" (fixes
  ollama/tailscale perpetual "starting" state)
- Comprehensive INSTALLED_ALIASES map: fedimint, electrumx, grafana,
  jellyfin, vaultwarden, searxng, homeassistant, photoprism, lnd,
  filebrowser, tailscale, ollama — prevents marketplace showing
  "Install" for already-installed containers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 15:18:43 +00:00
12 changed files with 2193 additions and 134 deletions


@@ -5,15 +5,46 @@ globs:
- "**/*podman*"
- "**/Containerfile"
- "**/Dockerfile"
- "**/first-boot*"
- "**/container-doctor*"
---
# Container Security Rules (Archipelago)
# Container Security Rules (Archipelago — Rootless Podman)
- `readonly_root: true` always — containers must not write to their root filesystem
## Rootless Podman Architecture
- Podman runs as `archipelago` user (UID 1000), NOT root — never use `sudo podman`
- UID namespace mapping via subuid: container UID N → host UID (100000 + N)
- Container images stored in `~/.local/share/containers/storage/` (NOT /var/lib/containers)
- Container subnet: `10.89.0.0/16` (rootless), not `10.88.0.0/16` (rootful)
- XDG_RUNTIME_DIR must be `/run/user/1000` — required for podman socket
- `loginctl enable-linger archipelago` required for containers to survive logout
## Container Security (Non-Negotiable)
- Drop ALL capabilities, add only what's required (`--cap-drop=ALL --cap-add=...`)
- Run as non-root user (UID > 1000): `--user 1001:1001`
- Set `--security-opt=no-new-privileges:true`
- Pin image versions by SHA256 digest, never use `:latest` tag
- Set `--security-opt=no-new-privileges:true` on all containers
- Use `--read-only` + tmpfs where possible (safe apps: searxng, grafana, filebrowser, electrumx, nostr-rs-relay, ollama, indeedhub)
- Pin image versions; never use the `:latest` tag
- Mount secrets as read-only files; avoid passing them as environment variables where possible
- Set memory and CPU limits on all containers
- Use `--network=none` unless network access is required
- All containers must have `--restart unless-stopped`
## Volume Ownership (Critical for Rootless)
- Volume directories must be owned by the MAPPED UID, not the container UID
- Formula: `host_uid = 100000 + container_uid`
- UID 0 (most apps) → `sudo chown -R 100000:100000 /var/lib/archipelago/{app}`
- UID 101 (bitcoin) → `sudo chown -R 100101:100101 /var/lib/archipelago/bitcoin`
- UID 70 (postgres) → `sudo chown -R 100070:100070 /var/lib/archipelago/postgres-*`
- UID 472 (grafana) → `sudo chown -R 100472:100472 /var/lib/archipelago/grafana`
- UID 999 (mariadb) → `sudo chown -R 100999:100999 /var/lib/archipelago/mysql-*`
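The commit message for 3682855668 corrects the mapping to `100000 + N - 1`, while the list above uses `100000 + N`. Rather than hard-coding either formula, the mapping can be read from the kernel's view of the user namespace, which rootless Podman exposes through `podman unshare cat /proc/self/uid_map`. The following is a minimal sketch under that assumption; `map_uid` is a hypothetical helper name, not part of the repository.

```bash
#!/usr/bin/env bash
# map_uid (hypothetical helper): translate a container UID to a host UID
# using uid_map text read from stdin.
# Each uid_map line has the form: <inside_start> <outside_start> <count>
map_uid() {
  awk -v uid="$1" '
    # Pick the range that contains the requested container UID.
    $1 <= uid && uid < $1 + $3 { print $2 + (uid - $1); found = 1; exit }
    # Fail if no range covers the UID.
    END { if (!found) exit 1 }
  '
}

# Assumed usage on the host, as the archipelago user:
#   podman unshare cat /proc/self/uid_map | map_uid 101
```

Checking the result against the value passed to `chown` before changing ownership avoids relying on a memorized offset.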
## Systemd Service Requirements
- `ProtectHome=no` — podman needs `~/.local/share/containers/`
- `PrivateTmp=no` — podman runtime uses `/tmp/podman-run-1000/`
- `RestrictNamespaces=` must NOT be set — rootless podman creates user namespaces
- `SystemCallFilter=` must NOT be set — rootless podman needs clone/unshare
- UFW `DEFAULT_FORWARD_POLICY="ACCEPT"` — required for LAN access to container ports
## Network Rules
- Apps needing inter-container DNS: use `--network=archy-net` (bitcoin, lnd, electrumx, mempool, btcpay, fedimint)
- Standalone apps: default bridge network
- Tailscale only: `--network=host` + `NET_ADMIN` + `NET_RAW` + `/dev/net/tun`


@@ -4,6 +4,7 @@ description: >
Comprehensive Podman container diagnostic for Archipelago. Audits all running containers,
port mappings, network connectivity, health status, restart policies, and config consistency
across all 4 layers (backend Rust, Podman runtime, Nginx proxy, frontend routing).
Handles rootless Podman (user: archipelago, UID 1000, subuid 100000:65536).
Use when asked to "diagnose containers", "check podman", "why is app not working",
"container health check", "port not reachable", "audit containers", "podman status",
or when any container/app is misbehaving.
@@ -12,46 +13,123 @@ allowed-tools: Bash Read Glob Grep
# Podman Doctor — Container Infrastructure Diagnostics
Systematic diagnostic for Archipelago's Podman container stack. Catches port conflicts, network misconfigurations, health failures, missing restart policies, and config drift across all layers.
Systematic diagnostic for Archipelago's **rootless Podman** container stack. Catches port conflicts, network misconfigurations, health failures, missing restart policies, UID mapping issues, and config drift across all layers.
**SSH command**: `ssh -i ~/.ssh/archipelago-deploy archipelago@192.168.1.228`
> **ROOTLESS PODMAN**: Archipelago runs Podman as the `archipelago` user (UID 1000), NOT root.
> Never use `sudo podman` — use plain `podman` after SSH'ing in as the `archipelago` user.
> Container UIDs are mapped via subuid: container UID N → host UID (100000 + N).
If $ARGUMENTS is provided, focus diagnosis on that specific app/container. Otherwise run full audit.
## Workflow
### Step 1: Gather Runtime State
Run these on the server:
Run these on the server (as `archipelago` user — NO sudo):
```bash
# All containers with status, ports, networks
sudo podman ps -a --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}\t{{.Networks}}"
podman ps -a --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}\t{{.Networks}}"
# Check for port conflicts on known ports
sudo ss -tlnp | grep -E ":(80|443|3000|4080|5678|8080|8081|8082|8083|8085|8096|8123|8173|8174|8175|8240|8332|8333|8334|8888|9735|10009|11434|23000|50001)\b"
ss -tlnp | grep -E ":(80|443|3000|4080|5678|8080|8081|8082|8083|8085|8096|8123|8173|8174|8175|8240|8332|8333|8334|8888|9735|10009|11434|23000|50001)\b"
```
### Step 2: Check Restart Policies
### Step 2: Rootless Podman Health Check
Rootless Podman has specific requirements that must be verified:
```bash
# Verify running as archipelago user (NOT root)
whoami # Must be "archipelago"
id # Must show uid=1000(archipelago)
# Check XDG_RUNTIME_DIR is set (required for rootless podman socket)
echo "XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR" # Must be /run/user/1000
# Verify subuid/subgid mapping exists
grep archipelago /etc/subuid # Must show: archipelago:100000:65536
grep archipelago /etc/subgid # Must show: archipelago:100000:65536
# Verify user lingering is enabled (keeps user services after logout)
ls /var/lib/systemd/linger/ | grep archipelago # Must exist
# Check podman storage is accessible
podman info --format "{{.Store.GraphRoot}}" # ~/.local/share/containers/storage
ls -la ~/.local/share/containers/storage/ 2>/dev/null || echo "ERROR: Storage not accessible"
# Check podman socket
ls -la /run/user/1000/podman/ 2>/dev/null || echo "WARNING: No podman socket directory"
```
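The `grep archipelago /etc/subuid` check above matches any line that merely contains the string. A stricter variant compares the first colon-separated field exactly and prints the range so it can be inspected. This is a sketch only; `subuid_range` is a hypothetical helper, and the optional file argument exists so it can be tried against a copy of the file.

```bash
#!/usr/bin/env bash
# subuid_range (hypothetical helper): print "<start> <count>" for a user
# from a subuid-format file (user:start:count). Defaults to /etc/subuid.
subuid_range() {
  local user="$1" file="${2:-/etc/subuid}"
  awk -F: -v u="$user" '
    $1 == u { print $2, $3; found = 1; exit }
    END { if (!found) exit 1 }
  ' "$file"
}

# Assumed usage:
#   subuid_range archipelago || echo "ERROR: no subuid entry for archipelago"
#   subuid_range archipelago /etc/subgid
```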
### Step 3: Check Restart Policies
Every container MUST have `--restart unless-stopped`. A missing restart policy is the #1 cause of downtime after reboots.
```bash
for c in $(sudo podman ps -a --format "{{.Names}}"); do
for c in $(podman ps -a --format "{{.Names}}"); do
echo -n "$c: "
sudo podman inspect "$c" --format "{{.HostConfig.RestartPolicy.Name}}"
podman inspect "$c" --format "{{.HostConfig.RestartPolicy.Name}}"
done
```
**Red flag**: `no` or empty = container won't survive reboot.
### Step 3: Verify Port Mapping Consistency
### Step 4: Volume Ownership Audit (Rootless UID Mapping)
Rootless Podman maps container UIDs via subuid. Volume directories must be owned by the MAPPED UID, not the container UID. Formula: `host_uid = 100000 + container_uid`
```bash
echo "=== Volume Ownership Check ==="
# Default containers (run as root inside = UID 0 → host UID 100000)
for dir in lnd fedimint homeassistant jellyfin vaultwarden photoprism ollama filebrowser electrumx btcpay immich; do
if [ -d "/var/lib/archipelago/$dir" ]; then
owner=$(stat -c '%u:%g' "/var/lib/archipelago/$dir" 2>/dev/null)
if [ "$owner" != "100000:100000" ]; then
echo "WRONG: /var/lib/archipelago/$dir owned by $owner (should be 100000:100000)"
else
echo " OK: $dir → $owner"
fi
fi
done
# Bitcoin Knots (container UID 101 → host UID 100101)
if [ -d "/var/lib/archipelago/bitcoin" ]; then
owner=$(stat -c '%u:%g' "/var/lib/archipelago/bitcoin")
[ "$owner" != "100101:100101" ] && echo "WRONG: bitcoin owned by $owner (should be 100101:100101)" || echo " OK: bitcoin → $owner"
fi
# PostgreSQL (container UID 70 → host UID 100070)
for dir in /var/lib/archipelago/*-db /var/lib/archipelago/postgres-*; do
if [ -d "$dir" ]; then
owner=$(stat -c '%u:%g' "$dir")
[ "$owner" != "100070:100070" ] && echo "WRONG: $dir owned by $owner (should be 100070:100070)" || echo " OK: $(basename "$dir") → $owner"
fi
done
# Grafana (container UID 472 → host UID 100472)
if [ -d "/var/lib/archipelago/grafana" ]; then
owner=$(stat -c '%u:%g' "/var/lib/archipelago/grafana")
[ "$owner" != "100472:100472" ] && echo "WRONG: grafana owned by $owner (should be 100472:100472)" || echo " OK: grafana → $owner"
fi
# MariaDB/MySQL (container UID 999 → host UID 100999)
if [ -d "/var/lib/archipelago/mysql-mempool" ]; then
owner=$(stat -c '%u:%g' "/var/lib/archipelago/mysql-mempool")
[ "$owner" != "100999:100999" ] && echo "WRONG: mysql-mempool owned by $owner (should be 100999:100999)" || echo " OK: mysql-mempool → $owner"
fi
```
### Step 5: Verify Port Mapping Consistency
Cross-reference these 4 layers — mismatches between ANY two cause "app not loading" bugs:
**Layer 1 — Backend Config (Rust)**: Read `core/archipelago/src/api/rpc/package.rs`, look at `get_app_config()` port mappings.
**Layer 2 — Podman Runtime**: `sudo podman ps --format "{{.Names}}: {{.Ports}}"`
**Layer 2 — Podman Runtime**: `podman ps --format "{{.Names}}: {{.Ports}}"`
**Layer 3 — Nginx Proxy**: Read these for `/app/{id}/` location blocks:
- `image-recipe/configs/nginx-archipelago.conf` (HTTP)
@@ -66,77 +144,114 @@ Cross-reference these 4 layers — mismatches between ANY two cause "app not loa
| Works on port but not /app/ path | Missing nginx location block |
| Frontend can't find app | PORT_TO_APP_ID missing in appLauncher.ts |
### Step 4: Network Connectivity Audit
### Step 6: Network Connectivity Audit
```bash
# Networks and their containers
sudo podman network ls
sudo podman network inspect archy-net 2>/dev/null || echo "WARNING: archy-net missing!"
podman network ls
podman network inspect archy-net 2>/dev/null || echo "WARNING: archy-net missing!"
# Check container subnet (rootless uses 10.89.x.x, NOT 10.88.x.x)
podman network inspect archy-net --format "{{range .Subnets}}{{.Subnet}}{{end}}" 2>/dev/null
```
**Must be on archy-net**: bitcoin-knots, lnd, electrs, mempool, btcpay-server, nbxplorer, fedimint, fedimint-gateway, nostr-rs-relay, indeedhub, ollama, open-webui
**Must be on archy-net**: bitcoin-knots, lnd, electrs/electrumx, mempool, btcpay-server, nbxplorer, fedimint, fedimint-gateway, nostr-rs-relay, indeedhub, ollama, open-webui
**Must NOT be on archy-net**: grafana, nextcloud, filebrowser, vaultwarden, bitcoin-ui, lnd-ui, tailscale (host network)
### Step 5: Health Check Status
### Step 7: UFW Forward Policy Check
Rootless Podman requires `DEFAULT_FORWARD_POLICY="ACCEPT"` in UFW, otherwise container ports are unreachable from LAN.
```bash
grep DEFAULT_FORWARD_POLICY /etc/default/ufw
# Must be "ACCEPT", NOT "DROP"
# If DROP: containers work locally but NOT from other machines on the network
```
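A plain `grep` also matches commented-out lines in `/etc/default/ufw`. The sketch below extracts only the active value so it can be compared in a script. `forward_policy` is a hypothetical helper name, and the file argument is an assumption added for testing against a copy.

```bash
#!/usr/bin/env bash
# forward_policy (hypothetical helper): print the active
# DEFAULT_FORWARD_POLICY value from a ufw defaults file.
# Lines starting with "#" are ignored because the pattern is anchored.
forward_policy() {
  local file="${1:-/etc/default/ufw}"
  sed -n 's/^DEFAULT_FORWARD_POLICY="\{0,1\}\([A-Za-z]*\)"\{0,1\}.*/\1/p' "$file" | tail -n 1
}

# Assumed usage:
#   [ "$(forward_policy)" = "ACCEPT" ] || echo "WARNING: forward policy is not ACCEPT"
```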
### Step 8: Systemd Service Sandbox Check
The `archipelago.service` must have specific settings relaxed for rootless Podman:
```bash
# Check critical settings
systemctl cat archipelago.service | grep -E "ProtectHome|PrivateTmp|RestrictNamespaces|ReadWritePaths|XDG_RUNTIME_DIR"
```
**Required settings for rootless Podman**:
- `ProtectHome=no` — podman stores images in `~/.local/share/containers/`
- `PrivateTmp=no` or disabled — podman runtime uses `/tmp/podman-run-1000/`
- `RestrictNamespaces=` must NOT be set — rootless podman needs user namespaces
- `ReadWritePaths=` must include `/var/lib/archipelago /run/user /tmp`
- `Environment=XDG_RUNTIME_DIR=/run/user/1000`
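The requirements listed above can be turned into a quick filter over the unit text. This is a rough sketch, not the project's checker: `sandbox_conflicts` is a hypothetical helper, and it deliberately treats any non-empty `RestrictNamespaces=` or `SystemCallFilter=` value as a conflict, which is stricter than systemd itself requires.

```bash
#!/usr/bin/env bash
# sandbox_conflicts (hypothetical helper): read unit file text on stdin and
# print directives that conflict with the rootless Podman requirements above.
sandbox_conflicts() {
  awk -F= '
    # Any non-empty value for these directives is reported.
    /^(RestrictNamespaces|SystemCallFilter)=./ { print; next }
    # These must be explicitly "no"; anything else is reported.
    /^(ProtectHome|PrivateTmp)=/ && $2 != "no" { print }
  '
}

# Assumed usage:
#   systemctl cat archipelago.service | sandbox_conflicts
```

An empty result means none of the four directives were found in a conflicting state; it does not confirm that `ReadWritePaths=` or `XDG_RUNTIME_DIR` are set correctly.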
### Step 9: Health Check Status
```bash
# Containers with health checks — are they passing?
for c in $(sudo podman ps --format "{{.Names}}"); do
health=$(sudo podman inspect "$c" --format "{{.State.Health.Status}}" 2>/dev/null)
for c in $(podman ps --format "{{.Names}}"); do
health=$(podman inspect "$c" --format "{{.State.Health.Status}}" 2>/dev/null)
if [ -n "$health" ] && [ "$health" != "<no value>" ]; then
echo "$c: $health"
fi
done
# Containers WITHOUT health checks (gap in monitoring)
for c in $(sudo podman ps --format "{{.Names}}"); do
hc=$(sudo podman inspect "$c" --format "{{.Config.Healthcheck}}" 2>/dev/null)
for c in $(podman ps --format "{{.Names}}"); do
hc=$(podman inspect "$c" --format "{{.Config.Healthcheck}}" 2>/dev/null)
if [ "$hc" = "<nil>" ] || [ -z "$hc" ]; then
echo "NO HEALTHCHECK: $c"
fi
done
```
### Step 6: Resource & Failure Analysis
### Step 10: Resource & Failure Analysis
```bash
# Resource usage
sudo podman stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.MemPerc}}"
podman stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.MemPerc}}"
# Recent deaths (last 24h)
sudo podman events --filter event=died --since 24h 2>/dev/null | tail -20
podman events --filter event=died --since 24h 2>/dev/null | tail -20
# OOM kills
sudo podman ps -a --format "{{.Names}}" | while read c; do
oom=$(sudo podman inspect "$c" --format "{{.State.OOMKilled}}" 2>/dev/null)
podman ps -a --format "{{.Names}}" | while read c; do
oom=$(podman inspect "$c" --format "{{.State.OOMKilled}}" 2>/dev/null)
[ "$oom" = "true" ] && echo "OOM KILLED: $c"
done
# Non-zero exits
sudo podman ps -a --filter status=exited --format "{{.Names}}\t{{.Status}}"
podman ps -a --filter status=exited --format "{{.Names}}\t{{.Status}}"
```
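When reading the exit statuses listed by the last command, codes above 128 conventionally mean the process was terminated by a signal, with the signal number equal to the code minus 128. Exit code 137 therefore corresponds to signal 9 (SIGKILL), which is what an OOM kill delivers. A small sketch, with `exit_signal` as a hypothetical helper name:

```bash
#!/usr/bin/env bash
# exit_signal (hypothetical helper): for exit codes above 128, print the
# signal number (code - 128); return non-zero for ordinary exit codes.
exit_signal() {
  local code="$1"
  if [ "$code" -gt 128 ]; then
    echo $((code - 128))
  else
    return 1
  fi
}

# Assumed usage:
#   code=$(podman inspect CONTAINER_NAME --format "{{.State.ExitCode}}")
#   sig=$(exit_signal "$code") && echo "killed by signal $sig"
```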
### Step 7: Systemd Integration
### Step 11: Systemd Integration
```bash
systemctl is-active archipelago nginx
systemctl list-units --type=service | grep -i podman
systemctl --user list-units --type=service 2>/dev/null | grep -i podman
systemctl list-timers --all | grep -i -E "podman|container|archipelago"
```
### Step 8: Generate Report
### Step 12: Generate Report
Produce a structured report:
```
## Container Diagnostic Report
### Rootless Podman Status
- User: archipelago (UID 1000)
- Subuid mapping: [OK/MISSING]
- XDG_RUNTIME_DIR: [OK/MISSING]
- User linger: [enabled/disabled]
- UFW forward policy: [ACCEPT/DROP]
### Summary
- Total containers: X running, Y stopped, Z unhealthy
- Port conflicts: [list or "none"]
- Missing restart policies: [list or "none"]
- Network issues: [list or "none"]
- UID mapping issues: [list or "none"]
- Health check gaps: [list]
### Critical Issues (fix immediately)
@@ -154,3 +269,7 @@ After diagnosis, suggest running `/podman-fix` for any issues found.
## Port Reference
See `references/port-map.md` for the canonical port assignment table across all 4 layers.
## UID Mapping Reference
See `references/uid-mapping.md` for the complete rootless UID mapping table.


@@ -1,15 +1,31 @@
# Common Podman Failure Patterns
## Rootless Podman Specific Failures
| Error | Cause | Fix |
|-------|-------|-----|
| `ERRO[0000] cannot find UID/GID for user` | subuid/subgid not configured | Add `archipelago:100000:65536` to `/etc/subuid` and `/etc/subgid` |
| `Error: unshare: operation not permitted` | Systemd `RestrictNamespaces` blocks user namespaces | Remove `RestrictNamespaces=` from `archipelago.service` |
| `Error: could not get runtime: creating runtime` | XDG_RUNTIME_DIR not set or /run/user/1000 missing | Set `Environment=XDG_RUNTIME_DIR=/run/user/1000` in service, ensure `loginctl enable-linger archipelago` |
| `permission denied` on volume mount | Wrong UID ownership — must use mapped UIDs | `sudo chown -R 100000:100000 /var/lib/archipelago/APP` (see UID mapping table) |
| `ERRO[0000] rootless containers not supported` | Podman not configured for rootless | Run `podman system migrate`, check `/etc/subuid` |
| `Error: creating container storage: layer not known` | Corrupted rootless storage | `podman system reset` (destroys all containers — last resort) |
| `Error: stat /tmp/podman-run-1000/...: no such file` | PrivateTmp=yes in systemd isolates /tmp | Set `PrivateTmp=no` in `archipelago.service` |
| Container ports unreachable from LAN | UFW DEFAULT_FORWARD_POLICY="DROP" | Change to "ACCEPT" in `/etc/default/ufw`, then `sudo ufw reload` |
| `Error: error creating network namespace` | Systemd `SystemCallFilter` blocks clone/unshare | Remove `SystemCallFilter=` from `archipelago.service` |
| Containers lose network after service restart | podman runtime dir in /tmp cleaned | Ensure `PrivateTmp=no` so /tmp/podman-run-1000/ persists |
## Container Won't Start
| Error | Cause | Fix |
|-------|-------|-----|
| `exec format error` | Binary built on wrong arch | Rebuild on the Linux server |
| `address already in use` | Port conflict | `ss -tlnp \| grep :PORT` to find offender |
| `permission denied` | Missing capability or read-only root | Check `get_app_capabilities()`, add tmpfs |
| `permission denied` | Missing capability, wrong UID ownership, or read-only root | Check capabilities, check volume ownership with mapped UID, add tmpfs |
| `OCI runtime error` | Corrupt container state | `podman rm -f NAME && recreate` |
| `image not known` | Image not pulled | `podman pull IMAGE:TAG` |
| `no such network` | Network missing | `podman network create archy-net` |
| `Error: netavark: ...subnet overlap` | Network CIDR conflict | `podman network rm archy-net && podman network create archy-net` |
## Container Starts But App Unreachable
@@ -20,6 +36,7 @@
| Port mapped but refused | Container logs | App crashing internally — check logs |
| Works sometimes | Resources | Check OOM kills, CPU, disk space |
| 502 Bad Gateway | Nginx→Container | Wrong port in proxy_pass or container restarted |
| Works locally but not from LAN | UFW forward policy | Set `DEFAULT_FORWARD_POLICY="ACCEPT"` in `/etc/default/ufw` |
## Container Keeps Dying
@@ -29,6 +46,8 @@
| Dies after minutes | OOM killed | Increase `--memory` limit |
| Dies when dep restarts | No restart policy | Add `--restart unless-stopped` |
| Crash loop | Repeated crash | Fix root cause, don't just restart |
| Exit code 127 | Missing binary in container | Wrong image tag or corrupted image — re-pull |
| Exit code 137 | Killed by OOM or signal | Check `dmesg` for OOM kill, check `podman inspect` for OOMKilled |
## Network Issues
@@ -37,6 +56,20 @@
| Can't resolve container names | Not on archy-net | Recreate with `--network=archy-net` |
| Can't reach internet | DNS missing | Add `--dns 1.1.1.1` |
| Container-to-container timeout | Different networks | Put both on same network |
| Bitcoin RPC refused from container | rpcallowip wrong subnet | Use `rpcallowip=0.0.0.0/0` (safe: port mapped, not exposed) |
| Old containers can't find new network | Subnet changed (rootful→rootless) | Recreate containers on new archy-net (rootless uses 10.89.x.x) |
## Volume Permission Patterns (Rootless UID Mapping)
Formula: **host_uid = 100000 + container_uid**
| Container UID | Host UID | Apps | Data Directory |
|---|---|---|---|
| 0 (root) | 100000 | lnd, fedimint, homeassistant, jellyfin, vaultwarden, photoprism, ollama, filebrowser, electrumx, btcpay, immich | `/var/lib/archipelago/{app}` |
| 70 | 100070 | postgres (btcpay-db, immich-db, penpot-postgres) | `/var/lib/archipelago/postgres-*` |
| 101 | 100101 | bitcoin-knots | `/var/lib/archipelago/bitcoin` |
| 472 | 100472 | grafana | `/var/lib/archipelago/grafana` |
| 999 | 100999 | MariaDB (mysql-mempool) | `/var/lib/archipelago/mysql-mempool` |
## Capability Reference
@@ -47,9 +80,23 @@
| DAC_OVERRIDE | nextcloud, homeassistant, btcpay | Can't access cross-UID files |
| FOWNER | bitcoin-knots, lnd, fedimint | Can't modify data dir perms |
| NET_BIND_SERVICE | nginx-proxy-manager, vaultwarden | Can't bind ports <1024 |
| NET_ADMIN + NET_RAW | tailscale | Can't create TUN device or manage routes |
## Read-Only Safe Apps
Only these 8 apps can run with `--read-only`: searxng, grafana, filebrowser, electrs, nostr-rs-relay, ollama, indeedhub
Only these apps can run with `--read-only` + tmpfs: searxng, grafana, filebrowser, electrumx, mempool-electrs, electrs, nostr-rs-relay, ollama, indeedhub
All others need writable root or will fail silently.
## Systemd Sandbox Requirements for Rootless Podman
These systemd service settings MUST be configured for rootless Podman to work:
| Setting | Required Value | Why |
|---------|---------------|-----|
| `ProtectHome=` | `no` | Podman stores images in `~/.local/share/containers/` |
| `PrivateTmp=` | `no` | Podman runtime lives in `/tmp/podman-run-1000/` |
| `RestrictNamespaces=` | NOT SET | Rootless podman creates user namespaces |
| `SystemCallFilter=` | NOT SET | Rootless podman needs clone/unshare syscalls |
| `ReadWritePaths=` | Include `/var/lib/archipelago /run/user /tmp /etc/containers /var/lib/containers /run/containers` | Volume data + podman runtime paths |
| `Environment=` | `XDG_RUNTIME_DIR=/run/user/1000` | Podman socket location |


@@ -0,0 +1,93 @@
# Rootless Podman UID Mapping Reference
## How Rootless UID Mapping Works
When Podman runs as the `archipelago` user (UID 1000), container processes don't run as their "apparent" UID on the host. Instead, Linux user namespaces remap UIDs.
**Mapping formula**: `host_uid = 100000 + container_uid`
This is configured in `/etc/subuid` and `/etc/subgid`:
```
archipelago:100000:65536
```
This means:
- Container UID 0 (root inside container) → Host UID 100000 (unprivileged on host)
- Container UID 70 (postgres) → Host UID 100070
- Container UID 101 (bitcoin) → Host UID 100101
- etc.
## Why This Matters
Volume directories (bind mounts) on the host must be owned by the **mapped** UID, not the container UID. If Bitcoin runs as UID 101 inside its container, the host directory must be owned by UID 100101.
If ownership is wrong, the container gets `permission denied` when trying to read/write its data.
## Complete UID Mapping Table
| Container UID | Host UID | Containers | Fix Command |
|---|---|---|---|
| 0 (root) | 100000 | lnd, fedimint, fedimint-gateway, homeassistant, jellyfin, vaultwarden, photoprism, ollama, filebrowser, electrumx, btcpay-server, nbxplorer, immich, nostr-rs-relay, strfry, nextcloud, searxng, onlyoffice, tailscale, uptime-kuma | `sudo chown -R 100000:100000 /var/lib/archipelago/{app}` |
| 70 | 100070 | postgres (btcpay-db, immich-db, penpot-postgres) | `sudo chown -R 100070:100070 /var/lib/archipelago/postgres-*` |
| 101 | 100101 | bitcoin-knots, bitcoin-core | `sudo chown -R 100101:100101 /var/lib/archipelago/bitcoin` |
| 472 | 100472 | grafana | `sudo chown -R 100472:100472 /var/lib/archipelago/grafana` |
| 999 | 100999 | MariaDB (mysql-mempool) | `sudo chown -R 100999:100999 /var/lib/archipelago/mysql-mempool` |
## How to Find a Container's UID
If you encounter a new container with permission issues:
```bash
# Check what user the container runs as
podman inspect CONTAINER_NAME --format "{{.Config.User}}"
# If empty, it runs as root (UID 0) → host UID 100000
# If it shows a username, find the UID inside the image
podman run --rm IMAGE_NAME id
# Then calculate: host_uid = 100000 + container_uid
```
## Fix Script
Run this after any fresh install, migration, or when containers have permission errors:
```bash
#!/bin/bash
# Fix all rootless podman volume ownership
# UID 0 → 100000 (most containers)
for dir in lnd fedimint fedimint-gateway homeassistant jellyfin vaultwarden photoprism \
ollama filebrowser electrumx btcpay nbxplorer immich nostr-rs-relay nextcloud \
searxng onlyoffice uptime-kuma; do
[ -d "/var/lib/archipelago/$dir" ] && sudo chown -R 100000:100000 "/var/lib/archipelago/$dir"
done
# UID 101 → 100101 (Bitcoin)
[ -d "/var/lib/archipelago/bitcoin" ] && sudo chown -R 100101:100101 /var/lib/archipelago/bitcoin
# UID 70 → 100070 (PostgreSQL)
for dir in /var/lib/archipelago/postgres-* /var/lib/archipelago/btcpay-db /var/lib/archipelago/immich-db; do
[ -d "$dir" ] && sudo chown -R 100070:100070 "$dir"
done
# UID 999 → 100999 (MariaDB)
[ -d "/var/lib/archipelago/mysql-mempool" ] && sudo chown -R 100999:100999 /var/lib/archipelago/mysql-mempool
# UID 472 → 100472 (Grafana)
[ -d "/var/lib/archipelago/grafana" ] && sudo chown -R 100472:100472 /var/lib/archipelago/grafana
```
## Rootful vs Rootless Comparison
| Aspect | Rootful (old) | Rootless (current) |
|--------|---------------|-------------------|
| Podman command | `sudo podman` | `podman` (as archipelago user) |
| Container storage | `/var/lib/containers/storage` | `~/.local/share/containers/storage` |
| Container subnet | `10.88.0.0/16` | `10.89.0.0/16` |
| Volume ownership | Container UID directly | Mapped UID (100000 + container_uid) |
| Requires root? | Yes | No (except fixing volume ownership) |
| XDG_RUNTIME_DIR | Not needed | Required: `/run/user/1000` |
| User lingering | Not needed | Required: `loginctl enable-linger` |
| Systemd restrictions | All can be enabled | Must disable: RestrictNamespaces, SystemCallFilter |


@@ -2,19 +2,24 @@
name: podman-fix
description: >
Fix Podman container issues on Archipelago — restart failed containers, repair port bindings,
fix network connectivity, add missing restart policies, and resolve config drift.
fix network connectivity, add missing restart policies, fix rootless UID mapping, and resolve
config drift. Handles rootless Podman (user: archipelago, UID 1000, subuid 100000:65536).
Use when asked to "fix container", "restart app", "fix port mapping", "container not working",
"app won't start", "fix podman", "repair container", "container down", or after /podman-doctor
identifies issues to fix.
"app won't start", "fix podman", "repair container", "container down", "permission denied",
or after /podman-doctor identifies issues to fix.
allowed-tools: Bash Read Edit Write Glob Grep
---
# Podman Fix — Container Remediation
Targeted fix workflow for Podman container issues on Archipelago. Given a specific problem (from /podman-doctor or user report), diagnose the root cause and fix it.
Targeted fix workflow for **rootless Podman** container issues on Archipelago. Given a specific problem (from /podman-doctor or user report), diagnose the root cause and fix it.
**SSH command**: `ssh -i ~/.ssh/archipelago-deploy archipelago@192.168.1.228`
> **ROOTLESS PODMAN**: All `podman` commands run as the `archipelago` user — NO sudo.
> Only use `sudo` for: chown on volume directories, UFW changes, systemd service edits, nginx reload.
> Container UIDs are mapped via subuid: container UID N → host UID (100000 + N).
If $ARGUMENTS is provided, fix that specific app/issue. Otherwise ask what needs fixing.
## Fix Procedures
@@ -23,21 +28,22 @@ If $ARGUMENTS is provided, fix that specific app/issue. Otherwise ask what needs
```bash
# Check why it stopped
sudo podman logs --tail 50 CONTAINER_NAME
sudo podman inspect CONTAINER_NAME --format "{{.State.ExitCode}} {{.State.Error}}"
podman logs --tail 50 CONTAINER_NAME
podman inspect CONTAINER_NAME --format "{{.State.ExitCode}} {{.State.Error}}"
# If clean exit or crash — just restart
sudo podman start CONTAINER_NAME
podman start CONTAINER_NAME
# If corrupt state — remove and recreate
sudo podman rm -f CONTAINER_NAME
podman rm -f CONTAINER_NAME
# Then recreate using the install flow (trigger from UI or re-run creation command)
```
**If container keeps crashing**: check logs for the actual error. Common causes:
**If container keeps crashing**, check logs for the actual error. Common causes:
- Missing config file → check if volume mount has the config
- Wrong permissions → `chown -R` the data directory
- Wrong permissions → fix UID mapping (see Fix 8 below)
- Dependency not ready → start dependency first, wait, then start this container
- Exit code 127 → missing binary in container image, re-pull the image
### Fix 2: Missing Restart Policy
@@ -45,14 +51,14 @@ The most common uptime killer. Fix for ALL containers at once:
```bash
# Fix a single container
sudo podman update --restart unless-stopped CONTAINER_NAME
podman update --restart unless-stopped CONTAINER_NAME
# Fix ALL containers that have no restart policy
for c in $(sudo podman ps -a --format "{{.Names}}"); do
policy=$(sudo podman inspect "$c" --format "{{.HostConfig.RestartPolicy.Name}}")
for c in $(podman ps -a --format "{{.Names}}"); do
policy=$(podman inspect "$c" --format "{{.HostConfig.RestartPolicy.Name}}")
if [ "$policy" = "no" ] || [ -z "$policy" ]; then
echo "Fixing restart policy for: $c"
sudo podman update --restart unless-stopped "$c"
podman update --restart unless-stopped "$c"
fi
done
```
@@ -66,23 +72,24 @@ done
#### Port conflict (address already in use)
```bash
# Find what's using the port
sudo ss -tlnp | grep :PORT_NUMBER
ss -tlnp | grep :PORT_NUMBER
# If it's another container, either change one's port or stop the conflicting one
sudo podman stop CONFLICTING_CONTAINER
podman stop CONFLICTING_CONTAINER
# If it's a host process
sudo kill PID # or stop the service
# If it's a host process (e.g., system tor vs container tor)
sudo systemctl stop tor # Stop system service if container needs the port
sudo systemctl disable tor
```
#### Port not mapped (container running but port unreachable)
```bash
# Check current port mappings
sudo podman port CONTAINER_NAME
podman port CONTAINER_NAME
# Can't add ports to running container — must recreate
sudo podman stop CONTAINER_NAME
sudo podman rm CONTAINER_NAME
podman stop CONTAINER_NAME
podman rm CONTAINER_NAME
# Recreate with correct -p flags (use the Rust install flow or manual podman run)
```
@@ -124,35 +131,51 @@ Edit `neode-ui/src/stores/appLauncher.ts`:
#### Container not on archy-net (can't resolve other containers)
```bash
# Connect to archy-net without recreating
sudo podman network connect archy-net CONTAINER_NAME
podman network connect archy-net CONTAINER_NAME
# Verify
sudo podman inspect CONTAINER_NAME --format "{{.NetworkSettings.Networks}}"
podman inspect CONTAINER_NAME --format "{{.NetworkSettings.Networks}}"
```
#### archy-net doesn't exist
```bash
sudo podman network create archy-net
podman network create archy-net
# Then reconnect all containers that need it
```
#### DNS not working inside container
```bash
# Test DNS from inside container
sudo podman exec CONTAINER_NAME nslookup bitcoin-knots 2>/dev/null || \
sudo podman exec CONTAINER_NAME ping -c1 bitcoin-knots
podman exec CONTAINER_NAME nslookup bitcoin-knots 2>/dev/null || \
podman exec CONTAINER_NAME ping -c1 bitcoin-knots
# If DNS fails, check the container's resolv.conf
podman exec CONTAINER_NAME cat /etc/resolv.conf
# If DNS fails, recreate container with explicit DNS
# Add --dns 1.1.1.1 to the podman run command
```
#### Container subnet changed (rootful → rootless migration)
```bash
# Old rootful subnet: 10.88.0.0/16
# New rootless subnet: 10.89.0.0/16
# Bitcoin RPC rpcallowip must be updated if using subnet-specific allowlist
# Check current archy-net subnet
podman network inspect archy-net --format "{{range .Subnets}}{{.Subnet}}{{end}}"
# If Bitcoin RPC refuses connections from containers:
# Update bitcoin.conf rpcallowip to the new subnet (10.89.0.0/16), or to 0.0.0.0/0
# (0.0.0.0/0 is safe only while RPC is reachable solely through the container port mapping)
```
### Fix 5: Health Check Issues
#### Add missing health check to running container
Can't add to running container — must recreate with health check flags:
```bash
# Example for a web app
sudo podman run ... \
podman run ... \
--health-cmd "curl -f http://localhost:PORT/health || exit 1" \
--health-interval 30s \
--health-timeout 5s \
@@ -164,10 +187,10 @@ sudo podman run ... \
#### Fix unhealthy container
```bash
# See what the health check is actually running
sudo podman inspect CONTAINER_NAME --format "{{.Config.Healthcheck.Test}}"
podman inspect CONTAINER_NAME --format "{{.Config.Healthcheck.Test}}"
# Run the health check manually to see the error
sudo podman exec CONTAINER_NAME HEALTH_CHECK_COMMAND
podman exec CONTAINER_NAME HEALTH_CHECK_COMMAND
# Common fixes:
# - curl not installed in container → use wget or nc instead
@@ -179,13 +202,10 @@ sudo podman exec CONTAINER_NAME HEALTH_CHECK_COMMAND
```bash
# Check what capabilities container has
sudo podman inspect CONTAINER_NAME --format "{{.HostConfig.CapAdd}}"
podman inspect CONTAINER_NAME --format "{{.HostConfig.CapAdd}}"
# If missing required caps, must recreate with correct --cap-add flags
# Refer to the capability reference in /podman-doctor references
# Fix data directory permissions
sudo chown -R 1000:1000 /var/lib/archipelago/APP_NAME/
```
### Fix 7: Full Config Consistency Fix
@@ -199,12 +219,108 @@ When port map is inconsistent across layers, fix ALL layers:
5. **Deploy**: `./scripts/deploy-to-target.sh --live`
6. **Verify**: `curl -I http://192.168.1.228/app/APP_ID/`
### Fix 8: Rootless UID Mapping (Permission Denied on Volumes)
This is the #1 rootless-specific issue. Container UIDs are remapped by user namespaces.
**Formula**: `host_uid = 100000 + container_uid - 1` for container UIDs 1 and above. Under the default rootless mapping, container UID 0 is the `archipelago` user itself (host UID 1000).
```bash
# Fix UID 0 containers (most apps: root inside the container is the archipelago user, UID 1000, on the host)
sudo chown -R 1000:1000 /var/lib/archipelago/APP_NAME
# Fix Bitcoin (container UID 101 → host UID 100100)
sudo chown -R 100100:100100 /var/lib/archipelago/bitcoin
# Fix PostgreSQL (container UID 70 → host UID 100069)
sudo chown -R 100069:100069 /var/lib/archipelago/postgres-APP_NAME
# Fix Grafana (container UID 472 → host UID 100471)
sudo chown -R 100471:100471 /var/lib/archipelago/grafana
# Fix MariaDB (container UID 999 → host UID 100998)
sudo chown -R 100998:100998 /var/lib/archipelago/mysql-mempool
```
**How to find the right UID for a new container:**
```bash
# Check what user the container image runs as
podman inspect IMAGE_NAME --format "{{.Config.User}}"
# If empty = root (UID 0) → host UID 1000 (the archipelago user, under the default rootless mapping)
# If number → host UID = 100000 + that number - 1
# If username → run: podman run --rm IMAGE_NAME id
```
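Rather than relying on a fixed offset, the mapping can be read from the user namespace itself. The following is a minimal sketch, assuming the table printed by `podman unshare cat /proc/self/uid_map` (three columns: start inside the namespace, start on the host, length). `map_container_uid` is a hypothetical helper name, not part of Podman.

```bash
# map_container_uid CONTAINER_UID
# Reads a uid_map table on stdin (inside-start outside-start length)
# and prints the host UID that CONTAINER_UID maps to.
map_container_uid() {
  local target=$1 inside outside length
  while read -r inside outside length; do
    # Skip blank or malformed rows
    [ -n "$length" ] || continue
    if [ "$target" -ge "$inside" ] && [ "$target" -lt $((inside + length)) ]; then
      echo $((outside + target - inside))
      return 0
    fi
  done
  echo "UID $target is not mapped" >&2
  return 1
}
```

Typical use would be `podman unshare cat /proc/self/uid_map | map_container_uid 101`, run as the `archipelago` user. With a map of `0 1000 1` and `1 100000 65536`, container UID 101 resolves to host UID 100100, which matches the corrected mapping described in the commit message (container UID N → host UID 100000 + N - 1).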
After fixing ownership, restart the container:
```bash
podman restart CONTAINER_NAME
```
### Fix 9: UFW Forward Policy (LAN Access Broken)
If containers work locally but not from other machines on the network:
```bash
# Check current policy
grep DEFAULT_FORWARD_POLICY /etc/default/ufw
# Fix: change DROP to ACCEPT
sudo sed -i 's/DEFAULT_FORWARD_POLICY="DROP"/DEFAULT_FORWARD_POLICY="ACCEPT"/' /etc/default/ufw
sudo ufw reload
```
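The same change can be made idempotent, so that repeated runs (for example from a deploy script) do nothing once the policy is already `ACCEPT`. This is a sketch only; `set_forward_policy_accept` is a hypothetical helper, and it assumes the policy line has the exact `DEFAULT_FORWARD_POLICY="..."` form shown above.

```bash
# set_forward_policy_accept [FILE]
# Switches DEFAULT_FORWARD_POLICY from DROP to ACCEPT, keeping a .bak copy.
set_forward_policy_accept() {
  local file=${1:-/etc/default/ufw}
  if grep -q '^DEFAULT_FORWARD_POLICY="ACCEPT"' "$file"; then
    echo "already ACCEPT"
    return 0
  fi
  # Edit in place; the original is preserved as FILE.bak
  sed -i.bak 's/^DEFAULT_FORWARD_POLICY="DROP"/DEFAULT_FORWARD_POLICY="ACCEPT"/' "$file"
  if grep -q '^DEFAULT_FORWARD_POLICY="ACCEPT"' "$file"; then
    echo "changed"
  else
    echo "policy line not found or has an unexpected value" >&2
    return 1
  fi
}
```

The function has to run as root to edit `/etc/default/ufw`, and `sudo ufw reload` is still needed after a reported `changed`.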
### Fix 10: Systemd Sandbox Too Restrictive
If the Rust backend can't scan/manage containers after a systemd update:
```bash
# Check what's blocked
sudo journalctl -u archipelago --since "10 min ago" | grep -i "denied\|permission\|namespace\|syscall"
# The archipelago.service MUST have these for rootless podman:
# ProtectHome=no
# PrivateTmp=no (or disabled)
# RestrictNamespaces= (NOT SET — don't restrict)
# SystemCallFilter= (NOT SET — don't filter)
# ReadWritePaths=/var/lib/archipelago /etc/containers /var/lib/containers /run/containers /run/user /tmp
# Environment=XDG_RUNTIME_DIR=/run/user/1000
```
Edit the service file:
```bash
sudo systemctl edit archipelago.service
# Add overrides, then:
sudo systemctl daemon-reload
sudo systemctl restart archipelago
```
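For unattended setups, the override can also be written as a drop-in file instead of through the interactive editor. The sketch below only restates the settings listed above; the file name `rootless-podman.conf` and the helper `write_sandbox_override` are assumptions for illustration.

```bash
# write_sandbox_override [DROPIN_DIR]
# Writes a systemd drop-in that relaxes the sandbox for rootless podman.
write_sandbox_override() {
  local dir=${1:-/etc/systemd/system/archipelago.service.d}
  mkdir -p "$dir"
  cat > "$dir/rootless-podman.conf" <<'EOF'
[Service]
ProtectHome=no
PrivateTmp=no
# Empty assignments reset any restriction set in the main unit file
RestrictNamespaces=
SystemCallFilter=
ReadWritePaths=/var/lib/archipelago /etc/containers /var/lib/containers /run/containers /run/user /tmp
Environment=XDG_RUNTIME_DIR=/run/user/1000
EOF
}
```

After writing the drop-in as root, the same `daemon-reload` and `restart` steps apply. `systemctl cat archipelago.service` should then list the drop-in below the main unit.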
### Fix 11: Stale Podman Processes
If `podman ps` hangs or is very slow:
```bash
# Kill stuck podman processes (>10 of them = something is wrong)
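# pgrep takes an extended regex, so alternation is a plain | (not \|);
# with -c it prints 0 itself when nothing matches, so no "|| echo 0" fallback is needed
stuck=$(pgrep -c -f "podman ps|podman stats" 2>/dev/null)
stuck=${stuck:-0}
if [ "$stuck" -gt 10 ]; then
pkill -f "podman ps|podman stats"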
echo "Killed $stuck stuck podman processes"
fi
# Kill orphaned conmon processes holding ports
for pid in $(pgrep conmon); do
container=$(cat /proc/$pid/cmdline 2>/dev/null | tr '\0' ' ' | grep -oP '(?<=--cid )\S+')
if [ -n "$container" ] && ! podman ps -a --format "{{.ID}}" | grep -q "${container:0:12}"; then
kill "$pid" 2>/dev/null && echo "Killed orphan conmon $pid"
fi
done
```
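To detect the hang itself without blocking the shell, a command can be wrapped in `timeout` from coreutils. This is a small sketch; `responds_within` is a hypothetical helper name.

```bash
# responds_within SECONDS COMMAND [ARGS...]
# Returns 0 if COMMAND finishes successfully within SECONDS,
# 124 if it was killed by the timeout, or COMMAND's own exit code otherwise.
responds_within() {
  local limit=$1
  shift
  timeout "$limit" "$@" >/dev/null 2>&1
}
```

For example, `responds_within 10 podman ps || echo "podman ps hung or failed"` gives a quick signal before resorting to killing processes.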
## After Fixing
Always verify the fix:
```bash
# Container running?
sudo podman ps --filter name=CONTAINER_NAME
podman ps --filter name=CONTAINER_NAME
# Port reachable?
curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:PORT/
@@ -213,7 +329,10 @@ curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:PORT/
curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1/app/APP_ID/
# Health check passing?
sudo podman inspect CONTAINER_NAME --format "{{.State.Health.Status}}"
podman inspect CONTAINER_NAME --format "{{.State.Health.Status}}"
# Volume permissions correct? (rootless check)
podman exec CONTAINER_NAME ls -la /data/ 2>/dev/null || echo "Check container data path"
```
Run `/podman-doctor` again to confirm all issues are resolved.

@@ -3,7 +3,8 @@ name: podman-uptime
description: >
Ensure 100% container uptime on Archipelago. Sets up systemd watchdog timers, verifies
restart policies, creates health check monitors, and configures auto-recovery for all
containers. Use when asked to "ensure uptime", "containers keep dying", "auto-restart",
containers. Handles rootless Podman (user: archipelago, UID 1000, subuid 100000:65536).
Use when asked to "ensure uptime", "containers keep dying", "auto-restart",
"watchdog", "container monitoring", "uptime guarantee", "keep containers running",
"survive reboot", or to harden container reliability.
allowed-tools: Bash Read Edit Write Glob Grep
@@ -15,6 +16,31 @@ Ensures every Archipelago container survives reboots, recovers from crashes, and
**SSH command**: `ssh -i ~/.ssh/archipelago-deploy archipelago@192.168.1.228`
> **ROOTLESS PODMAN**: All `podman` commands run as the `archipelago` user — NO sudo.
> Only use `sudo` for: systemd unit files, chown on volumes, UFW changes.
> The archipelago user runs containers directly via user namespaces.
## Prerequisites for Rootless Uptime
Before setting up uptime infrastructure, verify rootless Podman basics are working:
```bash
# Must be the archipelago user
whoami # archipelago
# User lingering must be enabled (keeps user services running after logout)
ls /var/lib/systemd/linger/ | grep archipelago || sudo loginctl enable-linger archipelago
# XDG_RUNTIME_DIR must be set
echo $XDG_RUNTIME_DIR # /run/user/1000
# Subuid/subgid must be configured
grep archipelago /etc/subuid # archipelago:100000:65536
# UFW forward policy must be ACCEPT (for LAN access to containers)
grep DEFAULT_FORWARD_POLICY /etc/default/ufw # Must be "ACCEPT"
```
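The subuid check can be turned into a value that scripts consume instead of a line a person reads. The helper below is a sketch under the assumption that `/etc/subuid` uses the standard `user:start:count` format; `subuid_range` is a hypothetical name.

```bash
# subuid_range USER [FILE]
# Prints "START COUNT" for USER's first subordinate UID range,
# or returns non-zero if the user has no entry.
subuid_range() {
  local user=$1 file=${2:-/etc/subuid}
  awk -F: -v u="$user" '
    $1 == u { print $2, $3; found = 1; exit }
    END { exit !found }
  ' "$file"
}
```

On a host configured as described here, `subuid_range archipelago` would be expected to print `100000 65536`. A non-zero exit means rootless Podman has no UID range to map containers into.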
## Layer 1: Restart Policies (Survive Reboots)
Every container MUST have `--restart unless-stopped`. This is non-negotiable.
@@ -23,28 +49,31 @@ Every container MUST have `--restart unless-stopped`. This is non-negotiable.
```bash
# Audit
for c in $(sudo podman ps -a --format "{{.Names}}"); do
policy=$(sudo podman inspect "$c" --format "{{.HostConfig.RestartPolicy.Name}}")
for c in $(podman ps -a --format "{{.Names}}"); do
policy=$(podman inspect "$c" --format "{{.HostConfig.RestartPolicy.Name}}")
echo "$c: $policy"
done
# Fix any with "no" or empty policy
for c in $(sudo podman ps -a --format "{{.Names}}"); do
policy=$(sudo podman inspect "$c" --format "{{.HostConfig.RestartPolicy.Name}}")
for c in $(podman ps -a --format "{{.Names}}"); do
policy=$(podman inspect "$c" --format "{{.HostConfig.RestartPolicy.Name}}")
if [ "$policy" = "no" ] || [ -z "$policy" ]; then
echo "Fixing: $c"
sudo podman update --restart unless-stopped "$c"
podman update --restart unless-stopped "$c"
fi
done
```
### Ensure podman auto-starts containers on boot
```bash
# Enable podman-restart service (restarts containers with restart policy on boot)
sudo systemctl enable podman-restart.service 2>/dev/null || true
For rootless Podman, containers with restart policies are auto-started by `podman-restart` as a **user** service:
# If podman-restart doesn't exist, create it
```bash
# Enable the rootless podman-restart user service
systemctl --user enable podman-restart.service 2>/dev/null
# If the user service doesn't exist, create a system-level one
# (runs as archipelago user via User= directive)
cat <<'EOF' | sudo tee /etc/systemd/system/podman-restart.service
[Unit]
Description=Podman Start All Containers With Restart Policy
@@ -53,8 +82,12 @@ Wants=network-online.target
[Service]
Type=oneshot
User=archipelago
Group=archipelago
Environment=XDG_RUNTIME_DIR=/run/user/1000
ExecStart=/usr/bin/podman start --all --filter restart-policy=unless-stopped
RemainAfterExit=yes
TimeoutStartSec=300
[Install]
WantedBy=multi-user.target
@@ -73,27 +106,31 @@ Create a systemd timer that checks container health every 2 minutes and restarts
```bash
cat <<'SCRIPT' | sudo tee /usr/local/bin/archipelago-container-watchdog.sh
#!/bin/bash
# Archipelago Container Watchdog
# Checks all containers and restarts any that are stopped or unhealthy
# Archipelago Container Watchdog (Rootless Podman)
# Runs as archipelago user — NO sudo for podman commands
LOG_TAG="container-watchdog"
# Run podman as the archipelago user with correct XDG path
export XDG_RUNTIME_DIR=/run/user/1000
PODMAN="/usr/bin/podman"
# Restart any stopped containers that should be running (have restart policy)
for c in $(sudo podman ps -a --filter status=exited --filter restart-policy=unless-stopped --format "{{.Names}}"); do
for c in $($PODMAN ps -a --filter status=exited --filter restart-policy=unless-stopped --format "{{.Names}}" 2>/dev/null); do
logger -t "$LOG_TAG" "Restarting stopped container: $c"
sudo podman start "$c" 2>&1 | logger -t "$LOG_TAG"
$PODMAN start "$c" 2>&1 | logger -t "$LOG_TAG"
done
# Restart unhealthy containers
for c in $(sudo podman ps --filter health=unhealthy --format "{{.Names}}"); do
for c in $($PODMAN ps --filter health=unhealthy --format "{{.Names}}" 2>/dev/null); do
logger -t "$LOG_TAG" "Restarting unhealthy container: $c"
sudo podman restart "$c" 2>&1 | logger -t "$LOG_TAG"
$PODMAN restart "$c" 2>&1 | logger -t "$LOG_TAG"
done
# Check for containers in "created" state (never started)
for c in $(sudo podman ps -a --filter status=created --format "{{.Names}}"); do
for c in $($PODMAN ps -a --filter status=created --format "{{.Names}}" 2>/dev/null); do
logger -t "$LOG_TAG" "Starting created container: $c"
sudo podman start "$c" 2>&1 | logger -t "$LOG_TAG"
$PODMAN start "$c" 2>&1 | logger -t "$LOG_TAG"
done
SCRIPT
@@ -103,7 +140,7 @@ sudo chmod +x /usr/local/bin/archipelago-container-watchdog.sh
### Create the systemd timer
```bash
# Service unit
# Service unit — runs as archipelago user for rootless podman
cat <<'EOF' | sudo tee /etc/systemd/system/archipelago-watchdog.service
[Unit]
Description=Archipelago Container Watchdog
@@ -111,6 +148,9 @@ After=podman-restart.service
[Service]
Type=oneshot
User=archipelago
Group=archipelago
Environment=XDG_RUNTIME_DIR=/run/user/1000
ExecStart=/usr/local/bin/archipelago-container-watchdog.sh
EOF
@@ -150,17 +190,20 @@ Some containers depend on others. The watchdog handles restarts, but initial boo
```bash
cat <<'SCRIPT' | sudo tee /usr/local/bin/archipelago-ordered-start.sh
#!/bin/bash
# Ordered container startup for Archipelago
# Ordered container startup for Archipelago (Rootless Podman)
# Runs as archipelago user — NO sudo for podman commands
# Respects dependency chain: bitcoin → electrs/lnd → mempool/btcpay
LOG_TAG="ordered-start"
export XDG_RUNTIME_DIR=/run/user/1000
PODMAN="/usr/bin/podman"
wait_for_container() {
local name=$1
local max_wait=${2:-60}
local waited=0
while [ $waited -lt $max_wait ]; do
status=$(sudo podman inspect "$name" --format "{{.State.Running}}" 2>/dev/null)
status=$($PODMAN inspect "$name" --format "{{.State.Running}}" 2>/dev/null)
if [ "$status" = "true" ]; then
logger -t "$LOG_TAG" "$name is running"
return 0
@@ -174,38 +217,45 @@ wait_for_container() {
# Tier 0: Infrastructure
logger -t "$LOG_TAG" "Starting Tier 0: Infrastructure"
sudo podman start tailscale 2>/dev/null
$PODMAN start tailscale 2>/dev/null
# Tier 1: Bitcoin (foundation)
logger -t "$LOG_TAG" "Starting Tier 1: Bitcoin"
sudo podman start bitcoin-knots 2>/dev/null
# Tier 1: Databases (must start before services that depend on them)
logger -t "$LOG_TAG" "Starting Tier 1: Databases"
$PODMAN start mempool-db 2>/dev/null
$PODMAN start btcpay-postgres 2>/dev/null
$PODMAN start immich_postgres 2>/dev/null
sleep 5
# Tier 2: Bitcoin (foundation for Lightning and explorers)
logger -t "$LOG_TAG" "Starting Tier 2: Bitcoin"
$PODMAN start bitcoin-knots 2>/dev/null
wait_for_container bitcoin-knots 120
# Tier 2: Bitcoin-dependent services
logger -t "$LOG_TAG" "Starting Tier 2: Bitcoin-dependent"
sudo podman start electrs 2>/dev/null
sudo podman start lnd 2>/dev/null
wait_for_container electrs 90
# Tier 3: Bitcoin-dependent services
logger -t "$LOG_TAG" "Starting Tier 3: Bitcoin-dependent"
$PODMAN start electrumx 2>/dev/null
$PODMAN start lnd 2>/dev/null
wait_for_container electrumx 90
wait_for_container lnd 90
# Tier 3: Services depending on Tier 2
logger -t "$LOG_TAG" "Starting Tier 3: Second-order dependencies"
sudo podman start mempool-db 2>/dev/null
sleep 5
sudo podman start mempool 2>/dev/null
sudo podman start nbxplorer 2>/dev/null
# Tier 4: Services depending on Tier 3
logger -t "$LOG_TAG" "Starting Tier 4: Second-order dependencies"
$PODMAN start mempool 2>/dev/null
$PODMAN start nbxplorer 2>/dev/null
sleep 10
sudo podman start btcpay-server 2>/dev/null
sudo podman start btcpay-postgres 2>/dev/null
$PODMAN start btcpay-server 2>/dev/null
$PODMAN start fedimint 2>/dev/null
$PODMAN start fedimint-gateway 2>/dev/null
# Tier 4: Independent apps (start all remaining)
logger -t "$LOG_TAG" "Starting Tier 4: Independent apps"
sudo podman start --all 2>/dev/null
# Tier 5: Independent apps (start all remaining)
logger -t "$LOG_TAG" "Starting Tier 5: Independent apps"
$PODMAN start --all 2>/dev/null
# Tier 5: UI containers (need parent apps running first)
logger -t "$LOG_TAG" "Starting Tier 5: UI containers"
sudo podman start bitcoin-ui 2>/dev/null
sudo podman start lnd-ui 2>/dev/null
# Tier 6: UI containers (need parent apps running first)
logger -t "$LOG_TAG" "Starting Tier 6: UI containers"
$PODMAN start bitcoin-ui 2>/dev/null
$PODMAN start lnd-ui 2>/dev/null
$PODMAN start electrs-ui 2>/dev/null
logger -t "$LOG_TAG" "Startup sequence complete"
SCRIPT
@@ -216,18 +266,22 @@ sudo chmod +x /usr/local/bin/archipelago-ordered-start.sh
### Wire into boot sequence
```bash
# Runs as archipelago user for rootless podman
cat <<'EOF' | sudo tee /etc/systemd/system/archipelago-containers.service
[Unit]
Description=Archipelago Ordered Container Startup
After=network-online.target podman.service
After=network-online.target
Wants=network-online.target
Before=archipelago.service
[Service]
Type=oneshot
User=archipelago
Group=archipelago
Environment=XDG_RUNTIME_DIR=/run/user/1000
ExecStart=/usr/local/bin/archipelago-ordered-start.sh
RemainAfterExit=yes
TimeoutStartSec=300
TimeoutStartSec=600
[Install]
WantedBy=multi-user.target
@@ -237,14 +291,45 @@ sudo systemctl daemon-reload
sudo systemctl enable archipelago-containers.service
```
## Rootless-Specific Uptime Considerations
### Volume ownership survives reboots
Volume ownership doesn't change on reboot, but if a container image is updated (re-pulled), the new container may run as a different UID. Always verify after image updates:
```bash
# Quick ownership audit after image pull
podman inspect CONTAINER_NAME --format "{{.Config.User}}"
# Then verify: sudo stat -c '%u:%g' /var/lib/archipelago/APP_NAME
# Formula: host_uid = 100000 + container_uid - 1 (container UID 0 → host UID 1000, the archipelago user)
```
### XDG_RUNTIME_DIR on boot
Rootless Podman requires `/run/user/1000` to exist. This is created by `pam_systemd` when the user logs in, or by `loginctl enable-linger`. If it's missing after boot, containers won't start.
```bash
# Verify it exists
ls -la /run/user/1000/ || echo "CRITICAL: /run/user/1000 missing — run: sudo loginctl enable-linger archipelago"
```
### Systemd sandbox must not block podman
If the archipelago.service sandbox blocks namespace/syscall operations, the Rust backend can't scan containers. See Fix 10 in /podman-fix.
## Verification Checklist
After setting up all 3 layers, verify:
```bash
echo "=== Rootless Podman Prerequisites ==="
echo "User: $(whoami)"
echo "XDG_RUNTIME_DIR: $XDG_RUNTIME_DIR"
grep archipelago /etc/subuid | head -1
ls /var/lib/systemd/linger/ | grep archipelago && echo "Linger: enabled" || echo "Linger: DISABLED"
grep DEFAULT_FORWARD_POLICY /etc/default/ufw
echo ""
echo "=== Layer 1: Restart Policies ==="
for c in $(sudo podman ps -a --format "{{.Names}}"); do
policy=$(sudo podman inspect "$c" --format "{{.HostConfig.RestartPolicy.Name}}")
for c in $(podman ps -a --format "{{.Names}}"); do
policy=$(podman inspect "$c" --format "{{.HostConfig.RestartPolicy.Name}}")
echo " $c: $policy"
done
@@ -261,11 +346,19 @@ sudo systemctl is-enabled archipelago-watchdog.timer 2>/dev/null || echo "watchd
echo ""
echo "=== Container Health Summary ==="
total=$(sudo podman ps -a --format "{{.Names}}" | wc -l)
running=$(sudo podman ps --format "{{.Names}}" | wc -l)
total=$(podman ps -a --format "{{.Names}}" | wc -l)
running=$(podman ps --format "{{.Names}}" | wc -l)
stopped=$((total - running))
unhealthy=$(sudo podman ps --filter health=unhealthy --format "{{.Names}}" | wc -l)
unhealthy=$(podman ps --filter health=unhealthy --format "{{.Names}}" | wc -l)
echo " Total: $total | Running: $running | Stopped: $stopped | Unhealthy: $unhealthy"
echo ""
echo "=== Volume Ownership Spot Check ==="
for dir in bitcoin lnd grafana; do
if [ -d "/var/lib/archipelago/$dir" ]; then
echo " $dir: $(stat -c '%u:%g' /var/lib/archipelago/$dir)"
fi
done
```
## Reboot Test
@@ -274,17 +367,20 @@ The ultimate uptime test — reboot the server and verify everything comes back:
```bash
# Before reboot: record running containers
sudo podman ps --format "{{.Names}}" | sort > /tmp/before-reboot.txt
podman ps --format "{{.Names}}" | sort > /tmp/before-reboot.txt
# Reboot
sudo reboot
# After reboot (wait ~3 minutes, then SSH back in):
sudo podman ps --format "{{.Names}}" | sort > /tmp/after-reboot.txt
podman ps --format "{{.Names}}" | sort > /tmp/after-reboot.txt
# Compare
diff /tmp/before-reboot.txt /tmp/after-reboot.txt
# Should show no differences
# Also verify XDG_RUNTIME_DIR survived reboot
ls /run/user/1000/ || echo "CRITICAL: lingering not working"
```
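When the diff is non-empty, the useful question is usually which containers failed to come back. A minimal sketch using `comm`, assuming both files were written with `sort` as above; `missing_after_reboot` is a hypothetical helper name.

```bash
# missing_after_reboot BEFORE_FILE AFTER_FILE
# Prints container names present before the reboot but absent afterwards.
missing_after_reboot() {
  # comm -23 keeps lines unique to the first file; both inputs must be sorted
  comm -23 "$1" "$2"
}
```

Running `missing_after_reboot /tmp/before-reboot.txt /tmp/after-reboot.txt` prints one name per line, which can be fed straight into `podman logs` or `podman start`.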
## Monitoring
@@ -292,18 +388,23 @@ diff /tmp/before-reboot.txt /tmp/after-reboot.txt
Check uptime status anytime:
```bash
# Quick status
sudo podman ps -a --format "table {{.Names}}\t{{.Status}}" | sort
podman ps -a --format "table {{.Names}}\t{{.Status}}" | sort
# Watchdog activity
sudo journalctl -t container-watchdog --since "24 hours ago" --no-pager
# Container events (starts, stops, deaths)
sudo podman events --since 24h --filter event=start --filter event=stop --filter event=died 2>/dev/null | tail -30
podman events --since 24h --filter event=start --filter event=stop --filter event=died 2>/dev/null | tail -30
# Check for permission denied errors (rootless UID mapping issue)
podman ps -a --filter status=exited --format "{{.Names}}" | while read c; do
podman logs --tail 5 "$c" 2>&1 | grep -i "permission denied" && echo " ^ UID mapping issue in: $c"
done
```
## Integration
- Run `/podman-doctor` first to identify issues
- Run `/podman-fix` for specific container repairs
- Run `/podman-doctor` first to identify issues (includes rootless health checks)
- Run `/podman-fix` for specific container repairs (includes UID mapping fixes)
- Run `/podman-uptime` to set up permanent reliability infrastructure
- Add to ISO build: copy watchdog scripts to `image-recipe/configs/` and enable in first-boot

@@ -621,7 +621,7 @@ fn convert_state(container_state: &ContainerState) -> (PackageState, ServiceStat
ContainerState::Stopped | ContainerState::Exited => {
(PackageState::Stopped, ServiceStatus::Stopped)
}
ContainerState::Created => (PackageState::Starting, ServiceStatus::Starting),
ContainerState::Created => (PackageState::Stopped, ServiceStatus::Stopped),
ContainerState::Paused => (PackageState::Stopped, ServiceStatus::Stopped),
ContainerState::Unknown(_) => (PackageState::Stopped, ServiceStatus::Stopped),
}

@@ -13,7 +13,7 @@ server {
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header Authorization "Basic YXJjaGlwZWxhZ286YXJjaGlwZWxhZ28xMjM=";
proxy_set_header Authorization "Basic __BITCOIN_RPC_AUTH__";
add_header Access-Control-Allow-Origin *;
add_header Access-Control-Allow-Methods "POST, GET, OPTIONS";
add_header Access-Control-Allow-Headers "Content-Type, Authorization";

File diff suppressed because it is too large

@@ -82,7 +82,7 @@ define(['./workbox-21a80088'], (function (workbox) { 'use strict';
"revision": "3ca0b8505b4bec776b69afdba2768812"
}, {
"url": "index.html",
"revision": "0.a4nevj6csc4"
"revision": "0.2lte02eatlc"
}], {});
workbox.cleanupOutdatedCaches();
workbox.registerRoute(new workbox.NavigationRoute(workbox.createHandlerBoundToURL("index.html"), {

@@ -658,11 +658,23 @@ const filteredApps = computed(() => {
/** Marketplace app ID -> backend package keys (for "Already Installed" when first-boot/deploy created them) */
const INSTALLED_ALIASES: Record<string, string[]> = {
mempool: ['mempool-web'],
mempool: ['mempool-web', 'mempool-api', 'archy-mempool-web', 'archy-mempool-db'],
bitcoin: ['bitcoin-knots'],
btcpay: ['btcpay-server'],
immich: ['immich-server', 'immich-app', 'immich_server'],
btcpay: ['btcpay-server', 'archy-btcpay-db', 'archy-nbxplorer'],
immich: ['immich-server', 'immich-app', 'immich_server', 'immich_postgres', 'immich_redis'],
nextcloud: ['nextcloud-aio', 'nextcloud-server'],
fedimint: ['fedimint-gateway'],
electrumx: ['electrumx', 'archy-electrs-ui'],
grafana: ['grafana'],
jellyfin: ['jellyfin'],
vaultwarden: ['vaultwarden'],
searxng: ['searxng'],
homeassistant: ['homeassistant'],
photoprism: ['photoprism'],
lnd: ['lnd', 'archy-lnd-ui'],
filebrowser: ['filebrowser'],
tailscale: ['tailscale'],
ollama: ['ollama'],
}
function isInstalled(appId: string): boolean {
if (appId in installedPackages.value) return true

@@ -800,6 +800,15 @@ MANIFEST_EOF
# Rebuild and recreate Bitcoin UI container (host network, port 8334 in nginx.conf)
# Host network required: bitcoin-ui proxies Bitcoin RPC at 127.0.0.1:8332
progress "Rebuilding Bitcoin UI"
# Inject real RPC credentials into bitcoin-ui nginx config before building
ssh $SSH_OPTS "$TARGET_HOST" '
SECRETS_DIR="/var/lib/archipelago/secrets"
RPC_PASS=$(sudo cat "$SECRETS_DIR/bitcoin-rpc-password" 2>/dev/null)
if [ -n "$RPC_PASS" ]; then
AUTH_B64=$(echo -n "archipelago:${RPC_PASS}" | base64 | tr -d "\n")
sed -i "s|__BITCOIN_RPC_AUTH__|${AUTH_B64}|g" '"$TARGET_DIR"'/docker/bitcoin-ui/nginx.conf
fi
' 2>/dev/null || true
if ssh $SSH_OPTS "$TARGET_HOST" "cd $TARGET_DIR/docker/bitcoin-ui && (command -v podman >/dev/null 2>&1 && podman build --no-cache -t bitcoin-ui:latest . || docker build --no-cache -t bitcoin-ui:latest .)" 2>&1 | tail -12 | sed 's/^/ /'; then
echo " Recreating Bitcoin UI container (port 8334, host network)..."
ssh $SSH_OPTS "$TARGET_HOST" '