archy/loop/plan.md
Dorian d1eb01799f fix: Phase 7 — key zeroization, OsRng, checked arithmetic, TOTP rate limits
- SecretsManager: raw key stored in Zeroizing<[u8; 32]>, auto-zeroed on drop
- SecretsManager: replaced thread_rng with OsRng (CSPRNG) for nonces
- Remember-me secret: derived from machine-id via SHA-256 (deterministic, no
  plaintext key storage)
- Bitcoin ecash balance: uses checked_add with u64::MAX saturation on overflow
- TOTP setup/confirm: added to EndpointRateLimiter (3 and 5 per 5min)
- AppId validation and Tor service name validation already existed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 01:00:57 +00:00


Overnight Plan — 2-Year Production Hardening & Security Roadmap

Goal: Take Archipelago from development prototype to production-grade, security-hardened Bitcoin Node OS. Every phase: fix → test → harden → test → verify nothing broke → move to next module → review at end. See CLAUDE.md for all project rules and conventions.

IMPORTANT — DEPLOY TO .198 ONLY: Do NOT deploy to .228, Arch 1, Arch 2, or Arch 3. The ONLY server you may SSH to or deploy to is 192.168.1.198. Use ./scripts/deploy-to-target.sh pointed at .198 for testing. The user will deploy to other nodes manually when ready.

NOTE — DEV ENVIRONMENT IS OUT OF SCOPE: SSH keys, deploy script credentials, StrictHostKeyChecking=no, dev passwords in test scripts, and password123 in dev mode are intentional development tooling on a private home LAN. Do NOT change these. This plan covers PRODUCTION code only — what runs on the deployed server.


============================================================

YEAR 1 — QUARTER 1: CRITICAL & HIGH SEVERITY FIXES

============================================================


Phase 1: Infrastructure — CRITICAL Production Credential Hardening

Layman version: Every Archipelago installation currently uses the same passwords (like every house in a neighborhood using the same door key). We fix this by generating unique random passwords per installation and storing them encrypted. This is the single most important security fix.

  • Generate random Bitcoin RPC credentials at first boot: In scripts/first-boot-containers.sh, find all occurrences of -rpcuser=archipelago and -rpcpassword=archipelago123. Replace the hardcoded values with dynamically generated credentials:

    1. At the top of the script (after the shebang and initial variables), add:
      # Generate per-installation credentials if not already saved
      SECRETS_DIR="/var/lib/archipelago/secrets"
      mkdir -p "$SECRETS_DIR" && chmod 700 "$SECRETS_DIR"
      if [ ! -f "$SECRETS_DIR/bitcoin-rpc-password" ]; then
          openssl rand -base64 24 > "$SECRETS_DIR/bitcoin-rpc-password"
          chmod 600 "$SECRETS_DIR/bitcoin-rpc-password"
      fi
      BITCOIN_RPC_USER="archipelago"
      BITCOIN_RPC_PASS=$(cat "$SECRETS_DIR/bitcoin-rpc-password")
      
    2. Replace every -rpcpassword=archipelago123 with -rpcpassword=$BITCOIN_RPC_PASS throughout the script.
    3. Replace every archipelago:archipelago123@ in connection strings (ElectrumX DAEMON_URL, etc.) with $BITCOIN_RPC_USER:$BITCOIN_RPC_PASS@.
    4. Do the same in scripts/deploy-to-target.sh — search for archipelago123 and replace with $BITCOIN_RPC_PASS (read from the same secrets file on the target server).
    5. Run grep -rn "archipelago123" scripts/ to verify no hardcoded passwords remain in scripts.
    6. Local verify: code changes only — user will deploy and test on server manually.
  • Generate random database passwords at first boot: Same pattern for all database passwords. In scripts/first-boot-containers.sh:

    1. Add credential generation for each database service:
      for svc in mempool btcpay immich penpot; do
          if [ ! -f "$SECRETS_DIR/${svc}-db-password" ]; then
              openssl rand -base64 24 > "$SECRETS_DIR/${svc}-db-password"
              chmod 600 "$SECRETS_DIR/${svc}-db-password"
          fi
      done
      MEMPOOL_DB_PASS=$(cat "$SECRETS_DIR/mempool-db-password")
      BTCPAY_DB_PASS=$(cat "$SECRETS_DIR/btcpay-db-password")
      IMMICH_DB_PASS=$(cat "$SECRETS_DIR/immich-db-password")
      PENPOT_DB_PASS=$(cat "$SECRETS_DIR/penpot-db-password")
      
    2. Replace mempoolpass with $MEMPOOL_DB_PASS, btcpaypass with $BTCPAY_DB_PASS, immichpass with $IMMICH_DB_PASS, penpot (password) with $PENPOT_DB_PASS throughout the script.
    3. Replace rootpass (MySQL root) with a generated password too.
    4. On the live server, update existing containers: stop each DB container, update the password in the DB itself, restart with new env vars.
    5. Verify each service still connects to its database by checking container logs for connection errors.
  • Generate unique Fedimint gateway password per deployment: In scripts/first-boot-containers.sh and scripts/deploy-to-target.sh, find the hardcoded bcrypt hash $2y$10$t9YjjxkiktrlYvjajB/zgOMDnSNVg4HqrbDqh47u7Jf42whNdxNqC. Replace with:

    1. Generate a random password and hash it:
      if [ ! -f "$SECRETS_DIR/fedimint-gateway-password" ]; then
          FEDI_PASS=$(openssl rand -base64 16)
          echo "$FEDI_PASS" > "$SECRETS_DIR/fedimint-gateway-password"
          chmod 600 "$SECRETS_DIR/fedimint-gateway-password"
      fi
      FEDI_PASS=$(cat "$SECRETS_DIR/fedimint-gateway-password")
      FEDI_HASH=$(htpasswd -bnBC 10 "" "$FEDI_PASS" | tr -d ':\n')
      
    2. Use $FEDI_HASH in the --bcrypt-password-hash argument.
    3. Display the password in the first-boot log so the operator can note it.
    4. Verify: open Fedimint gateway web UI and log in with the generated password.
  • Remove hardcoded Bitcoin RPC credentials from Rust backend: In core/archipelago/src/mesh/mod.rs, find line ~610 with .basic_auth("archipelago", Some("archipelago123")). Replace with:

    1. Add a function to read credentials from the secrets file:
      async fn read_bitcoin_rpc_credentials() -> Result<(String, String)> {
          let pass = tokio::fs::read_to_string("/var/lib/archipelago/secrets/bitcoin-rpc-password")
              .await
              .context("Failed to read Bitcoin RPC password from secrets")?;
          Ok(("archipelago".to_string(), pass.trim().to_string()))
      }
      
    2. Call this function (and .await it) where RPC credentials are needed instead of hardcoding.
    3. Do the same for any other .basic_auth("archipelago", Some("archipelago123")) calls in the codebase. Search with grep -rn "archipelago123" core/ to find all occurrences.
    4. Build on dev server: cd ~/archy/core && cargo clippy --all-targets --all-features.
    5. Deploy and verify mesh Bitcoin relay still works.
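
A missing secrets file is reported by the read above, but an empty or malformed one would pass through silently. The following is a minimal sketch of a content check that could run on the string returned by the read; the function name parse_rpc_password is hypothetical and not part of the existing codebase.

```rust
/// Hypothetical helper: validate the raw contents of the secrets file
/// before using them as the Bitcoin RPC password.
pub fn parse_rpc_password(contents: &str) -> Result<String, String> {
    // `openssl rand -base64 24 > file` leaves a trailing newline; strip it.
    let pass = contents.trim();
    if pass.is_empty() {
        return Err("Bitcoin RPC password file is empty".to_string());
    }
    // Embedded whitespace would corrupt a `-rpcpassword=` argument
    // or a `user:pass@host` connection string.
    if pass.chars().any(char::is_whitespace) {
        return Err("Bitcoin RPC password contains whitespace".to_string());
    }
    Ok(pass.to_string())
}
```

Returning an error here makes a truncated or half-written secrets file fail loudly at startup instead of surfacing later as an authentication failure against bitcoind.
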
  • Verify Phase 1 — No hardcoded passwords remain: Run these checks:

    1. grep -rn "archipelago123" scripts/ core/ --include="*.rs" --include="*.sh" — should return zero results (except comments explaining the migration).
    2. grep -rn "mempoolpass\|btcpaypass\|immichpass\|rootpass" scripts/ --include="*.sh" — should return zero results.
    3. ls -la /var/lib/archipelago/secrets/ on the server — should show password files with 600 permissions.
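    4. All services still running: sudo podman ps -a --format '{{.Names}} {{.Status}}' | grep -v "Up" should print nothing (every container is Up). The -a flag is required because podman ps lists only running containers by default, so stopped containers would otherwise never appear.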
    5. Bitcoin RPC works: sudo podman exec bitcoin-knots bitcoin-cli getblockchaininfo | head -5.
    6. Web UI loads and all apps accessible at http://192.168.1.198.

Phase 2: Infrastructure — Systemd & Network Hardening

Layman version: The backend currently runs as the all-powerful "root" user with no restrictions. If any bug is exploited, the attacker gets complete control of everything. We lock it down so the backend can only do what it needs to do — like giving a bank teller access to the cash drawer but not the vault, the CEO's office, or the security cameras.

  • Create unprivileged archipelago user for backend: SSH to 192.168.1.198:

    1. Check whether the user exists: id archipelago. On this server the archipelago user already exists as UID 1000 (the login user).
    2. Run the backend as this user, NOT root: change /etc/systemd/system/archipelago.service to use User=archipelago instead of User=root. If a dedicated service account is preferred over the login user, create one with sudo useradd -r -s /usr/sbin/nologin -d /var/lib/archipelago archipelago-svc (its UID will be in the system range).
    3. Fix file ownership: sudo chown -R archipelago:archipelago /var/lib/archipelago/.
    4. The backend needs to talk to Podman. Since Podman is rootless for UID 1000, this should work. Test: sudo -u archipelago podman ps.
    5. If Podman needs root for some operations, use sudo with specific commands only via sudoers — NOT running the entire backend as root.
  • Add systemd sandboxing to archipelago.service: Edit image-recipe/configs/archipelago.service. Add these directives under [Service]:

    # Filesystem protection
    ProtectSystem=strict
    ProtectHome=yes
    PrivateTmp=yes
    ReadWritePaths=/var/lib/archipelago
    
    # Privilege restriction
    NoNewPrivileges=yes
    PrivateDevices=yes
    
    # Network restriction (allow only IPv4/IPv6 + Unix sockets)
    RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6
    
    # Restrict what the process can do
    RestrictNamespaces=yes
    RestrictRealtime=yes
    RestrictSUIDSGID=yes
    
    # Only allow needed syscalls
    SystemCallArchitectures=native
    SystemCallFilter=@system-service
    SystemCallFilter=~@privileged @resources
    
    # Memory protection
    MemoryDenyWriteExecute=yes
    
    # Logging
    StandardOutput=journal
    StandardError=journal
    

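    Deploy the service file to the server: scp image-recipe/configs/archipelago.service archipelago@192.168.1.198:/tmp/ && ssh archipelago@192.168.1.198 'sudo cp /tmp/archipelago.service /etc/systemd/system/ && sudo systemctl daemon-reload && sudo systemctl restart archipelago'. Watch the journal for errors: ssh archipelago@192.168.1.198 'sudo journalctl -u archipelago -n 50 --no-pager'. If the service fails to start due to a denied syscall or path, adjust the sandboxing (e.g., add the path to ReadWritePaths or the syscall group to SystemCallFilter). Iterate until the service starts cleanly. Expect to revisit a few directives first: NoNewPrivileges=yes stops sudo from elevating (which conflicts with the sudoers fallback described above), and RestrictNamespaces=yes and ProtectHome=yes are likely to break rootless Podman if the backend launches podman directly rather than talking to a socket.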

  • Bind Bitcoin RPC to localhost only: SSH to 192.168.1.198. Edit the bitcoin-knots container's start command:

    1. Find where bitcoin-knots is started (in scripts/first-boot-containers.sh or via podman inspect bitcoin-knots).
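    2. Change -rpcbind=0.0.0.0:8332 to -rpcbind=127.0.0.1:8332 -rpcbind=::1:8332. This assumes bitcoin-knots shares the host network namespace. If it runs on the Podman bridge network instead, a loopback bind inside the container would also cut off the other containers; in that case keep the in-container bind and restrict the published port to loopback (-p 127.0.0.1:8332:8332).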
    3. Change -rpcallowip=0.0.0.0/0 to -rpcallowip=127.0.0.1/32 -rpcallowip=10.88.0.0/16 (the 10.88.x.x is Podman's default network — containers need to reach Bitcoin RPC).
    4. Stop and recreate bitcoin-knots with the new flags.
    5. Verify containers on the Podman network can still reach it: sudo podman exec lnd bitcoin-cli -rpcconnect=bitcoin-knots -rpcuser=... getblockchaininfo.
    6. Verify external access is blocked: from another machine on the LAN, curl http://192.168.1.198:8332 should fail/timeout.
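
To make the effect of the two -rpcallowip rules concrete, here is a small sketch of the CIDR match they imply, using only the standard library. The function names are illustrative; bitcoind performs this check itself, so nothing like this needs to be added to the backend.

```rust
use std::net::Ipv4Addr;

/// Returns true when `addr` falls inside `network/prefix_len`.
pub fn in_cidr(addr: Ipv4Addr, network: Ipv4Addr, prefix_len: u32) -> bool {
    if prefix_len > 32 {
        return false;
    }
    // A /0 mask is built separately: shifting a u32 by 32 bits overflows.
    let mask = if prefix_len == 0 { 0 } else { u32::MAX << (32 - prefix_len) };
    (u32::from(addr) & mask) == (u32::from(network) & mask)
}

/// Mirrors the two rules from step 3: loopback plus Podman's default network.
pub fn rpc_allowed(addr: Ipv4Addr) -> bool {
    in_cidr(addr, Ipv4Addr::new(127, 0, 0, 1), 32)
        || in_cidr(addr, Ipv4Addr::new(10, 88, 0, 0), 16)
}
```

Under these rules a container at 10.88.0.5 is admitted while a LAN host at 192.168.1.50 is not, whereas the old 0.0.0.0/0 rule matched every address.
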
  • Reduce Tailscale container privileges: In scripts/first-boot-containers.sh, find the Tailscale container creation (line ~460). Replace --privileged with:

    --cap-drop=ALL \
    --cap-add=NET_ADMIN \
    --cap-add=NET_RAW \
    --device=/dev/net/tun:/dev/net/tun \
    --read-only \
    --tmpfs /tmp \
    --tmpfs /var/lib/tailscale \
    

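    Recreate the Tailscale container on the server. Verify Tailscale still works: sudo podman exec tailscale tailscale status. Note that a tmpfs at /var/lib/tailscale is wiped on every container restart; if the node's Tailscale login state is kept there, mount a persistent volume at that path instead so the node does not have to re-authenticate.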

  • Verify Phase 2 — Systemd hardening active: Run these checks:

    1. sudo systemctl show archipelago | grep -E "ProtectSystem|NoNewPrivileges|PrivateTmp" — should show strict, yes, yes.
    2. sudo systemctl status archipelago — should be active and running.
    3. ss -tlnp | grep 8332 — Bitcoin RPC should show 127.0.0.1:8332, NOT 0.0.0.0:8332.
    4. sudo podman inspect tailscale | jq '.[0].HostConfig.Privileged' — should be false.
    5. All apps still load in the web UI.
    6. Mesh networking still works (if enabled).

Phase 3: Backend — CRITICAL Code Fixes

Layman version: Two bugs in the Rust backend could let an attacker either run any command on your server (command injection) or crash your entire node at will (unwrap panic). These are the most dangerous code-level bugs found.

  • Fix command injection in VPN key generation: In core/archipelago/src/vpn.rs, find lines 132-137 where sh -c is used with format!("echo '{}' | wg pubkey", private_key). This is a textbook command injection vulnerability. Replace the entire block with safe stdin piping:

    let mut child = tokio::process::Command::new("wg")
        .arg("pubkey")
        .stdin(std::process::Stdio::piped())
        .stdout(std::process::Stdio::piped())
        .stderr(std::process::Stdio::piped())
        .spawn()
        .context("Failed to spawn wg pubkey")?;
    
    if let Some(mut stdin) = child.stdin.take() {
        use tokio::io::AsyncWriteExt;
        stdin.write_all(private_key.as_bytes()).await
            .context("Failed to write private key to wg stdin")?;
        // stdin is dropped here, closing it
    }
    
    let output = child.wait_with_output().await
        .context("wg pubkey process failed")?;
    
    if !output.status.success() {
        anyhow::bail!("wg pubkey failed: {}", String::from_utf8_lossy(&output.stderr));
    }
    
    let pubkey = String::from_utf8(output.stdout)
        .context("wg pubkey output is not valid UTF-8")?
        .trim()
        .to_string();
    

    Search the entire core/ directory for other sh -c or bash -c patterns: grep -rn 'Command::new("sh")\|Command::new("bash")' core/. Fix any other occurrences with the same pattern. Build: cd ~/archy/core && cargo clippy --all-targets --all-features. Test: If VPN setup is available in the UI, test generating a WireGuard key.
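
As defense in depth on top of the stdin piping, the key can be shape-checked before it is handed to any process. The sketch below assumes the private key arrives as a trimmed base64 string; is_plausible_wg_key is a hypothetical name, and the check only validates the format, not the key itself.

```rust
/// Hypothetical pre-check: a WireGuard key is 32 bytes, which base64-encodes
/// to exactly 44 characters ending in a single '=' padding character.
pub fn is_plausible_wg_key(key: &str) -> bool {
    let bytes = key.as_bytes();
    if bytes.len() != 44 || bytes[43] != b'=' {
        return false;
    }
    // The first 43 characters must come from the standard base64 alphabet,
    // which excludes quotes, semicolons, spaces, and other shell metacharacters.
    bytes[..43]
        .iter()
        .all(|b| b.is_ascii_alphanumeric() || *b == b'+' || *b == b'/')
}
```

A payload such as '; touch /tmp/pwned; echo ' fails this check on length and alphabet, so it would be rejected before reaching wg at all.
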

  • Fix unwrap crash in secrets manager: In core/security/src/secrets_manager.rs, find line 112 with secret_path.parent().unwrap(). Replace with:

    let parent = secret_path.parent()
        .ok_or_else(|| anyhow::anyhow!("Invalid secret path: no parent directory for {:?}", secret_path))?;
    fs::create_dir_all(parent).await?;
    

    Search for ALL .unwrap() calls in the file: grep -n "unwrap()" core/security/src/secrets_manager.rs. For each one in a non-test function, evaluate whether it can actually fail and replace with ? or .ok_or_else() if so. Common safe unwraps (e.g., after a .is_some() check) can stay but should get a comment explaining why they're safe. Build and deploy.
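
To see when the original unwrap could actually panic, the sketch below isolates the parent lookup using only std::path. The helper name parent_dir is illustrative, and it returns a plain String error so the example stays dependency-free; the real code would keep using anyhow.

```rust
use std::path::Path;

/// Returns the directory that must exist before `secret_path` is written,
/// or an error instead of a panic when the path has no parent.
pub fn parent_dir(secret_path: &Path) -> Result<&Path, String> {
    secret_path
        .parent()
        .ok_or_else(|| format!("Invalid secret path: no parent directory for {:?}", secret_path))
}
```

Path::parent returns None for the root path and for an empty path, so those are the inputs that would have crashed the node. A bare file name such as key yields an empty parent path rather than None.
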

  • Fix expect crash in Tor proxy fallback: In core/archipelago/src/api/rpc/tor.rs, find line ~525 with .expect("valid proxy"). Replace the entire proxy chain with proper error handling:

    let proxy_url = format!("socks5h://{}", proxy);
    let proxy = reqwest::Proxy::all(&proxy_url)
        .or_else(|_| reqwest::Proxy::all("socks5h://127.0.0.1:9050"))
        .context("Failed to create SOCKS5 proxy for Tor")?;
    

    Search for ALL .expect( calls in non-test code: grep -rn "\.expect(" core/archipelago/src/ --include="*.rs" | grep -v "#\[cfg(test)\]" | grep -v "mod tests". List them and fix any that could realistically fail in production. Build: cargo clippy --all-targets --all-features.

  • Fix image verifier accepting unsigned images: In core/security/src/image_verifier.rs, find lines 18-22 where the verifier returns Ok(false) for unsigned images. Change to:

    if signature.is_none() && self.cosign_public_key.is_none() {
        return Err(anyhow::anyhow!(
            "Image '{}' has no signature and no cosign key is configured. \
             All container images must be signed for production use.",
            image
        ));
    }
    

    Also fix lines 25-32 where missing cosign binary returns Ok(false):

    if !cosign_available {
        return Err(anyhow::anyhow!(
            "Cosign binary not found. Install cosign to verify container image signatures."
        ));
    }
    

    Build and test. Note: this may cause existing unsigned images to fail verification. If the system doesn't use cosign yet, add a config flag require_signatures: bool that defaults to false for now but can be flipped to true when cosign is deployed.

  • Verify Phase 3 — No more crash vectors: Run these checks:

    1. grep -rn 'Command::new("sh")' core/ --include="*.rs" — should return zero results.
    2. grep -rn "\.unwrap()" core/security/src/secrets_manager.rs | grep -v test — should be minimal/commented.
    3. grep -rn "\.expect(" core/archipelago/src/api/ --include="*.rs" | grep -v test | grep -v "// SAFE:" — review each remaining expect.
    4. cargo clippy --all-targets --all-features — zero warnings.
    5. Backend starts cleanly: sudo systemctl restart archipelago && sudo journalctl -u archipelago -n 20 --no-pager.
    6. Web UI login works. Container start/stop works. Settings page works.

Phase 4: Mesh Networking — Authentication & Validation

Layman version: The mesh network currently accepts messages from anyone who claims to be someone. It's like accepting a phone call from someone who says "Hi, I'm your bank" without verifying. We add cryptographic proof of identity (digital signatures) so every message is provably from who it claims. We also add checks so fake Bitcoin data can't be relayed.

  • Implement signed identity announcements: In core/archipelago/src/mesh/listener.rs, find the identity advertisement handling (around line 923+). Modify the peer identity broadcast to include an Ed25519 signature:

    1. When broadcasting identity (DID + Ed25519 pubkey), sign the announcement with the node's private key:
      // In the identity broadcast function
      let identity_payload = format!("{}:{}", did, hex::encode(&pubkey));
      let signature = signing_key.sign(identity_payload.as_bytes());
      // Include signature in the broadcast envelope
      
    2. When receiving an identity announcement, verify the signature before accepting the peer:
      // In the identity receive handler
      let identity_payload = format!("{}:{}", claimed_did, hex::encode(&claimed_pubkey));
      let verifying_key = ed25519_dalek::VerifyingKey::from_bytes(&claimed_pubkey)?;
      verifying_key.verify_strict(identity_payload.as_bytes(), &signature)
          .map_err(|_| anyhow::anyhow!("Identity announcement signature verification failed for {}", claimed_did))?;
      
    3. Reject any identity announcement without a valid signature. Log the rejection at warn! level.
    4. Update the TypedEnvelope struct in message_types.rs to include an optional identity_signature field if not already present. Build and test with two mesh-connected nodes if available. If only one node, verify the code compiles and the identity broadcast includes signatures.
  • Verify envelope signatures on received messages: In core/archipelago/src/mesh/listener.rs, find where incoming TypedEnvelope messages are processed. Add signature verification:

    1. Before processing any message, call envelope.verify_signature() (which should already exist in message_types.rs).
    2. If verification fails, log a warning and drop the message:
      if !envelope.verify_signature(&peer_pubkey)? {
          tracing::warn!(peer = %contact_id, "Dropping message with invalid signature");
          continue;
      }
      
    3. For alert messages specifically, verify the alert is signed by the claimed peer's key before displaying or relaying. Build and deploy.
  • Add Bitcoin transaction/block validation before relay: In core/archipelago/src/mesh/bitcoin_relay.rs, find lines 210-232 where block headers and transactions are relayed:

    1. For block headers, add basic validation:
      fn validate_block_header(header: &BlockHeader, last_known_height: u32) -> Result<bool> {
          // Check header version is valid (1-4 or BIP9 signaling)
          if header.version < 1 {
              return Ok(false);
          }
          // Check that height is sequential (within reason for mesh delays)
          if header.height > last_known_height + 100 {
              tracing::warn!("Block header height {} is too far ahead of known height {}", header.height, last_known_height);
              return Ok(false);
          }
          // Check prev_block_hash is 32 bytes
          if header.prev_block_hash.len() != 32 {
              return Ok(false);
          }
          Ok(true)
      }
      
    2. For transactions, add basic syntax validation:
      fn validate_raw_transaction(tx_bytes: &[u8]) -> Result<bool> {
          // Minimum valid transaction size is ~60 bytes
          if tx_bytes.len() < 60 || tx_bytes.len() > 400_000 {
              return Ok(false);
          }
          // Check version bytes (first 4 bytes, little-endian)
          let version = u32::from_le_bytes(tx_bytes[0..4].try_into()?);
          if !(1..=3).contains(&version) {
              return Ok(false);
          }
          Ok(true)
      }
      
    3. Add rate limiting: max 10 block headers per minute, max 5 transactions per minute per peer.
    4. Call these validation functions before relaying any data. Build and deploy.
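
The per-peer limits in step 3 can be sketched as a sliding-window counter. This is one possible shape under stated assumptions: PeerRateLimiter is a hypothetical type, peers are keyed by a string ID, and the caller supplies the current time in seconds so the logic stays testable.

```rust
use std::collections::{HashMap, VecDeque};

/// Sliding-window limiter: at most `max_events` per `window_secs` per peer.
pub struct PeerRateLimiter {
    max_events: usize,
    window_secs: u64,
    seen: HashMap<String, VecDeque<u64>>,
}

impl PeerRateLimiter {
    pub fn new(max_events: usize, window_secs: u64) -> Self {
        Self { max_events, window_secs, seen: HashMap::new() }
    }

    /// Records an event at `now_secs` and returns true if it may be relayed.
    pub fn allow(&mut self, peer: &str, now_secs: u64) -> bool {
        let events = self.seen.entry(peer.to_string()).or_default();
        // Drop timestamps that have aged out of the window.
        while let Some(&oldest) = events.front() {
            if now_secs.saturating_sub(oldest) >= self.window_secs {
                events.pop_front();
            } else {
                break;
            }
        }
        if events.len() >= self.max_events {
            return false; // over the limit: do not relay, do not record
        }
        events.push_back(now_secs);
        true
    }
}
```

With this shape, the limits from step 3 would be two instances: PeerRateLimiter::new(10, 60) for block headers and PeerRateLimiter::new(5, 60) for transactions. A long-running node would also need to evict idle peers from the map, which this sketch omits.
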
  • Add message sequence numbers: In core/archipelago/src/mesh/message_types.rs, add a sequence: u64 field to TypedEnvelope:

    1. Add the field to the struct (with #[serde(default)] for backwards compatibility with old messages).
    2. In the message creation code, increment a per-peer counter for each outgoing message.
    3. On receive, track the last seen sequence per peer and log out-of-order messages at debug! level.
    4. Do NOT reject out-of-order messages (mesh is unreliable), but allow upper layers to reorder if needed. Build and deploy.
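
The receive-side tracking in steps 3 and 4 might look like the sketch below. SequenceTracker and SeqStatus are hypothetical names; the tracker only classifies messages so the caller can log at debug! level, and it never rejects anything.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum SeqStatus {
    /// First message from this peer, or exactly last + 1.
    InOrder,
    /// Newer than expected: one or more messages were skipped or delayed.
    Gap,
    /// At or below the highest sequence already seen (late or duplicate).
    OutOfOrder,
}

/// Tracks the highest sequence number seen per peer.
#[derive(Default)]
pub struct SequenceTracker {
    last_seen: HashMap<String, u64>,
}

impl SequenceTracker {
    /// Classifies `seq` for `peer` and advances the high-water mark if needed.
    pub fn observe(&mut self, peer: &str, seq: u64) -> SeqStatus {
        match self.last_seen.get(peer).copied() {
            // Late arrivals never lower the high-water mark.
            Some(last) if seq <= last => SeqStatus::OutOfOrder,
            Some(last) => {
                self.last_seen.insert(peer.to_string(), seq);
                if seq == last + 1 { SeqStatus::InOrder } else { SeqStatus::Gap }
            }
            None => {
                self.last_seen.insert(peer.to_string(), seq);
                SeqStatus::InOrder
            }
        }
    }
}
```

Keeping only a high-water mark means duplicates and late messages are indistinguishable here; a replay-detection window would need more state than this sketch carries.
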
  • Verify Phase 4 — Mesh authentication active: Run these checks:

    1. grep -rn "verify_signature\|verify_strict" core/archipelago/src/mesh/ --include="*.rs" — should show verification calls in listener.rs and message_types.rs.
    2. grep -rn "validate_block_header\|validate_raw_transaction" core/archipelago/src/mesh/bitcoin_relay.rs — validation functions exist.
    3. cargo test --all-features — all mesh tests pass.
    4. cargo clippy --all-targets --all-features — zero warnings.
    5. Backend starts cleanly with mesh enabled.

============================================================

YEAR 1 — QUARTER 2: FRONTEND, NGINX, AND MEDIUM FIXES

============================================================


Phase 5: Frontend — XSS, Auth, and Input Validation

Layman version: The web interface has a few places where an attacker could inject malicious code into the page (XSS), steal login cookies, or redirect you to a fake site after login. We fix all of these and add proper input sanitization everywhere.

  • Fix v-html XSS in BootScreen and Settings: In neode-ui/src/components/BootScreen.vue line 55, replace v-html="icons[currentIcon]" with a safe rendering approach:

    1. Since the icons are hardcoded SVG strings, create a computed property that returns the current icon and use v-html with a DOMPurify sanitizer.
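    2. Verify the package exists first: npm view dompurify version.
    3. Install DOMPurify: cd neode-ui && npm install dompurify. Recent DOMPurify releases ship their own TypeScript types; add npm install -D @types/dompurify only if the installed version does not.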
    4. In BootScreen.vue:
      import DOMPurify from 'dompurify'
      const sanitizedIcon = computed(() => DOMPurify.sanitize(icons[currentIcon.value], { USE_PROFILES: { svg: true } }))
      
      Then use v-html="sanitizedIcon".
    5. In Settings.vue line 286, do the same for totpQrSvg:
      const sanitizedQrSvg = computed(() => DOMPurify.sanitize(totpQrSvg.value, { USE_PROFILES: { svg: true } }))
      
    6. Run npm run type-check to verify.
    7. Build and deploy. Verify boot screen animation still works. Verify TOTP QR code still renders on Settings page.
  • Fix FileBrowser cookie security flags: In neode-ui/src/api/filebrowser-client.ts line 62, find document.cookie = `auth=${this.token}; path=/app/filebrowser; SameSite=Strict`. This cookie is missing security flags. Since we can't set HttpOnly from JavaScript (that's a server-side flag), the best we can do client-side is:

    document.cookie = `auth=${this.token}; path=/app/filebrowser; SameSite=Strict; Secure`
    

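    The Secure flag ensures the cookie is only sent over HTTPS. Browsers also refuse to store a Secure cookie set from a plain-HTTP page, so this only works once the UI is served over HTTPS; when the UI is loaded from http://192.168.1.198 the cookie would be silently dropped. For the long term (Phase 13), the FileBrowser auth should be proxied through the backend so the cookie can be set server-side with HttpOnly. Also add an expiration so the cookie doesn't persist indefinitely: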

    const expires = new Date(Date.now() + 24 * 60 * 60 * 1000).toUTCString() // 24 hours
    document.cookie = `auth=${this.token}; path=/app/filebrowser; SameSite=Strict; Secure; expires=${expires}`
    

    Build and deploy. Verify FileBrowser still works (login, browse, download).

  • Hide TOTP secret by default: In neode-ui/src/views/Settings.vue, find line 289 with {{ totpSecretBase32 }}. Wrap it in a reveal toggle:

    1. Add a ref: const showTotpSecret = ref(false)
    2. Replace the display with:
      <div v-if="totpSecretBase32" class="mt-3">
        <p class="text-xs text-white/50 mb-1">Manual entry key (keep secret!):</p>
        <div v-if="showTotpSecret" class="flex items-center gap-2">
          <p class="text-sm font-mono text-orange-400 break-all">{{ totpSecretBase32 }}</p>
          <button class="glass-button text-xs px-2 py-1" @click="showTotpSecret = false">Hide</button>
        </div>
        <button v-else class="glass-button text-xs px-3 py-1" @click="showTotpSecret = true">
          Show manual entry key
        </button>
      </div>
      
    3. Remove the select-all class — users should deliberately copy, not accidentally select. Build and deploy. Verify TOTP setup flow still works.
  • Validate redirect URL after login: In neode-ui/src/router/index.ts, find line 231 with const redirectTo = (to.query.redirect as string) || '/dashboard'. Replace with:

    function isLocalRedirect(path: unknown): path is string {
      if (typeof path !== 'string') return false
      try {
        // Must be a relative path, not an absolute URL
        if (path.startsWith('//') || path.includes('://')) return false
        const url = new URL(path, window.location.origin)
        return url.origin === window.location.origin
      } catch {
        return false
      }
    }
    
    const redirectTo = isLocalRedirect(to.query.redirect) ? to.query.redirect : '/dashboard'
    

    Run npm run type-check. Build and deploy. Test: visit http://192.168.1.198/login?redirect=https://evil.com — after login should go to /dashboard, NOT evil.com. Visit http://192.168.1.198/login?redirect=/mesh — after login should go to /mesh.

  • Add input trimming to all auth fields: In neode-ui/src/views/Login.vue, find all password and input submissions. Add .trim() before sending:

    1. Search for password.value in the file. Wherever it's submitted via RPC (e.g., params: { password: password.value }), change to params: { password: password.value.trim() }.
    2. Do the same for TOTP code inputs, setup passwords, confirm passwords.
    3. Also check neode-ui/src/views/Settings.vue for password change forms — trim those too. Run npm run type-check. Build and deploy. Test login with a password that has trailing spaces — should still work.
  • Validate route parameters: In neode-ui/src/views/AppDetails.vue (line ~485) and neode-ui/src/views/AppSession.vue (line ~267), add app ID validation:

    1. Create a utility function in neode-ui/src/utils/ or inline:
      function isValidAppId(id: unknown): id is string {
        return typeof id === 'string' && /^[a-z0-9][a-z0-9-]*[a-z0-9]$/.test(id) && id.length <= 64
      }
      
    2. In each view's setup, validate the route param early:
      const appId = computed(() => {
        const id = route.params.id
        if (!isValidAppId(id)) {
          router.replace('/apps')
          return ''
        }
        return id
      })
      

    Build and deploy. Test: navigate to a valid app — should work. Navigate to /app/../../etc/passwd — should redirect to /apps.
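
For comparison with any backend-side check, the same rule can be written in Rust without a regex crate. This is only an illustration of the frontend pattern above (lowercase alphanumerics and hyphens, alphanumeric at both ends, at most 64 characters); the backend's existing AppId validation may differ, and is_valid_app_id is a hypothetical name here.

```rust
/// Mirrors the frontend pattern ^[a-z0-9][a-z0-9-]*[a-z0-9]$ with a
/// 64-character cap. Note the pattern implies a minimum length of 2.
pub fn is_valid_app_id(id: &str) -> bool {
    let b = id.as_bytes();
    let is_alnum = |c: &u8| c.is_ascii_lowercase() || c.is_ascii_digit();
    (2..=64).contains(&b.len())
        && is_alnum(&b[0])
        && is_alnum(&b[b.len() - 1])
        && b.iter().all(|c| is_alnum(c) || *c == b'-')
}
```

Because dots and slashes are outside the allowed alphabet, path-traversal strings such as ../../etc/passwd are rejected outright. Single-character IDs are also rejected, which is a property of the pattern worth confirming against the real app catalogue.
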

  • Verify Phase 5 — Frontend hardened: Run these checks:

    1. grep -rn "v-html" neode-ui/src/ --include="*.vue" | grep -v "DOMPurify\|sanitize" — any remaining v-html should be justified.
    2. grep -rn "select-all" neode-ui/src/ --include="*.vue" — TOTP secret should NOT have select-all.
    3. npm run type-check — zero errors.
    4. npm run build — builds successfully.
    5. Test login flow, TOTP setup, app navigation, FileBrowser at http://192.168.1.198.

Phase 6: Nginx — Security Headers & Rate Limiting

Layman version: The web server (nginx) is missing security headers that tell browsers how to protect users. We add headers that prevent clickjacking, content type confusion, and XSS. We also add rate limiting so attackers can't overwhelm the server with requests.

  • Fix Content Security Policy: In image-recipe/configs/nginx-archipelago.conf, find line ~14 with the existing CSP. Replace the CSP header with a strict version:

    add_header Content-Security-Policy "default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' data: blob:; font-src 'self' data:; connect-src 'self' ws: wss:; frame-src 'self'; frame-ancestors 'self'; base-uri 'self'; form-action 'self';" always;
    

    Note: 'unsafe-inline' for styles is needed because Vue scoped styles sometimes inject inline styles. 'unsafe-eval' is removed — if the app breaks, it means some JS is using eval() which should be fixed in code instead. Deploy the nginx config. Test the web UI thoroughly — if anything breaks, check browser console for CSP violations and adjust the policy minimally.

  • Replace X-Frame-Options stripping with SAMEORIGIN: In image-recipe/configs/snippets/archipelago-https-app-proxies.conf, find all 38 occurrences of proxy_hide_header X-Frame-Options;. For each one, add after it:

    add_header X-Frame-Options "SAMEORIGIN" always;
    

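    This allows Archipelago's own UI to iframe apps but blocks external sites from framing them. Note that nginx inherits add_header directives from the server block only when a location defines none of its own, so every location that gains this add_header must also repeat any other security headers it needs. Do the same in the HTTP config in nginx-archipelago.conf. Deploy and test: open an app in the Archipelago iframe, which should still load.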

  • Add HSTS header: In image-recipe/configs/nginx-archipelago.conf, add to the HTTPS server block (or main server block if using HTTPS):

    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    

    Note: Do NOT add preload — this is a local server, not a public domain.

  • Add rate limiting to RPC endpoint: In image-recipe/configs/nginx-archipelago.conf, add at the top (before the server block):

    # Rate limit zones
    limit_req_zone $binary_remote_addr zone=rpc:10m rate=20r/s;
    limit_req_zone $binary_remote_addr zone=auth:10m rate=3r/s;
    

    Then in the /rpc/ location block, add:

    limit_req zone=rpc burst=40 nodelay;
    limit_req_status 429;
    

    For auth-specific endpoints, apply stricter limits in the backend or add a separate location for auth RPCs. Deploy and test: normal UI use should work fine. Rapid-fire requests should get 429 responses.
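
To reason about what rate=20r/s with burst=40 nodelay admits, the following is a simplified leaky-bucket model, written for illustration only. It is an assumption-laden approximation of how such a limiter behaves, not nginx's implementation, and the LeakyBucket type is hypothetical.

```rust
/// Simplified leaky-bucket model with a burst allowance and no delay:
/// requests beyond the allowance are rejected rather than queued.
pub struct LeakyBucket {
    rate_per_sec: u64,
    burst: u64,
    /// Current backlog, in thousandths of a request.
    excess_milli: u64,
    last_ms: Option<u64>,
}

impl LeakyBucket {
    pub fn new(rate_per_sec: u64, burst: u64) -> Self {
        Self { rate_per_sec, burst, excess_milli: 0, last_ms: None }
    }

    /// Returns true if a request arriving at `now_ms` is admitted.
    pub fn allow(&mut self, now_ms: u64) -> bool {
        let Some(last) = self.last_ms else {
            // First request from this client: nothing has accumulated yet.
            self.last_ms = Some(now_ms);
            return true;
        };
        let elapsed_ms = now_ms.saturating_sub(last);
        // rate (req/s) * elapsed (ms) is the drained amount in milli-requests.
        let drained = self.rate_per_sec.saturating_mul(elapsed_ms);
        let excess = (self.excess_milli + 1000).saturating_sub(drained);
        if excess > self.burst.saturating_mul(1000) {
            return false; // the limiter would answer 429 here
        }
        self.excess_milli = excess;
        self.last_ms = Some(now_ms);
        true
    }
}
```

Under this model, 100 requests arriving at the same instant see 41 admitted (the first plus a burst of 40), and one second later another 20 are admitted as the bucket drains at the configured rate. Normal UI traffic should stay well inside that envelope.
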

  • Add remaining security headers: In image-recipe/configs/nginx-archipelago.conf, add to the server block:

    add_header X-Content-Type-Options "nosniff" always;
    add_header X-DNS-Prefetch-Control "off" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;
    add_header Permissions-Policy "camera=(), microphone=(), geolocation=(), payment=()" always;
    

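    Deploy and verify: curl -sI http://192.168.1.198 | grep -i "x-content\|referrer\|permissions\|strict-transport". If HSTS was added only to the HTTPS server block, it will not appear in a plain-HTTP response; check it with curl -skI https://192.168.1.198 instead.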

  • Verify Phase 6 — Nginx hardened: Run these checks from another machine:

    1. curl -sI http://192.168.1.198 | grep -i "content-security-policy" — CSP header present, no unsafe-eval.
    2. curl -sI http://192.168.1.198 | grep -i "x-content-type": nosniff present.
    3. curl -sI http://192.168.1.198 | grep -i "x-frame-options" — present on app proxies.
    4. curl -sI http://192.168.1.198 | grep -i "referrer-policy" — present.
    5. Rate limit test: for i in $(seq 1 100); do curl -s -o /dev/null -w "%{http_code}\n" http://192.168.1.198/rpc/v1; done | sort | uniq -c — should show some 429s.
    6. All UI features still work normally.

Phase 7: Backend — MEDIUM Severity Fixes

Layman version: These fixes improve defense-in-depth. They're not immediately exploitable like the critical bugs, but they close gaps that a sophisticated attacker could chain together. Think of it as adding deadbolts after fixing the broken window.

  • Add zeroization to SecretsManager: In core/security/src/secrets_manager.rs, the encryption key stays in memory for the lifetime of the struct. Add zeroization on drop:

    1. Add zeroize dependency to core/security/Cargo.toml if not present: zeroize = { version = "1", features = ["derive"] }.
    2. Wrap the key material in a zeroizing wrapper. Since Aes256Gcm doesn't implement Zeroize, store the raw key separately:
      use std::path::PathBuf;
      use aes_gcm::Aes256Gcm;
      use zeroize::Zeroizing;
      
      pub struct SecretsManager {
          secrets_dir: PathBuf,
          cipher: Aes256Gcm,
          // Separate copy of the key bytes, wiped when the struct is dropped.
          raw_key: Zeroizing<[u8; 32]>,
      }
      
    3. In the constructor, store the key bytes before creating the cipher, and wrap in Zeroizing. Build and test: secrets should still encrypt/decrypt correctly.
  • Replace thread_rng with OsRng in secrets manager: In core/security/src/secrets_manager.rs, find lines 64 and 221 where rand::thread_rng().fill_bytes() is used. Replace with:

    use rand::rngs::OsRng;
    use rand::RngCore; // trait that provides fill_bytes (skip if already imported)
    OsRng.fill_bytes(&mut nonce_bytes);  // Line 64
    OsRng.fill_bytes(&mut new_secret_bytes);  // Line 221
    

    Build and test.

  • Encrypt the remember-me HMAC secret: In core/archipelago/src/session.rs, find lines 395-403 where the remember-me secret is stored as plaintext. Encrypt it using the secrets manager:

    1. Instead of std::fs::write(REMEMBER_SECRET_FILE, &secret), use the SecretsManager to encrypt the secret before writing.
    2. On read, decrypt using SecretsManager.
    3. If SecretsManager is not available at that point in the boot sequence, derive the secret from a combination of machine-specific data (e.g., /etc/machine-id + salt) using Argon2, so it's different per installation but deterministic. Build, deploy, and test: remember-me login should still work after restart.
  • Use checked arithmetic for Bitcoin amounts: In core/archipelago/src/wallet/ecash.rs line 64, replace the .sum() with checked addition:

    pub fn balance(&self) -> u64 {
        self.tokens.iter()
            .filter(|t| !t.spent)
            .try_fold(0u64, |acc, t| acc.checked_add(t.amount_sats))
            .unwrap_or(u64::MAX) // Saturate on overflow rather than wrapping
    }
    

    Search for other .sum() calls on monetary amounts: grep -rn "\.sum()" core/ --include="*.rs". Fix any that operate on u64 Bitcoin amounts. Build and test.

  • Create validated AppId newtype: In core/archipelago/src/api/rpc/container.rs, create a newtype for app IDs:

    #[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
    #[serde(try_from = "String")] // route deserialization through AppId::new
    pub struct AppId(String);
    
    impl AppId {
        pub fn new(id: &str) -> Result<Self> {
            // Only allow lowercase alphanumeric + hyphens, 1-64 chars
            if id.is_empty() || id.len() > 64 {
                anyhow::bail!("App ID must be 1-64 characters");
            }
            if !id.chars().all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-') {
                anyhow::bail!("App ID must contain only lowercase letters, digits, and hyphens");
            }
            if id.starts_with('-') || id.ends_with('-') || id.contains("--") {
                anyhow::bail!("App ID must not start/end with hyphen or contain consecutive hyphens");
            }
            Ok(Self(id.to_string()))
        }
    
        pub fn as_str(&self) -> &str { &self.0 }
    }
    
    // Without this impl, the derived Deserialize would accept any string unchecked.
    impl TryFrom<String> for AppId {
        type Error = anyhow::Error;
        fn try_from(id: String) -> Result<Self> { Self::new(&id) }
    }
    
    impl std::fmt::Display for AppId {
        fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
            write!(f, "{}", self.0)
        }
    }
    

    Use AppId in RPC handler signatures where app IDs are accepted. A plain derived Deserialize on a newtype does not call AppId::new; the #[serde(try_from = "String")] attribute is what makes the deserializer validate automatically, so invalid IDs are rejected before the handler runs. Build and fix all compilation errors from the type change. Deploy and test app operations.

  • Validate Tor service names: In core/archipelago/src/api/rpc/tor.rs, find lines 426-427 where name is used in path operations. Add validation:

    fn validate_service_name(name: &str) -> Result<()> {
        if name.is_empty() || name.len() > 64 {
            anyhow::bail!("Service name must be 1-64 characters");
        }
        if !name.chars().all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_') {
            anyhow::bail!("Service name must contain only alphanumeric characters, hyphens, and underscores");
        }
        Ok(())
    }
    

    Call validate_service_name(&name)?; before any filesystem operation with the name. Build and deploy.

  • Add per-user rate limiting on CPU-intensive RPC endpoints: In core/archipelago/src/api/rpc/mod.rs, add a rate limiter for expensive operations:

    1. Add a simple token-bucket rate limiter using a HashMap<String, (Instant, u32)> behind a Mutex.
    2. Apply rate limits to: backup.create (1/minute), container install/uninstall (5/minute), auth.totp.setup (3/minute), password change (3/minute).
    3. Return HTTP 429 with a Retry-After header when rate limited. Build and deploy. Test: rapid-fire backup requests should be throttled.
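
    The limiter described in the steps above could look roughly like the following sketch. It is a minimal illustration using only the standard library, not the project's actual implementation: the RateLimiter type, its check method, and the key format are hypothetical names. Note that a HashMap<String, (Instant, u32)> naturally gives a fixed-window counter rather than a true token bucket, which is what this sketch implements. The caller passes in the current time so the logic is easy to test.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Fixed-window limiter: at most `max_calls` per `window` for each key.
/// (Hypothetical type for illustration.)
pub struct RateLimiter {
    max_calls: u32,
    window: Duration,
    // key -> (start of the current window, calls counted in that window)
    state: Mutex<HashMap<String, (Instant, u32)>>,
}

impl RateLimiter {
    pub fn new(max_calls: u32, window: Duration) -> Self {
        Self { max_calls, window, state: Mutex::new(HashMap::new()) }
    }

    /// Returns Ok(()) if the call is allowed, or Err(wait) with the time
    /// remaining until the window resets (usable for a Retry-After header).
    pub fn check(&self, key: &str, now: Instant) -> Result<(), Duration> {
        // Recover the map even if another thread panicked while holding the lock.
        let mut state = self.state.lock().unwrap_or_else(|e| e.into_inner());
        let entry = state.entry(key.to_string()).or_insert((now, 0));

        // Start a fresh window once the old one has fully elapsed.
        if now.duration_since(entry.0) >= self.window {
            *entry = (now, 0);
        }

        if entry.1 < self.max_calls {
            entry.1 += 1;
            Ok(())
        } else {
            Err(self.window.saturating_sub(now.duration_since(entry.0)))
        }
    }
}
```

    A handler would call something like limiter.check("user-id:backup.create", Instant::now()) and translate an Err(wait) into a 429 response. A fixed window allows short bursts at window boundaries; if that matters, a token bucket or sliding window would be the stricter choice.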
  • Implement backup recovery codes: In core/archipelago/src/auth.rs or session.rs, add recovery code generation during initial setup:

    1. Generate 8 random recovery codes (each 8 characters, alphanumeric) during password setup.
    2. Hash them with SHA-256 and store the hashes in /var/lib/archipelago/recovery-codes.json.
    3. Display the codes to the user once (they must write them down).
    4. Add an RPC endpoint auth.recover that accepts a recovery code, verifies against stored hashes, and allows password reset.
    5. Each code is single-use — delete the hash after successful use. Build, deploy, and test the full flow.
  • Verify Phase 7 — Backend medium fixes complete: Run these checks:

    1. cargo clippy --all-targets --all-features — zero warnings.
    2. cargo test --all-features — all tests pass.
    3. grep -rn "thread_rng" core/security/ --include="*.rs" — zero results.
    4. Backend starts cleanly after deploy.
    5. All UI features work: login, remember-me, app install, settings.

Phase 8: Mesh — MEDIUM Fixes & Atomic State

Layman version: The encrypted messaging system has some edge cases where a crash at the wrong moment could weaken security, and emergency alerts can be faked. We fix the crash safety and add signature checks to alerts.

  • Add alert signature verification on receive: In core/archipelago/src/mesh/listener.rs, find where emergency alerts are processed. Before displaying or relaying an alert:

    // Verify the alert is actually signed by the claimed peer
    let peer_pubkey = resolve_peer_pubkey(&envelope.sender)?;
    if !envelope.verify_signature(&peer_pubkey)? {
        tracing::warn!(
            claimed_sender = %envelope.sender,
            "Dropping emergency alert with invalid signature — possible spoofing attempt"
        );
        continue; // Skip this alert
    }
    

    Build and test.

  • Implement atomic ratchet state persistence: In core/archipelago/src/mesh/session.rs, find lines 156-159 where ratchet state is saved. Replace with atomic write (write to temp file, then rename):

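    async fn save_session_atomic(&self, did: &str, state: &RatchetState) -> Result<()> {
        use tokio::io::AsyncWriteExt;
    
        let path = self.session_path(did);
        let tmp_path = path.with_extension("tmp");
    
        let data = serde_json::to_vec(state)
            .context("Failed to serialize ratchet state")?;
    
        // Write and flush to disk before the rename, so the final name never
        // points at a partially written file after a crash or power loss.
        let mut file = tokio::fs::File::create(&tmp_path).await
            .context("Failed to create temporary ratchet state")?;
        file.write_all(&data).await
            .context("Failed to write temporary ratchet state")?;
        file.sync_all().await
            .context("Failed to sync temporary ratchet state")?;
    
        tokio::fs::rename(&tmp_path, &path).await
            .context("Failed to atomically rename ratchet state file")?;
    
        Ok(())
    }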
    

    This ensures that a crash during write leaves either the old state (intact) or the new state (complete), never a partial/corrupt file. Build and test.

  • Encrypt GPS in dead man's switch alerts: In core/archipelago/src/mesh/alerts.rs, find where GPS coordinates are included in alerts. Encrypt the GPS data for intended recipients only:

    1. Make GPS optional in the alert struct: gps: Option<EncryptedGps>.
    2. When creating an alert, encrypt GPS coordinates using each trusted peer's public key.
    3. Only intended recipients can decrypt the GPS. Other mesh relayers see the alert but not the location. Build and test.
  • Systematic unwrap audit in mesh code: Run grep -rn "\.unwrap()\|\.expect(" core/archipelago/src/mesh/ --include="*.rs" | grep -v "mod tests" | grep -v "#\[test\]". For each occurrence:

    1. If it's in message parsing/deserialization — replace with ? (incoming data is untrusted).
    2. If it's after a guaranteed check (e.g., if x.is_some() { x.unwrap() }) — refactor to if let Some(v) = x.
    3. If it's truly infallible (e.g., regex compilation of a literal) — add // SAFETY: literal regex cannot fail comment. Target: reduce unwrap/expect in non-test mesh code to under 20, all documented. Build and run full test suite.
  • Verify Phase 8 — Mesh hardened: Run these checks:

    1. cargo test --all-features — all tests pass.
    2. grep -rn "\.unwrap()\|\.expect(" core/archipelago/src/mesh/ --include="*.rs" | grep -v test | wc -l should print a total under 20 (grep -c alone prints one count per file, not a total).
    3. Backend starts cleanly with mesh enabled.
    4. No ratchet state .tmp files left behind: ls /var/lib/archipelago/mesh/sessions/*.tmp should report no matching files.

============================================================

YEAR 1 — QUARTER 3: PRODUCTION FEATURES & INFRASTRUCTURE

============================================================


Phase 9: Tor-by-Default Integration

Layman version: Currently, Tor is optional. Competitors like Start9 and nix-bitcoin route all traffic through Tor by default for maximum privacy. We match this by making Tor the default for all Bitcoin and Lightning network connections.

  • Install and configure Tor on first boot: In scripts/first-boot-containers.sh, add a Tor container (or system service) that starts before other services:

    1. Add a Tor container or verify the system Tor service is installed and enabled.
    2. Configure Tor with a SocksPort on 127.0.0.1:9050.
    3. Add hidden service configs for: web UI (port 80), LND (port 8081), Bitcoin P2P (port 8333).
    4. Save the generated .onion addresses to /var/lib/archipelago/tor-hostnames/.
  • Route Bitcoin Core through Tor by default: Add -proxy=127.0.0.1:9050 and -onlynet=onion to bitcoin-knots container flags. This routes all P2P connections through Tor, hiding the node's IP address from the Bitcoin network. Test: sudo podman exec bitcoin-knots bitcoin-cli getnetworkinfo should report onion as the only reachable network, and sudo podman exec bitcoin-knots bitcoin-cli getpeerinfo should show "network": "onion" for every peer.

  • Route LND through Tor: Configure LND to use Tor for all connections. Add --tor.active --tor.socks=127.0.0.1:9050 to LND start flags. Verify LND peers are connected via Tor.

  • Add .onion URL display in web UI: In neode-ui/src/views/Settings.vue, add a section showing the node's .onion address for remote access via Tor Browser.

  • Add Tor toggle in settings: Allow users to disable Tor if they prefer clearnet (some use cases require it). Default should be Tor-on.

  • Verify Phase 9 — Tor active: Bitcoin peers are onion-only, LND via Tor, .onion address displayed in UI.


Phase 10: Encrypted Backup System

Layman version: If your hardware dies, you lose everything — Bitcoin wallet, Lightning channels, all app data. We build an encrypted backup system so you can restore to new hardware. Start9 has this; we need it too.

  • Design backup manifest: Create a backup manifest that lists what to back up per app: data directories, config files, secrets. Store in apps/{app-id}/manifest.yml under a backup: key.

  • Implement encrypted backup creation: Add an RPC endpoint backup.create that:

    1. Snapshots all app data directories using tar.
    2. Encrypts the tarball with AES-256-GCM using a key derived from the user's master password with Argon2.
    3. Saves to a configurable destination (local USB, network share, etc.).
    4. Shows progress in the UI.
  • Implement encrypted backup restore: Add an RPC endpoint backup.restore that:

    1. Accepts a backup file and the master password.
    2. Decrypts and verifies integrity.
    3. Stops affected containers, restores data, restarts containers.
    4. Handles version migration if backup is from an older version.
  • Add scheduled backups: Allow users to configure automatic backups (daily/weekly) to external storage.

  • Verify Phase 10 — Backup/restore works: Create a backup, delete an app's data, restore from backup, verify app works.


Phase 11: Automated Update System

Layman version: Currently, updates require SSH access and running a script manually. Users need a "click to update" button like Umbrel has. We build this with atomic updates that can roll back if something breaks.

  • Design update architecture: Plan the update mechanism:

    1. Backend checks for updates by fetching a signed manifest from a known URL (or local file for air-gapped).
    2. Updates are downloaded as delta tarballs (frontend + backend binary).
    3. Applied atomically: new binary placed alongside old, symlink swapped.
    4. Rollback: if health check fails after update, swap symlink back.
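
    The symlink swap and rollback in steps 3 and 4 can be sketched as follows. This is a minimal, Unix-only illustration using the standard library; swap_symlink, apply_with_rollback, and the "current" link layout are hypothetical names, and the health check is passed in as a closure rather than tied to systemd. The key idea is to create the new symlink under a temporary name and rename it over the live one, since a rename replaces the old link in a single step.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::{Path, PathBuf};

/// Point `link` at `target` atomically. Returns the previous target, if any,
/// so the caller can roll back. (Hypothetical helper for illustration.)
pub fn swap_symlink(link: &Path, target: &Path) -> io::Result<Option<PathBuf>> {
    let previous = fs::read_link(link).ok();
    let tmp = link.with_extension("new");

    // Clear a stale temporary link left behind by an interrupted attempt.
    match fs::remove_file(&tmp) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }

    symlink(target, &tmp)?;
    // rename replaces the existing link in one step, so readers always see
    // either the old target or the new one.
    fs::rename(&tmp, link)?;
    Ok(previous)
}

/// Switch to `new_target`, then restore the previous target if `healthy`
/// reports failure. Returns true if the update was kept.
pub fn apply_with_rollback(
    link: &Path,
    new_target: &Path,
    healthy: impl Fn() -> bool,
) -> io::Result<bool> {
    let previous = swap_symlink(link, new_target)?;
    if healthy() {
        return Ok(true);
    }
    if let Some(old) = previous {
        swap_symlink(link, &old)?;
    }
    Ok(false)
}
```

    In a real update flow the closure would restart the service and poll its health endpoint; here it is left abstract so the swap logic can be tested on its own.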
  • Implement update check RPC endpoint: Add system.check_updates that fetches the update manifest and returns available version + changelog.

  • Implement update apply RPC endpoint: Add system.apply_update that downloads, verifies signature, applies, and restarts.

  • Add rollback mechanism: If the backend fails to start after update (health check via systemd), automatically roll back to previous binary.

  • Add update UI in Settings: Show current version, available updates, changelog, and "Update Now" button with progress indicator.

  • Verify Phase 11 — Updates work: Simulate an update (place a new binary version), apply it, verify the system comes back up. Simulate a bad update, verify rollback.


Phase 12: App Ecosystem Expansion

Layman version: We have ~15 apps. The Bitcoin essentials are covered, but users expect at least 30 apps to compete with Start9/RaspiBlitz. We add the most-requested apps with proper security hardening.

  • Add missing essential Bitcoin apps: Ensure these are available and work out of the box:

    1. Fulcrum (Electrum server alternative — faster than Electrs for large wallets)
    2. Thunderhub (Lightning management — alternative to Ride the Lightning)
    3. LNbits (Lightning toolkit with extensions)
    4. Lightning Terminal (Loop, Pool, Faraday in one UI)
    5. Specter Desktop (multisig wallet management)
  • Add privacy-enhancing apps:

    1. JoinMarket / JAM (CoinJoin — RaspiBlitz has this, we should too)
    2. Whirlpool CLI (if legally permissible post-Samourai)
  • Add self-hosting essentials:

    1. Matrix / Synapse (decentralized chat)
    2. Gitea (self-hosted Git)
    3. WireGuard (VPN — nix-bitcoin has this)
  • Harden all new app manifests: Every new app must have:

    • readonly_root: true
    • cap_drop: ALL + only required caps added
    • Non-root user (UID > 1000)
    • no-new-privileges: true
    • Pinned image by SHA256 digest
    • Health check configured
  • Verify Phase 12 — All apps work: Install each new app, verify it starts, verify the UI loads, verify it connects to Bitcoin/Lightning if needed.


============================================================

YEAR 1 — QUARTER 4: PRODUCTION READINESS

============================================================


Phase 13: Advanced Security Hardening

Layman version: We've fixed all the known bugs. Now we add proactive security measures — things that prevent entire classes of bugs from being exploitable, even if new bugs are introduced later.

  • Add Content Security Policy nonce support: Replace 'unsafe-inline' in CSP with nonce-based script loading. This requires the backend to generate a random nonce per page load and inject it into both the CSP header and the script tags.

  • Implement session timeout: In core/archipelago/src/session.rs, add configurable session timeout (default 24 hours, configurable in settings). Auto-expire sessions that haven't been active.
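
    As a rough sketch of the idle-timeout check, assuming sessions are tracked in memory by token: the SessionStore type and its methods below are hypothetical and do not reflect the actual structure of session.rs. Each successful check refreshes the last-active time, and a session that has been idle for the full timeout is removed.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Tracks the last activity time per session token.
/// (Hypothetical type for illustration.)
pub struct SessionStore {
    idle_timeout: Duration,
    last_active: HashMap<String, Instant>,
}

impl SessionStore {
    pub fn new(idle_timeout: Duration) -> Self {
        Self { idle_timeout, last_active: HashMap::new() }
    }

    pub fn create(&mut self, token: &str, now: Instant) {
        self.last_active.insert(token.to_string(), now);
    }

    /// Returns true and refreshes the session if it is still valid.
    /// Expired sessions are removed and false is returned.
    pub fn touch(&mut self, token: &str, now: Instant) -> bool {
        let expired = match self.last_active.get(token) {
            Some(seen) => now.duration_since(*seen) >= self.idle_timeout,
            None => return false, // unknown token
        };
        if expired {
            self.last_active.remove(token);
            false
        } else {
            self.last_active.insert(token.to_string(), now);
            true
        }
    }
}
```

    With the 24-hour default, the store would be built as SessionStore::new(Duration::from_secs(24 * 60 * 60)) and touch would be called on every authenticated request.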

  • Add "active sessions" management: Show all active sessions in the Settings UI with last-active time and IP. Allow users to terminate individual sessions or "log out everywhere."

  • Require re-authentication for sensitive operations: Password change, 2FA setup/disable, and recovery code regeneration should require entering the current password, even if already logged in.

  • Implement audit logging: Log all security-relevant events (login, logout, failed login, password change, 2FA change, app install/uninstall) to a dedicated audit log file with timestamps and source IPs.
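
    One possible shape for an audit log entry is an append-only text file with one event per line. The sketch below is an assumption, not the project's format: format_audit_line, append_audit, and the key=value layout are hypothetical. It strips control characters from the free-text detail so that attacker-controlled input (for example a username) cannot inject forged log lines.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::net::IpAddr;
use std::path::Path;
use std::time::{SystemTime, UNIX_EPOCH};

/// Build one log line: "<unix seconds> event=<name> ip=<addr> detail="<text>"".
/// (Hypothetical format for illustration.)
pub fn format_audit_line(unix_secs: u64, event: &str, source: IpAddr, detail: &str) -> String {
    // Drop control characters (including newlines) and neutralize quotes so
    // user-supplied text cannot break out of the detail field.
    let clean: String = detail.chars().filter(|c| !c.is_control()).collect();
    format!(
        "{} event={} ip={} detail=\"{}\"",
        unix_secs,
        event,
        source,
        clean.replace('"', "'")
    )
}

/// Append one event to the audit log, creating the file if needed.
pub fn append_audit(path: &Path, event: &str, source: IpAddr, detail: &str) -> io::Result<()> {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0); // clock set before 1970: fall back to 0
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(file, "{}", format_audit_line(now, event, source, detail))
}
```

    A failed login could then be recorded with append_audit(path, "login_failed", ip, "user=admin"). File permissions and log rotation are left out of this sketch.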

  • Verify Phase 13: Session timeout works, active sessions visible, re-auth required for sensitive ops, audit log populated.


Phase 14: ISO Build Hardening

Layman version: The ISO installer creates the initial system. We harden it so that a freshly installed Archipelago is secure out of the box — no manual hardening needed.

  • Force password change on first boot: The installer should require setting a unique admin password. No default passwords should work after first boot.

  • Enable automatic security updates for the OS: Configure unattended-upgrades for Debian security patches only (not full upgrades).

  • Harden SSH configuration: In the installed system's sshd_config:

    1. Disable password authentication (key-only).
    2. Disable root login.
    3. Use ed25519 host keys only. Note: This is for the PRODUCTION installed system, not the dev server.
  • Configure firewall (UFW): Enable UFW on first boot with:

    • Allow: 80 (HTTP), 443 (HTTPS), 8333 (Bitcoin P2P), 9735 (Lightning P2P)
    • Allow: Podman container networking (forward policy ACCEPT)
    • Deny: everything else by default
  • Pin all container images in first-boot script by SHA256 digest: Replace any remaining :latest or version-only tags with image@sha256:... digests. Document how to update digests when new versions are released.

  • Verify Phase 14: Flash a test ISO, boot it, verify all hardening is active, verify apps work.


Phase 15: Penetration Test Round 1

Layman version: We've fixed everything we know about. Now we try to break in ourselves to find what we missed. This is a structured attempt to attack the system from different angles.

  • Network-level testing: From another machine on the LAN:

    1. Port scan: nmap -sV 192.168.1.198 — only expected ports should be open.
    2. Try accessing Bitcoin RPC directly: curl http://192.168.1.198:8332 — should fail.
    3. Try accessing container ports that shouldn't be exposed.
    4. Test rate limiting: spam the login endpoint.
  • Web application testing:

    1. Test for XSS: inject <script>alert(1)</script> in every input field.
    2. Test for CSRF: craft cross-origin POST to /rpc/v1 from a different origin — should fail.
    3. Test for open redirect: ?redirect=https://evil.com — should not redirect externally.
    4. Test for path traversal: ../../etc/passwd in app IDs, file paths.
    5. Check CSP: browser console should show no violations during normal use.
    6. Check cookies: all session cookies should have Secure, SameSite flags.
  • Authentication testing:

    1. Brute force login: 100 rapid login attempts — should be rate limited.
    2. Session invalidation: reuse an old session token after logout. It should be rejected.
    3. TOTP bypass: try using old TOTP codes — should fail (replay protection).
    4. Remember-me token: should not work after password change.
  • Container escape testing:

    1. Verify all containers run as non-root: sudo podman inspect --format '{{.Config.User}}' $(sudo podman ps -q).
    2. Verify read-only root: sudo podman exec {container} touch /test-file — should fail.
    3. Verify no capabilities beyond required: sudo podman inspect --format '{{.HostConfig.CapDrop}} {{.HostConfig.CapAdd}}' $(sudo podman ps -q).
  • Document all findings: Create a test report with pass/fail for each test. Fix any failures found.


Phase 16: Documentation & User Guides

Layman version: The best security in the world is useless if users can't set it up correctly. We write clear guides so anyone can install, configure, and maintain their node securely.

  • Write installation guide: Step-by-step guide from downloading the ISO to first login.

  • Write security best practices guide: How to keep your node secure — password strength, 2FA setup, backup procedures, network security.

  • Write app integration guide: How each app connects to Bitcoin/Lightning, what data it stores, how to back it up.

  • Write recovery guide: What to do if you lose your password, how to restore from backup, how to migrate to new hardware.

  • Verify Phase 16: Have someone unfamiliar with the project follow the guides and report any confusion.


============================================================

YEAR 2 — QUARTERS 1-2: POLISH, SCALE, AND ADVANCED FEATURES

============================================================


Phase 17: Reproducible Builds

Layman version: Users should be able to verify that the binary they're running was built from the exact source code they can read. This prevents supply chain attacks — nobody can sneak in malicious code without it being visible in the source.

  • Containerized build environment: Create a Dockerfile that builds the Rust backend and Vue frontend in a deterministic environment (pinned Rust version, pinned Node version, pinned system libraries).

  • Publish build checksums: After each release build, publish SHA256 checksums of all artifacts (backend binary, frontend bundle, ISO image).

  • Document verification process: Write instructions for users to verify their installed binary matches the published checksum.

  • Verify Phase 17: Build the same commit twice in the containerized environment — checksums should match.


Phase 18: Mobile Companion & Remote Access

Layman version: Umbrel has a mobile app. Start9 uses Tor .onion addresses for remote access. We need at least one of these so users can check on their node from their phone.

  • Implement Tor hidden service for web UI: The web UI should be accessible via a .onion address from Tor Browser on any device, anywhere in the world, without port forwarding.

  • Optimize web UI for mobile: Make the Vue UI responsive for phone-sized screens. Test on iOS Safari and Android Chrome.

  • Add PWA support: Make the web UI installable as a Progressive Web App on mobile devices.

  • Verify Phase 18: Access the node via Tor Browser on a phone. Install as PWA. All core features work on mobile.


Phase 19: CoinJoin Integration

Layman version: RaspiBlitz has JoinMarket, RoninDojo had Whirlpool. CoinJoin is essential for Bitcoin privacy — it mixes your coins with others so transactions can't be traced back to you.

  • Integrate JoinMarket/JAM: Add JoinMarket as a containerized app with the JAM web UI. Auto-connect to the local Bitcoin Core instance.

  • Add CoinJoin guide: Document how to use JoinMarket for privacy, including maker/taker roles and fee settings.

  • Verify Phase 19: JoinMarket starts, connects to Bitcoin Core, JAM UI accessible, can create a test CoinJoin (testnet or small amount).


Phase 20: Advanced Mesh Features

Layman version: The mesh networking is already unique. Now we polish it — make it more reliable, add peer reputation (trust peers who send valid data), and improve the steganography to resist more sophisticated analysis.

  • Implement peer reputation system: Track which peers send valid vs invalid data. Peers that consistently send valid block headers get higher trust scores. Peers that send invalid data get deprioritized.
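
    A simple way to picture the scoring is a clamped per-peer counter where invalid data costs far more than valid data earns. The sketch below is illustrative only: the PeerReputation type, the reward and penalty constants, and the score bounds are all assumptions, not values from the mesh code.

```rust
use std::collections::HashMap;

// Illustrative constants (assumptions, not project values): one bad message
// undoes ten good ones, and scores are bounded so history cannot grow forever.
const VALID_REWARD: i32 = 1;
const INVALID_PENALTY: i32 = 10;
const MIN_SCORE: i32 = -100;
const MAX_SCORE: i32 = 100;

/// Per-peer trust scores. (Hypothetical type for illustration.)
#[derive(Default)]
pub struct PeerReputation {
    scores: HashMap<String, i32>,
}

impl PeerReputation {
    /// Record the outcome of validating one message from `peer`.
    pub fn record(&mut self, peer: &str, valid: bool) {
        let score = self.scores.entry(peer.to_string()).or_insert(0);
        let delta = if valid { VALID_REWARD } else { -INVALID_PENALTY };
        *score = (*score + delta).clamp(MIN_SCORE, MAX_SCORE);
    }

    /// Current score; unknown peers start at 0.
    pub fn score(&self, peer: &str) -> i32 {
        self.scores.get(peer).copied().unwrap_or(0)
    }

    /// Peers ordered from most to least trusted (ties broken by name).
    pub fn ranked(&self) -> Vec<(&str, i32)> {
        let mut peers: Vec<(&str, i32)> =
            self.scores.iter().map(|(k, s)| (k.as_str(), *s)).collect();
        peers.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));
        peers
    }
}
```

    Relay logic could then prefer peers from the front of ranked() and skip those below some threshold. Persisting scores across restarts and decaying them over time are design questions this sketch does not address.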

  • Improve steganography resistance: Add timing jitter to mesh transmissions so traffic patterns don't reveal communication. Vary message sizes to resist traffic analysis.

  • Add mesh health dashboard: Show mesh network status, connected peers, message latency, relay statistics in the web UI.

  • Verify Phase 20: Mesh connects, messages relay, peer reputation tracks correctly, steganography modes work.


============================================================

YEAR 2 — QUARTERS 3-4: FINAL HARDENING & v1.0

============================================================


Phase 21: Penetration Test Round 2

Layman version: We did this in Phase 15 with the early fixes. Now we repeat it with the full production system including all new features. This is the final check before v1.0.

  • Repeat all Phase 15 tests: Network, web, auth, container — every test from Phase 15.

  • Test new features: Tor access, backup/restore, updates, CoinJoin, mesh.

  • Test adversarial mesh scenarios:

    1. Rogue peer sending fake identities — should be rejected (Phase 4 fix).
    2. Rogue peer sending invalid Bitcoin data — should be filtered (Phase 4 fix).
    3. Rogue peer sending fake emergency alerts — should be rejected (Phase 8 fix).
    4. Replay attack on mesh messages — sequence numbers should detect.
  • Test disaster recovery:

    1. Kill the server during a backup — verify partial backups are handled safely.
    2. Kill the server during an update — verify rollback works.
    3. Corrupt the ratchet state file — verify atomic persistence prevented data loss (Phase 8 fix).
    4. Lose the admin password — verify recovery codes work (Phase 7 fix).
  • Document all findings and fix any issues.


Phase 22: Dependency Audit & Supply Chain

Layman version: Our code might be secure, but if a library we depend on has a vulnerability, we're still exposed. We audit every dependency.

  • Run cargo audit: cd core && cargo install cargo-audit && cargo audit. Fix or document all advisories.

  • Run npm audit: cd neode-ui && npm audit. Fix all critical and high severity issues.

  • Review transitive dependencies: For each direct dependency, check its dependency tree for abandoned or suspicious packages.

  • Pin dependencies with Cargo.lock and package-lock.json: Ensure both lock files are committed and enforced in all builds (cargo build --locked, npm ci).

  • Set up automated dependency monitoring: Configure Dependabot or similar for automated security alerts on dependency vulnerabilities.

  • Verify Phase 22: Zero critical/high advisories in both cargo audit and npm audit.


Phase 23: Performance & Reliability Under Load

Layman version: Security under normal use is one thing. Security under stress (many users, large blockchain, limited resources) is another. We test that the system remains stable and secure when pushed to its limits.

  • Stress test RPC endpoints: Send 1000 concurrent RPC requests — verify rate limiting works and the server doesn't crash.

  • Test with full blockchain: Verify the system handles a 600GB+ blockchain without running out of disk space, memory, or CPU.

  • Test mesh under high message volume: Send 100 messages per minute through the mesh — verify encryption/decryption keeps up and memory doesn't leak.

  • Test container resource limits: Start all apps simultaneously — verify memory and CPU limits prevent any single app from starving others.

  • Monitor for memory leaks: Run the backend for 7 days continuously. Monitor RSS memory — should be stable, not growing.

  • Verify Phase 23: System stable after 7 days of continuous operation with all apps running.


Phase 24: Final Review & v1.0 Release

Layman version: Everything is fixed, tested, hardened, and tested again. This is the final review before declaring the system production-ready.

  • Full code review: Review every module one more time:

    1. core/security/ — secrets manager, image verifier, AppArmor
    2. core/archipelago/src/api/ — all RPC endpoints
    3. core/archipelago/src/mesh/ — all mesh code
    4. core/container/ — Podman client
    5. neode-ui/src/api/ — RPC client, WebSocket, container client
    6. neode-ui/src/views/ — all views
    7. image-recipe/configs/ — nginx, systemd
    8. scripts/ — first-boot, deploy
  • Verify all Phase checks pass: Go through every "Verify Phase N" checklist from Phases 1-23. Every check must pass.

  • Compare against competitors one final time: Re-evaluate the competitive comparison table. Document where Archipelago stands on every dimension.

  • Create security advisory process: Document how security vulnerabilities should be reported, triaged, and disclosed. Create a SECURITY.md in the repository.

  • Tag v1.0 release: Create the release with full changelog, checksums, and documentation.

  • Build and publish v1.0 ISO: Final ISO build with all hardening active.


============================================================

APPENDIX A: COMPETITIVE COMPARISON (Reference)

============================================================

This section is informational — it explains WHERE Archipelago stands versus competitors so each phase's priorities are clear.

Architecture Comparison

Archipelago

  • Language: Rust + Vue 3 + TypeScript
  • Containers: Podman (rootless)
  • OS: Debian 12
  • Status: Pre-production (2024)

Umbrel

  • Language: TypeScript + Node.js + React
  • Containers: Docker (root daemon)
  • OS: Custom Debian
  • Status: Production (since 2020, 10.8k GitHub stars)

Start9 (StartOS)

  • Language: Rust + TypeScript
  • Containers: Docker
  • OS: Custom Linux
  • Status: Production (since 2020, 1.6k GitHub stars)

RaspiBlitz

  • Language: Python + Bash
  • Containers: None (bare metal systemd)
  • OS: Raspberry Pi OS
  • Status: Production (since 2018, 2.6k GitHub stars, 207 contributors)

myNode

  • Language: Python + Bash
  • Containers: Docker (partial)
  • OS: Debian
  • Status: Production (since 2019, 730 GitHub stars)

Nodl

  • Language: Unknown (proprietary)
  • Containers: Unknown
  • OS: Custom Linux
  • Status: Production (since 2018, hardware-only)

nix-bitcoin

  • Language: Nix + Shell
  • Containers: None (systemd services)
  • OS: NixOS
  • Status: Production (since 2018, 600 GitHub stars)

RoninDojo

  • Language: Bash
  • Containers: Docker
  • OS: Debian 12
  • Status: Uncertain (Samourai arrest impact, since 2019)

Citadel

  • Language: TypeScript (Umbrel fork)
  • Containers: Docker
  • OS: Pi OS
  • Status: Abandoned (since 2022, 137 GitHub stars)

Security Comparison

Archipelago — Rootless containers, AES-256-GCM secrets, TOTP 2FA, Signal protocol mesh. Needs: systemd hardening (Phase 2), credential rotation (Phase 1).

Umbrel — Root Docker, plaintext secrets, no 2FA, no LAN encryption. Known critical vuln: default passwords allowed fund theft. License: PolyForm NC (NOT open source).

Start9 — Docker containers, encrypted backups, self-signed CA for LAN HTTPS, Tor default. Strongest incumbent security posture among GUI-based platforms.

RaspiBlitz — No containers (bare metal), separate bitcoin user, fully transparent. No sandboxing, bash scripts are fragile.

myNode — Mixed Docker/systemd, basic security, Tor optional. License: CC-NC-ND (restrictive).

Nodl — Full disk encryption, physical kill switch, RAID redundancy. Best hardware security. Software details not public.

nix-bitcoin — BEST SECURITY overall. Hardened kernel, seccomp-bpf, namespace isolation, systemd sandboxing, reproducible builds, security bounty fund. No GUI (CLI only).

RoninDojo — Privacy-first (Whirlpool CoinJoin), Tor default. Future uncertain due to Samourai legal situation.


Unique Features Only Archipelago Has

  1. Mesh networking (LoRa/RF peer-to-peer)
  2. Off-grid Bitcoin relay (TX + block headers over radio)
  3. Signal Protocol encrypted P2P (X3DH + Double Ratchet)
  4. Steganography (data as weather/sensor readings)
  5. Dead man's switch (automated emergency alerts)
  6. Rootless containers (Podman — no root daemon)
  7. TOTP 2FA on web UI
  8. Encrypted secrets manager (AES-256-GCM at rest)

Features Archipelago Needs to Add

  1. Tor-by-default (Phase 9) — Start9, nix-bitcoin, RoninDojo have this
  2. Encrypted backups (Phase 10) — Start9 has this
  3. Automated updates (Phase 11) — Umbrel, Start9, Nodl have this
  4. Larger app ecosystem (Phase 12) — Umbrel has 300+ apps
  5. Systemd hardening (Phase 2) — nix-bitcoin has this
  6. CoinJoin (Phase 19) — RaspiBlitz, RoninDojo have this
  7. Mobile access (Phase 18) — Umbrel, Start9 have this
  8. Reproducible builds (Phase 17) — nix-bitcoin has this

============================================================

APPENDIX B: DEV ENVIRONMENT (OUT OF SCOPE)

============================================================

These items are INTENTIONAL development tooling. They exist for convenience on a private home LAN. They are NOT production security issues. DO NOT CHANGE THEM.

  1. SSH keys and passwords in deploy scripts — Used to deploy from Mac to dev server over home LAN. StrictHostKeyChecking=no is acceptable for a known server on a trusted network.

  2. password123 default in dev mode — Only active when config.dev_mode is true. Not compiled into production builds. Used for rapid development iteration.

  3. Test script passwords — Test scripts (test-security.sh, test-app-install.sh) use known passwords for automated testing against dev servers.

  4. SSH credentials in CLAUDE.md — Development convenience for AI-assisted deployment. The dev server is behind a home router with no port forwarding.

  5. Deploy script SSH config — scripts/deploy-config.sh stores dev server access credentials. Gitignored. Not part of the production system.

  6. Mock backend (neode-ui/mock-backend.js) — Dev-only Node.js server for frontend development. Never deployed to production. Uses password123 for testing.

These are all standard development practices for a pre-production project on a private network. The production system (what gets installed via ISO) does not use any of these credentials.
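
For reference, item 2 (a dev default that is only active when config.dev_mode is true and is not compiled into production builds) can be sketched as follows. This is a minimal illustration, not the actual Archipelago code: the `Config` struct, the `default_password` function, and the use of `debug_assertions` as the build-time gate are all assumptions. The real project may use a Cargo feature or another mechanism.

```rust
/// Hypothetical runtime configuration. Only the `dev_mode` field name
/// comes from the plan; the struct itself is an assumption.
pub struct Config {
    pub dev_mode: bool,
}

/// Dev-only default password. The `cfg` attribute removes this constant
/// from builds where `debug_assertions` is off. Cargo's default dev
/// profile enables `debug_assertions`; its default release profile
/// disables it.
#[cfg(debug_assertions)]
const DEV_DEFAULT_PASSWORD: &str = "password123";

/// Returns the dev default password only when the build is a debug
/// build AND `dev_mode` is set at runtime. In every other case it
/// returns `None`, so the caller must use a per-install credential.
pub fn default_password(config: &Config) -> Option<&'static str> {
    #[cfg(debug_assertions)]
    {
        if config.dev_mode {
            return Some(DEV_DEFAULT_PASSWORD);
        }
    }
    // Keeps `config` used in release builds, where the block above
    // is compiled out.
    let _ = config;
    None
}
```

With this shape, a release build contains neither the constant nor the branch that returns it, so setting dev_mode on a production install cannot re-enable the default password.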