release(v1.7.14-alpha): install overlay + FIPS real fix + AIUI restore

Install UX
  SystemUpdate.vue now shows a full-screen overlay after apply: the
  BitcoinFaceAscii logo, a target-version label, an indeterminate
  progress stripe (solid orange; solid green on ready), and an
  elapsed-time readout. Polls /health every 1.5s and auto-reloads
  once the backend reports the new version. 3-min stall → "Reload
  now" button. Download UI also shows a spinner + "Finishing
  download — verifying checksum…" while the fake bar sits at 95%.
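
The overlay's per-poll decision can be sketched as a pure function. The real logic lives in SystemUpdate.vue and is not part of this diff; the Rust below is only a language-neutral illustration, and `OverlayState`, `STALL_AFTER`, and `overlay_state` are hypothetical names.

```rust
use std::time::Duration;

/// What the overlay should show after one /health poll (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum OverlayState {
    /// Backend not yet on the target version: keep the orange stripe.
    Waiting,
    /// Backend reports the target version: green stripe, then reload.
    Ready,
    /// Stall threshold passed without the new version: offer "Reload now".
    Stalled,
}

/// Stall threshold from the commit message (3 minutes).
pub const STALL_AFTER: Duration = Duration::from_secs(180);

/// Decide the overlay state from one poll result.
/// `reported` is None when the poll failed (e.g. backend still restarting).
pub fn overlay_state(reported: Option<&str>, target: &str, elapsed: Duration) -> OverlayState {
    match reported {
        Some(v) if v == target => OverlayState::Ready,
        _ if elapsed >= STALL_AFTER => OverlayState::Stalled,
        _ => OverlayState::Waiting,
    }
}
```

A caller would invoke this once per 1.5s tick with the version string returned by /health. Checking for the target version first means a late success still wins over the stall state.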

FIPS reconnect — for real this time
  New fips.reconnect RPC does stop → start → poll every ~3s for up
  to 20s → classify. Classification buckets: connected / daemon_down /
  no_seed_key / no_outbound_udp_or_anchor_down / peers_but_no_anchor,
  each with a plain-language hint surfaced verbatim by the Reconnect
  button. The real reason nodes like .198/.253 couldn't reach the
  anchor: identity::write_fips_key_from_seed was writing fips_key.pub
  as a bech32 npub TEXT file, but upstream fips expects 32 raw
  bytes. The daemon silently authenticated with garbage. Fix:
  PublicKey::to_bytes() → raw 32 bytes, and new
  fips::config::normalize_pub_file migrates legacy files by decoding
  the npub and rewriting in place. fips.reconnect also re-installs
  the config + healed keys to /etc/fips before restarting.
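
The bucket order matters because each check assumes the earlier ones passed. A minimal sketch of the same classification as a pure, unit-testable function follows. `StatusView` and `Cause` are hypothetical stand-ins for the fields of fips::FipsStatus that the handler reads, and the `u32` peer-count type is an assumption.

```rust
/// Hypothetical, trimmed-down view of the status fields the classifier reads.
#[derive(Debug, Clone, Copy)]
pub struct StatusView {
    pub anchor_connected: bool,
    pub service_active: bool,
    pub key_present: bool,
    pub authenticated_peer_count: u32, // integer type is an assumption
}

/// The five buckets named in the commit message.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Cause {
    Connected,
    DaemonDown,
    NoSeedKey,
    NoOutboundUdpOrAnchorDown,
    PeersButNoAnchor,
}

impl Cause {
    /// Wire string, matching the `likely_cause` values in the handler.
    pub fn as_str(self) -> &'static str {
        match self {
            Cause::Connected => "connected",
            Cause::DaemonDown => "daemon_down",
            Cause::NoSeedKey => "no_seed_key",
            Cause::NoOutboundUdpOrAnchorDown => "no_outbound_udp_or_anchor_down",
            Cause::PeersButNoAnchor => "peers_but_no_anchor",
        }
    }
}

/// First matching bucket wins, in the same order as the handler's if/else chain.
pub fn classify(s: &StatusView) -> Cause {
    if s.anchor_connected {
        Cause::Connected
    } else if !s.service_active {
        Cause::DaemonDown
    } else if !s.key_present {
        Cause::NoSeedKey
    } else if s.authenticated_peer_count == 0 {
        Cause::NoOutboundUdpOrAnchorDown
    } else {
        Cause::PeersButNoAnchor
    }
}
```

Using an enum rather than string literals would let the compiler check that every bucket has a hint. The handler in this commit matches on strings instead.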

AIUI preservation + restore
  apply_update was wiping /opt/archipelago/web-ui/aiui because the
  Vue build doesn't include it — every OTA lost the Claude sidebar.
  The preserve block now copies aiui/ + archipelago-companion.apk
  from the old web-ui into the staging dir before the swap, and
  prefers new-tar versions if present. To restore it on the three
  nodes that already lost it (.116/.198/.253), this release bundles
  the 85 MB aiui build into the frontend tarball. The frontend
  tarball is now 162,077,052 bytes (~155 MiB).
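
The preserve rule (carry an old path over only when the new tree does not already ship it) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in apply_update, which shells out to `cp -a` via host_sudo. `copy_tree` and `preserve_missing` are hypothetical helpers, and unlike `cp -a` this plain copy does not preserve ownership.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively copy a file or directory tree (plain copy, no ownership handling).
fn copy_tree(src: &Path, dst: &Path) -> io::Result<()> {
    if src.is_dir() {
        fs::create_dir_all(dst)?;
        for entry in fs::read_dir(src)? {
            let entry = entry?;
            copy_tree(&entry.path(), &dst.join(entry.file_name()))?;
        }
    } else {
        fs::copy(src, dst)?;
    }
    Ok(())
}

/// Carry `names` over from `old_root` into `new_root`, skipping any name the
/// new tree already contains. Returns the names that were actually copied.
pub fn preserve_missing(
    old_root: &Path,
    new_root: &Path,
    names: &[&str],
) -> io::Result<Vec<String>> {
    let mut copied = Vec::new();
    for name in names {
        let src = old_root.join(name);
        let dst = new_root.join(name);
        // Only fall back to the old copy when the new tarball has none.
        if src.exists() && !dst.exists() {
            copy_tree(&src, &dst)?;
            copied.push((*name).to_string());
        }
    }
    Ok(copied)
}
```

Returning the list of copied names makes the outcome easy to log or assert on. The code in this commit discards the result of each copy instead.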

Download / install timeouts
  Backend download client timeout 1800s → 3600s (1 h). Larger
  tarball + slow gitea raw throughput put us above the old cap.
  Frontend update.download rpc timeout 30 min → 65 min to match.
  package.install rpc timeout 15 min → 45 min — IndeedHub pulls
  6 images and was timing out mid-install.

UI nit
  "Rollback to Previous" → "Rollback Available".

App-catalog proxy already landed in v1.7.13.

Artefacts:
  archipelago                                      725e18e6…3c525e6   40462288
  archipelago-frontend-1.7.14-alpha.tar.gz         c35284be…ff2c16   162077052 (+aiui)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Dorian
2026-04-20 16:40:25 -04:00
parent 687c216e65
commit be8e5ee46b
12 changed files with 414 additions and 43 deletions

@@ -1,6 +1,6 @@
 [package]
 name = "archipelago"
-version = "1.7.13-alpha"
+version = "1.7.14-alpha"
 edition = "2021"
 description = "Archipelago Bitcoin Node OS - Native backend"
 authors = ["Archipelago Team"]

@@ -413,6 +413,7 @@ impl RpcHandler {
             "fips.apply-update" => self.handle_fips_apply_update().await,
             "fips.install" => self.handle_fips_install().await,
             "fips.restart" => self.handle_fips_restart().await,
+            "fips.reconnect" => self.handle_fips_reconnect().await,
             // System updates
             "update.check" => self.handle_update_check().await,

@@ -44,4 +44,89 @@ impl RpcHandler {
         fips::service::restart(fips::SERVICE_UNIT).await?;
         Ok(serde_json::json!({ "restarted": true }))
     }
+    /// Full reconnect: stop the daemon, bring it back, wait for the DHT
+    /// bootstrap window, poll the identity-cache + peer list, and
+    /// classify what recovered (or didn't) so the UI can explain it to
+    /// the user instead of showing a generic failure.
+    ///
+    /// Runtime: ~20s. Needs an RPC timeout ≥ 45s on the client.
+    pub(super) async fn handle_fips_reconnect(&self) -> Result<serde_json::Value> {
+        let identity_dir = fips::identity_dir_from(&self.config.data_dir);
+        let before = fips::FipsStatus::query(&identity_dir).await;
+        // Heal the pre-fix bech32-text fips_key.pub → 32-raw-bytes
+        // mismatch. The daemon silently authenticates with a garbage
+        // pubkey when the .pub file is 63-char text, which looks like
+        // "anchor unreachable" to the user even though the real fault
+        // was an identity malformed on the node itself. Re-install the
+        // config + keys so /etc/fips gets the healed .pub.
+        let key_src = identity_dir.join("fips_key");
+        let pub_src = identity_dir.join("fips_key.pub");
+        if key_src.exists() {
+            let _ = fips::config::normalize_pub_file(&key_src, &pub_src).await;
+            // Re-install refreshes /etc/fips/fips.pub from the healed
+            // source. No-op if nothing changed.
+            let _ = fips::config::install(&identity_dir).await;
+        }
+        // Clean stop+start rather than `restart`, so a daemon that
+        // fails to come back up surfaces as service_active=false
+        // instead of quietly sticking with the old process.
+        let _ = fips::service::stop(fips::SERVICE_UNIT).await;
+        tokio::time::sleep(std::time::Duration::from_millis(800)).await;
+        fips::service::activate(fips::SERVICE_UNIT).await?;
+        // Anchor bootstrap window: poll the status every ~3s for up to
+        // 20s. Bail as soon as the anchor is connected.
+        let mut last_status: Option<fips::FipsStatus> = None;
+        let deadline = std::time::Instant::now() + std::time::Duration::from_secs(20);
+        loop {
+            tokio::time::sleep(std::time::Duration::from_secs(3)).await;
+            let s = fips::FipsStatus::query(&identity_dir).await;
+            if s.anchor_connected {
+                last_status = Some(s);
+                break;
+            }
+            last_status = Some(s);
+            if std::time::Instant::now() >= deadline {
+                break;
+            }
+        }
+        let after = last_status.unwrap_or_else(|| before.clone());
+        let recovered = after.anchor_connected && !before.anchor_connected;
+        let likely_cause = if after.anchor_connected {
+            "connected"
+        } else if !after.service_active {
+            "daemon_down"
+        } else if !after.key_present {
+            "no_seed_key"
+        } else if after.authenticated_peer_count == 0 {
+            // Daemon is up with a key but hasn't authenticated any
+            // peers — almost always outbound UDP/8668 dropped by the
+            // local firewall/router, or the anchor itself being down.
+            "no_outbound_udp_or_anchor_down"
+        } else {
+            "peers_but_no_anchor"
+        };
+        let hint = match likely_cause {
+            "connected" => "Anchor is reachable.",
+            "daemon_down" => "The FIPS daemon didn't come back up — check archipelago-fips.service.",
+            "no_seed_key" => "No seed-derived FIPS key on disk. Re-run the onboarding unlock step.",
+            "no_outbound_udp_or_anchor_down" =>
+                "Daemon is running but no peers handshook. Your router / ISP might be blocking outbound UDP 8668, or the anchor (fips.v0l.io) could be down.",
+            "peers_but_no_anchor" =>
+                "Mesh has peers but the anchor hasn't been seen yet. Give it a minute and re-check.",
+            _ => "",
+        };
+        Ok(serde_json::json!({
+            "recovered": recovered,
+            "likely_cause": likely_cause,
+            "hint": hint,
+            "before": before,
+            "after": after,
+        }))
+    }
 }
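
The bootstrap wait in the handler above follows a common shape: sleep, sample, stop early on success, otherwise stop once the deadline has passed, and always keep the last sample. A minimal synchronous sketch of that shape follows. `poll_until` is a hypothetical helper, and it uses std::thread::sleep where the async handler uses tokio::time::sleep.

```rust
use std::time::{Duration, Instant};

/// Poll `probe` every `interval` until `done` accepts a sample or `budget`
/// has elapsed. Always takes at least one sample and returns the last one.
pub fn poll_until<T>(
    interval: Duration,
    budget: Duration,
    mut probe: impl FnMut() -> T,
    done: impl Fn(&T) -> bool,
) -> T {
    let deadline = Instant::now() + budget;
    loop {
        // Sleep first, as the handler does, to give the daemon time to start.
        std::thread::sleep(interval);
        let sample = probe();
        if done(&sample) || Instant::now() >= deadline {
            return sample;
        }
    }
}
```

Because the deadline is only checked after a sample, the total wait can exceed the budget by up to one interval plus the probe time. With a 3s interval and a 20s budget that puts the worst case slightly above 20s, which is consistent with the doc comment asking for a client timeout well above the nominal runtime.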

@@ -78,11 +78,65 @@ pub async fn install(identity_dir: &Path) -> Result<()> {
     install_result?;
     sudo_install_file(&src_key, DAEMON_KEY_PATH, "0600").await?;
+    // Heal a legacy fips_key.pub that was written as bech32 npub text
+    // (pre-fix identity::write_fips_key_from_seed did this). Upstream
+    // fips expects 32 raw bytes; a text file silently passes through
+    // and then the daemon can't identify itself to peers. This
+    // rewrites the source file in place with the correct binary form
+    // derived from fips_key before staging it to /etc/fips/fips.pub.
+    normalize_pub_file(&src_key, &src_pub).await?;
     sudo_install_file(&src_pub, DAEMON_PUB_PATH, "0644").await?;
     Ok(())
 }
+/// Ensure `fips_key.pub` is 32 raw bytes. If it's a bech32 npub text
+/// file (from the pre-fix writer), decode it and rewrite in place. If
+/// the file is missing or its content doesn't match either format,
+/// re-derive the public key from `fips_key` and write that.
+pub async fn normalize_pub_file(key_path: &Path, pub_path: &Path) -> Result<()> {
+    // Happy path: already 32 raw bytes.
+    if let Ok(bytes) = tokio::fs::read(pub_path).await {
+        if bytes.len() == 32 {
+            return Ok(());
+        }
+        // bech32 npub text from the pre-fix writer: decode in place.
+        if let Ok(s) = std::str::from_utf8(&bytes) {
+            let trimmed = s.trim();
+            if trimmed.starts_with("npub1") {
+                if let Ok(pk) = nostr_sdk::PublicKey::parse(trimmed) {
+                    let raw: [u8; 32] = pk.to_bytes();
+                    tokio::fs::write(pub_path, raw)
+                        .await
+                        .context("rewriting fips_key.pub as 32 raw bytes")?;
+                    tracing::info!(
+                        "Migrated legacy bech32 fips_key.pub to raw-byte form at {}",
+                        pub_path.display()
+                    );
+                    return Ok(());
+                }
+            }
+        }
+    }
+    // Fallback: no pub file, or unreadable format. Re-derive from the
+    // private key file (already validated by load_fips_keys).
+    let secret_bytes = tokio::fs::read(key_path)
+        .await
+        .with_context(|| format!("read {} to derive public", key_path.display()))?;
+    let text = std::str::from_utf8(&secret_bytes)
+        .context("fips_key is not UTF-8 — can't derive public")?;
+    let secret = nostr_sdk::SecretKey::parse(text.trim())
+        .context("fips_key not parseable as bech32 nsec")?;
+    let keys = nostr_sdk::Keys::new(secret);
+    let raw: [u8; 32] = keys.public_key().to_bytes();
+    tokio::fs::write(pub_path, raw)
+        .await
+        .context("writing re-derived fips_key.pub")?;
+    tracing::info!("Re-derived fips_key.pub from fips_key");
+    Ok(())
+}
 async fn sudo_install_dir(path: &str) -> Result<()> {
     let out = Command::new("sudo")
         .args(["install", "-d", "-m", "0755", path])
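
The migration hinges on telling three on-disk shapes apart: 32 raw bytes, legacy npub text, and anything else. A dependency-free sketch of that detection step follows, in the same check order as normalize_pub_file. `PubFileFormat` and `detect_pub_format` are hypothetical names. Unlike the real function, this does not validate the bech32 checksum, which the commit delegates to nostr_sdk::PublicKey::parse.

```rust
/// On-disk shapes a fips_key.pub file can take (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum PubFileFormat {
    /// Exactly 32 bytes: the form upstream fips expects.
    Raw32,
    /// UTF-8 text starting with "npub1": the pre-fix writer's output.
    NpubText,
    /// Missing content or anything else: re-derive from the private key.
    Unknown,
}

/// Classify the file content using the same order of checks as the migration:
/// the length test comes first, then the text prefix test.
pub fn detect_pub_format(bytes: &[u8]) -> PubFileFormat {
    if bytes.len() == 32 {
        return PubFileFormat::Raw32;
    }
    match std::str::from_utf8(bytes) {
        Ok(s) if s.trim().starts_with("npub1") => PubFileFormat::NpubText,
        _ => PubFileFormat::Unknown,
    }
}
```

One property worth noting: any 32-byte file is accepted as already correct, so neither this sketch nor the happy path in the commit checks that the stored public key actually matches fips_key.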

@@ -219,14 +219,22 @@ async fn write_fips_key_from_seed(
             .await
             .context("Failed to set FIPS key permissions")?;
     }
-    let npub = keys.public_key().to_bech32().unwrap_or_default();
-    fs::write(&pub_path, format!("{npub}\n"))
+    // Upstream fips daemon expects 32 raw bytes in /etc/fips/fips.pub —
+    // not a bech32 npub string. Writing the bech32 form here meant the
+    // installed .pub file was a 63-char text file the daemon parsed as
+    // 63 raw bytes of garbage, so it couldn't identify itself to peers
+    // and the anchor never handshook. Write the raw public-key bytes
+    // (PublicKey::to_bytes returns a [u8; 32]) so the daemon reads
+    // them directly.
+    let raw_pub: [u8; 32] = keys.public_key().to_bytes();
+    fs::write(&pub_path, raw_pub)
         .await
         .context("Failed to write FIPS public key")?;
+    let npub_for_log = keys.public_key().to_bech32().unwrap_or_default();
     tracing::info!(
         "Derived FIPS mesh key from seed (npub: {}...)",
-        npub.chars().take(20).collect::<String>()
+        npub_for_log.chars().take(20).collect::<String>()
     );
     Ok(())
 }

@@ -176,7 +176,12 @@ pub async fn download_update(data_dir: &Path) -> Result<DownloadProgress> {
         .context("Failed to create staging dir")?;
     let client = reqwest::Client::builder()
-        .timeout(std::time::Duration::from_secs(1800))
+        // 1h per component — the bundled frontend+aiui tarball sits at
+        // ~160 MB and git.tx1138.com raw serves at ~70 KB/s which puts
+        // the worst case above the old 30 min cap. A larger timeout
+        // with a tight connect_timeout keeps hung connections from
+        // swallowing the whole budget.
+        .timeout(std::time::Duration::from_secs(3600))
         .connect_timeout(std::time::Duration::from_secs(30))
         .build()
         .context("Failed to create HTTP client")?;
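
The timeout change can be checked with a quick calculation. The sketch below uses the frontend tarball size from the Artefacts list (162,077,052 bytes) and the ~70 KB/s figure from the comment above. Reading KB as 1000 bytes is an assumption, and `transfer_secs` is a hypothetical helper.

```rust
/// Seconds needed to move `bytes` at `bytes_per_sec`, rounded up.
/// `bytes_per_sec` must be non-zero.
pub fn transfer_secs(bytes: u64, bytes_per_sec: u64) -> u64 {
    (bytes + bytes_per_sec - 1) / bytes_per_sec
}

/// Frontend tarball size from the Artefacts list in the commit message.
pub const FRONTEND_TARBALL_BYTES: u64 = 162_077_052;
/// ~70 KB/s from the code comment, assuming KB = 1000 bytes.
pub const ASSUMED_RATE_BYTES_PER_SEC: u64 = 70_000;
/// Old and new client timeouts, in seconds.
pub const OLD_TIMEOUT_SECS: u64 = 1800;
pub const NEW_TIMEOUT_SECS: u64 = 3600;
```

At that rate the download takes about 2316 s, roughly 39 minutes: over the old 1800 s cap, and inside the new 3600 s one with about 21 minutes of headroom. The headroom shrinks quickly if throughput drops, since anything below about 45 KB/s would exceed the new cap as well.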
@@ -369,6 +374,21 @@ pub async fn apply_update(data_dir: &Path) -> Result<()> {
             ])
             .await;
+            // Preserve paths that are installed outside the Vue build
+            // (baked in by the ISO or sibling installers) and so
+            // aren't in the new tarball. Without this copy, every OTA
+            // wipes them — notably aiui/ (Claude Code sidebar) and
+            // the companion APK. `cp -a` preserves mode/ownership.
+            for preserved in ["aiui", "archipelago-companion.apk"] {
+                let src = format!("{}/{}", web_ui, preserved);
+                let dst = format!("{}/{}", staging_new, preserved);
+                // Only preserve the old copy if the new tarball
+                // doesn't already ship a fresher one.
+                if Path::new(&src).exists() && !Path::new(&dst).exists() {
+                    let _ = host_sudo(&["cp", "-a", &src, &dst]).await;
+                }
+            }
             // Swap: mv current web-ui aside, then mv new into place.
             if Path::new(web_ui).exists() {
                 let mv_old = host_sudo(&["mv", web_ui, &staging_old])