feat: add alerting system with configurable rules and UI (MON-02, MON-03)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -338,9 +338,9 @@
|
||||
|
||||
- [x] **MON-01** — Implement real-time metrics collection. Add `core/archipelago/src/monitoring/collector.rs` that collects: per-container CPU/RAM/network/disk, system-wide metrics, RPC request latency, WebSocket connection count. Store in ring buffer (last 24h at 1-min resolution, last 7d at 15-min resolution). **Acceptance**: Metrics collected and queryable via RPC.
|
||||
|
||||
- [ ] **MON-02** — Add monitoring dashboard page. Create `neode-ui/src/views/Monitoring.vue` at `/dashboard/monitoring` with: real-time line charts (CPU, RAM, network), per-container resource breakdown, alert history, system health timeline. Use canvas-based charts (no heavy library -- build simple line chart component). **Acceptance**: Real-time metrics visible with 5s refresh.
|
||||
- [x] **MON-02** — Add monitoring dashboard page. Created `neode-ui/src/views/Monitoring.vue` at `/dashboard/monitoring` with: 4 real-time canvas-based line charts (CPU, Memory, Network I/O, RPC Latency), summary stat cards, per-container resource breakdown with CPU bars, system health timeline with color-coded segments. Custom `LineChart.vue` component renders on canvas with DPR scaling, grid lines, area fills. Polls every 5s via `monitoring.current` and `monitoring.history` RPC endpoints. Route registered in router. All CSS classes defined in style.css.
|
||||
|
||||
- [ ] **MON-03** — Implement alerting system. Add alert rules: disk > 90%, RAM > 90%, container crash, backend error spike, SSL cert expiry < 30 days. Notifications via: WebSocket push to UI, optional webhook URL. **Acceptance**: Alerts fire and display in UI.
|
||||
- [x] **MON-03** — Implemented alerting system. Added AlertRule and FiredAlert types to monitoring/mod.rs with 5 configurable rules (disk >90%, RAM >90%, container crash, RPC latency spike, SSL cert expiry <30 days). Metrics collector evaluates rules every 60s, fires alerts as Notifications via WebSocket. Added RPC endpoints: monitoring.alerts, monitoring.alert-rules, monitoring.configure-alert, monitoring.acknowledge-alert. Frontend: Monitoring.vue has alert history section with configurable thresholds, enable/disable toggles, dismiss buttons. CSS toggle/input styles in style.css.
|
||||
|
||||
- [ ] **MON-04** — Add historical data export. Add `monitoring.export` RPC endpoint that exports metrics as CSV or JSON for a given time range. Add "Export" button in monitoring UI. **Acceptance**: Can download last 24h of metrics as CSV.
|
||||
|
||||
|
||||
Reference in New Issue
Block a user