Capacity planning

/agents/<id> → Capacity tab shows, per server, a 14-day history of CPU %, memory %, and network throughput, plus a linear-regression projection of “when do you hit your ceiling”. Disk growth is shown under the Disks tab.

What the hub does

Per metric per agent it computes:

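-- metric_expr, <metric_table> and X are placeholders for the metric expression, its table, and the agent id
SELECT time_bucket('1 day', time)::DATE AS day,
       AVG(metric_expr)::FLOAT AS avg,
       MAX(metric_expr)::FLOAT AS max
FROM <metric_table>
WHERE agent_id = X
  AND time > NOW() - INTERVAL '14 days'
GROUP BY 1
ORDER BY 1;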

Then linear regression over day-index → avg:

slope = (n × ΣXY − ΣX × ΣY) / (n × ΣX² − (ΣX)²)

days_until_ceiling = (ceiling − current_avg) / slope, which is then converted to a projected date. If the slope is zero or negative, no projection is shown.
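
The calculation above can be sketched in a few lines of Python. This is an illustration, not the hub's actual code: the function names `fit_slope` and `project` are hypothetical, and it assumes that `current_avg` is the most recent daily average and that the projected date is today plus the rounded number of days.

```python
from datetime import date, timedelta
from typing import Optional, Sequence, Tuple


def fit_slope(avgs: Sequence[float]) -> float:
    """Least-squares slope of the daily averages against day index 0..n-1."""
    n = len(avgs)
    xs = range(n)
    sum_x = sum(xs)
    sum_y = sum(avgs)
    sum_xy = sum(x * y for x, y in zip(xs, avgs))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:  # fewer than two points: no trend can be fitted
        return 0.0
    return (n * sum_xy - sum_x * sum_y) / denom


def project(
    avgs: Sequence[float], ceiling: float, today: date
) -> Optional[Tuple[float, date]]:
    """Return (days_until_ceiling, projected_date), or None if there is no projection."""
    if not avgs:
        return None
    slope = fit_slope(avgs)
    if slope <= 0:  # flat or shrinking usage: nothing to project
        return None
    current_avg = avgs[-1]  # assumption: "current" means the latest daily average
    days = (ceiling - current_avg) / slope
    return days, today + timedelta(days=round(days))


# Hypothetical series: 50 % on day 0, growing 0.5 points per day for 14 days.
growing = [50 + 0.5 * i for i in range(14)]
print(project(growing, ceiling=100.0, today=date(2026, 1, 1)))
```

With this made-up series the fitted slope is 0.5 per day and the latest average is 56.5 %, so the ceiling of 100 % is 87 days away.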

Disk projections are computed the same way, but from disk_metrics per (agent, mount_point). The ceiling is total_bytes.

What you see in the UI

Three cards in the Capacity tab:

  • CPU 14d — daily avg + max bars, current value, slope/day, projection date when slope > 0
  • Memory 14d — same
  • Network 14d — total rx+tx Mbps. No ceiling (no fixed limit), only slope trend.

Per card you see:

current avg X.XX % · slope +0.012/day
[14 daily bars: light = max, dark = avg]
~ 73 days until 100% (2026-07-22)

The Disks tab shows the same per mount point, plus inode %, permissions, and filesystem type. “Days until full” is colour-coded: red below 30 days, yellow below 90 days, otherwise muted.
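
The colour thresholds can be written as a small lookup. The function name `days_until_full_colour` is hypothetical, and treating a missing projection as muted is an assumption, not something this page states.

```python
from typing import Optional


def days_until_full_colour(days: Optional[float]) -> str:
    """Map a "days until full" value to the colour bands described above."""
    if days is None:  # assumption: no projection is rendered as muted
        return "muted"
    if days < 30:
        return "red"
    if days < 90:
        return "yellow"
    return "muted"


for value in (12, 45, 200, None):
    print(value, days_until_full_colour(value))
```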

What this is NOT

  • Not ML / time-series forecasting. Linear fit on 14 points — works for monotone growth (logs that accumulate, db storage). Works poorly for spiky/seasonal workloads.
  • No confidence interval. You get one projection date, not a [P5–P95] band.
  • No multi-resource model. CPU projection knows nothing about memory pressure.
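
To see why a seasonal workload misleads a linear fit, consider a made-up weekly pattern with no growth at all: 60 % on weekdays, 20 % at weekends. The sketch below (hypothetical data, with the same least-squares slope as above reimplemented in a helper called `slope`) fits two 14-day windows that differ only in the weekday they start on.

```python
def slope(ys):
    """Least-squares slope of ys against index 0..n-1."""
    n = len(ys)
    xs = range(n)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return (n * sxy - sx * sy) / (n * sxx - sx ** 2)


week = [60.0] * 5 + [20.0] * 2               # Mon-Fri busy, Sat-Sun quiet
starts_monday = week * 2                     # window begins on a Monday
starts_saturday = (week[5:] + week[:5]) * 2  # same load, window begins on a Saturday

for name, series in (("monday", starts_monday), ("saturday", starts_saturday)):
    s = slope(series)
    if s > 0:
        days = (100.0 - series[-1]) / s
        print(f"{name}: slope {s:+.2f}/day, ~{days:.0f} days until 100%")
    else:
        print(f"{name}: slope {s:+.2f}/day, no projection")
```

The two windows contain the same load, yet one yields a negative slope (no projection) and the other a positive slope and a projection date a few weeks out. The difference comes purely from where the window starts.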

For real capacity planning on critical systems, use a spreadsheet with expert judgement or a tool like Prophet. This tab covers the 80% case: “when will this disk fill up”.

Tip: alert rule + capacity together

Combine the Capacity tab with a rule in the alert builder:

metric: disk_pct
operator: >
threshold: 85
duration: 3600 (1h above 85%)
severity: warn

The Capacity tab probably already shows you weeks in advance when that’s going to happen. The alert is the failsafe — capacity planning is the early warning.
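
To estimate how much lead time the projection gives before an alert threshold is crossed, the same formula can be applied with the threshold in place of the ceiling. This is a sketch: `days_until_level` is a hypothetical helper and the numbers are invented. The Capacity tab itself projects to the ceiling, not to an alert threshold.

```python
from typing import Optional


def days_until_level(
    level_pct: float, current_avg: float, slope_per_day: float
) -> Optional[float]:
    """Days until usage reaches level_pct on the fitted line, or None if not growing."""
    if slope_per_day <= 0:
        return None
    # Clamp at zero: the level has already been reached.
    return max(0.0, (level_pct - current_avg) / slope_per_day)


current_avg, slope_per_day = 62.0, 0.4  # made-up disk_pct trend
print("until 85% alert:", days_until_level(85.0, current_avg, slope_per_day))
print("until 100% full:", days_until_level(100.0, current_avg, slope_per_day))
```

With these example numbers the trend crosses the 85% threshold after roughly 57 days and reaches 100% after roughly 95 days.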