Anomaly detection on KPIs that actually matter
Anomaly detection fails the same way in every BI deployment
The pattern is consistent. The team turns on anomaly detection across every metric in the warehouse. Alerts arrive in Slack at a rate of 200 per day. By week three, the channel is muted. By week six, alerts are routed to email and ignored. By month three, the feature is quietly removed from the roadmap and the team concludes 'anomaly detection doesn't work here.'
The mistake is treating anomaly detection as something to apply broadly rather than narrowly. The metrics that matter are a small set — typically 30–80 in a mid-market business. Each deserves a thoughtful detector with seasonality, holiday, and campaign awareness. Spreading detection across every metric in the warehouse buys noise, not coverage.
The metrics worth monitoring are the ones operators would call about at 7am
We build the monitored-metric list with the operations and finance leaders who would receive a 7am call about the metric moving. If a 20% drop in metric X would trigger an executive call, X belongs in the monitored set. If a 20% drop in metric Y would prompt 'huh, weird' and nothing else, Y does not belong. The exercise filters the BI universe down to the operational essentials in about an afternoon.
Examples from a mid-market SaaS deployment: daily new-customer signups, paid-trial conversion rate, gross MRR, churn dollars, support ticket volume by severity, infrastructure cost per active customer, top-of-funnel marketing CPL by channel. Roughly 35–60 metrics in total. Each gets a detector tuned to its expected pattern. The remaining 1,200 metrics stay queryable but unmonitored.
- Monitored metrics: 35–80 curated, not all
- Alerts per week: < 8 on a healthy week
- False positive rate: < 12% after seasonality calibration
- Time-to-detection: < 6 hours on critical metrics
Seasonality and known cycles are 80% of the calibration work
Most metrics have weekly, monthly, and annual cycles. Sunday traffic is lower than Tuesday. The first week of the month is heavier than the last for some metrics, lighter for others. Holiday weekends compress different metrics differently. A detector that flags every Sunday as anomalously low is broken before it ships.
We use seasonal-decomposition methods (STL, Prophet-style, or Bayesian structural time series depending on the metric) to separate trend, seasonality, and residual. Anomalies are detected on the residual, not the raw value. The residual is what surprises the model after accounting for what it already expected. Calibration discipline here is most of the project.
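A minimal sketch of the residual-based approach, using STL from statsmodels; the metric name, file, and 3.5-MAD threshold are illustrative choices, not the production configuration:

```python
import pandas as pd
from statsmodels.tsa.seasonal import STL

# Daily series for one monitored metric, e.g. new-customer signups.
# The file name and column names are illustrative.
series = (
    pd.read_csv("signups.csv", parse_dates=["date"], index_col="date")["signups"]
    .asfreq("D")
    .interpolate()  # fill any missing days; STL cannot handle gaps
)

# Separate trend, weekly seasonality, and residual.
decomp = STL(series, period=7, robust=True).fit()
resid = decomp.resid

# Score residuals with a robust median/MAD measure so outliers
# do not inflate their own threshold; flag the genuinely surprising days.
mad = (resid - resid.median()).abs().median()
score = (resid - resid.median()).abs() / (1.4826 * mad)
anomalies = series[score > 3.5]

print(anomalies)  # dates where the model was surprised after accounting for seasonality
```

Sundays score near zero here because the weekly component already expects them to be low; only deviations from the expected Sunday get flagged.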
Holiday and campaign overlays are explicit, not implicit
The seasonal model can learn weekly cycles from history, but it cannot guess that yesterday was Black Friday or that the company ran a promotional campaign all of last week. Both belong in the model as explicit inputs — known holidays, known campaign periods, known product launch windows — so the detector knows the demand spike was expected.
We feed a holiday and campaign calendar to the detection model. Ops and marketing teams maintain the calendar; the analytics team consumes it. When marketing forgets to flag a campaign, anomalies fire and the campaign gets logged retroactively, which is also useful documentation. The discipline reinforces itself.
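One way to make the calendar explicit, sketched with Prophet's holidays input; the event names, dates, windows, and file are illustrative, and any detector that accepts known-event regressors works the same way:

```python
import pandas as pd
from prophet import Prophet

# Known events, maintained by ops and marketing, consumed by analytics.
# Entries and windows below are illustrative.
events = pd.DataFrame({
    "holiday": ["black_friday", "black_friday", "spring_promo"],
    "ds": pd.to_datetime(["2023-11-24", "2024-11-29", "2024-04-08"]),
    "lower_window": [0, 0, 0],
    "upper_window": [3, 3, 6],   # promo effects linger a few days
})

history = (
    pd.read_csv("signups.csv", parse_dates=["date"])
    .rename(columns={"date": "ds", "signups": "y"})
)

model = Prophet(holidays=events)
model.fit(history)

# Score history against the model's expected band; observed values
# outside [yhat_lower, yhat_upper] are candidate anomalies. Because the
# events are inputs, an expected Black Friday spike is not flagged.
forecast = model.predict(history[["ds"]])
merged = history.merge(forecast[["ds", "yhat_lower", "yhat_upper"]], on="ds")
anomalies = merged[(merged["y"] < merged["yhat_lower"]) | (merged["y"] > merged["yhat_upper"])]
print(anomalies[["ds", "y", "yhat_lower", "yhat_upper"]])
```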
Explanation is the difference between an alert and an action
An alert that says 'metric X dropped 18%' is the start of a question, not an action. The detector should also explain: which dimension drove the drop, which segment is over-represented in the change, what correlated metrics also moved. Operators act on 'metric X dropped 18%, driven by a 34% drop in the SMB segment, which correlates with a 29% drop in the paid-search channel.' They don't act on the unexplained number.
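A sketch of the dimensional-attribution step in pandas; the file, column names, and trailing-28-day baseline are illustrative stand-ins for the detector's own expected values:

```python
import pandas as pd

# Daily signups broken out by segment and channel; schema is assumed.
df = pd.read_csv("signups_by_dimension.csv", parse_dates=["date"])

today = df["date"].max()
baseline = df[df["date"].between(today - pd.Timedelta(days=28), today - pd.Timedelta(days=1))]
current = df[df["date"] == today]

def contribution(dim: str) -> pd.DataFrame:
    """Per-value change vs the trailing-28-day daily average for one dimension."""
    expected = baseline.groupby(dim)["signups"].sum() / 28
    actual = current.groupby(dim)["signups"].sum()
    out = pd.DataFrame({"expected": expected, "actual": actual}).fillna(0)
    out["delta"] = out["actual"] - out["expected"]
    out["pct_change"] = out["delta"] / out["expected"]
    return out.sort_values("delta")

# The most negative deltas name the segment and channel driving the drop,
# which is what goes into the alert text.
print(contribution("segment").head(3))
print(contribution("channel").head(3))
```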
Continuous KPI monitoring is a process, not a feature
The list of monitored metrics changes. Some metrics matter for one quarter and stop mattering. New metrics emerge as new programs launch. Anomaly detection that goes stale because the list was set once and forgotten degrades into noise. We review the monitored set quarterly with the leaders who own the metrics, prune the ones that no longer drive decisions, and add the ones that do.
The other reason for review: false positive rates drift as the underlying business changes. A detector calibrated for last year's seasonality fires wrongly during this year's growth. Calibration is reviewed alongside the metric list. The monitoring posture stays current because the people who care most about the metrics maintain it.
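A small review sketch, assuming fired alerts are logged with an analyst-assigned true/false label during triage; the log schema is hypothetical, and the weekly and false-positive targets echo the figures above:

```python
import pandas as pd

# Alert log: one row per fired alert, labeled during triage.
# Columns assumed: fired_at (timestamp), metric, was_real (bool).
alerts = pd.read_csv("alert_log.csv", parse_dates=["fired_at"])
last_quarter = alerts[alerts["fired_at"] >= alerts["fired_at"].max() - pd.Timedelta(days=91)]

weekly = last_quarter.set_index("fired_at").resample("W").size()
false_positive_rate = 1 - last_quarter["was_real"].mean()

print(f"alerts/week: median {weekly.median():.0f}, max {weekly.max()}")
print(f"false positive rate: {false_positive_rate:.0%}")

# Drifting above ~8 alerts/week or ~12% false positives is the signal
# to recalibrate seasonality or prune the monitored-metric list.
```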
Routing matters: alert the person who can act, not the dashboard
Anomaly alerts that go to a generic Slack channel get ignored. Alerts routed to the metric owner — the person who actually owns the operational lever — get acted on. The routing has to be explicit per metric: marketing-CPL alerts go to the marketing director on duty; support-ticket-volume alerts go to the support manager; infrastructure-cost alerts go to engineering on-call. Each metric has an owner, and the alert lands in the owner's primary tool.
When the owner is unclear, the alert is unclear. We refuse to ship a monitored metric that doesn't have a named owner with an acknowledged response expectation. The discipline forces clarity about who is responsible for which lever. Often that conversation is more valuable than the detection itself.
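A minimal sketch of routing as data rather than convention; the metric names, owners, and channels are illustrative, and the refusal to ship an unowned metric is the point:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    owner: str              # a named human, not a team alias
    channel: str            # where that owner actually works
    ack_within_hours: int   # the acknowledged response expectation

# Illustrative routing table; every monitored metric must appear here.
ROUTES = {
    "marketing_cpl_paid_search":      Route("marketing director on duty", "#mkt-oncall", 4),
    "support_ticket_volume_sev1":     Route("support manager", "#support-leads", 2),
    "infra_cost_per_active_customer": Route("engineering on-call", "#eng-oncall", 8),
}

def route_alert(metric: str) -> Route:
    """Refuse to monitor a metric that has no named owner."""
    if metric not in ROUTES:
        raise ValueError(f"{metric} has no named owner; do not ship this detector")
    return ROUTES[metric]
```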
The first month our team trusted anomaly detection enough to act on it without re-validating, I knew the calibration was finally right. Before that, every alert led to a 30-minute investigation that confirmed the alert was real, but slowly. Now alerts arrive with the dimension breakdown and we move directly to the conversation that matters.
— VP Operations, B2B SaaS client
Frequently asked
Why does anomaly detection fail in most BI deployments?
Because teams turn it on across every metric in the warehouse, get hundreds of alerts a day, mute the channel, and conclude the feature doesn't work. The fix is curation: monitor the small set of metrics that drive decisions, calibrate seasonality and known cycles, and route alerts to the person who can act. Spreading detection across every metric buys noise, not coverage.
How many metrics should be monitored?
Typically 30–80 in a mid-market business. The right list is the metrics for which a 20% move would trigger an executive call. Build it with the operations and finance leaders who own the levers. The remaining metrics stay queryable in the warehouse but unmonitored. The discipline of curating the list is most of the project's value.
How is seasonality handled?
With seasonal-decomposition methods — STL, Prophet-style, or Bayesian structural time series, depending on the metric. Anomalies are detected on the residual after trend and seasonality are removed, not on the raw value. Holiday and campaign overlays are explicit inputs so the model knows the spike was expected. Calibration discipline accounts for roughly 80% of the work.
What makes an alert actionable instead of noise?
Explanation. The alert states the metric move, the dimension that drove it (segment, channel, geography), the adjacent metrics that also moved, and a suggested investigation path. Operators act on explained anomalies and ignore unexplained ones. The detector that says 'metric X dropped 18%, driven by SMB paid-search, correlated with a CPL spike on brand keyword' is the one that earns the seat at the table.
How does the system avoid alerting on expected events like promotions?
Holiday and campaign calendars are explicit inputs to the detection model. The ops and marketing teams maintain the calendar; the analytics team consumes it. When a campaign isn't flagged, anomalies fire and the campaign gets logged retroactively — which is also useful as documentation. The discipline reinforces itself.
Who should receive anomaly alerts?
The owner of the operational lever for the metric. Marketing-CPL alerts go to marketing leadership; support-volume alerts go to support management; infrastructure-cost alerts go to engineering on-call. Routing is per metric, with a named owner and an acknowledged response expectation. Alerts to a generic channel get ignored. Alerts to the right person get acted on.