Operations · 2026-04-17

Demand sensing in retail: forecasting for decision-readiness

Forecast accuracy is the wrong KPI. A 92%-accurate forecast that lands after the replenishment cut-off is worth less than a 78% forecast that lands in time.

Kevin Didelot · 11 min read

For ten years, retail forecasting teams have been measured on a single number: accuracy. MAPE, WAPE, bias on the trailing eight weeks. Steering committees argue over whether the model has crossed 88% or 91%. New vendors get benchmarked on three test categories before any conversation about deployment. And every season, the team comes back with a model a few points better than the previous one.

And every season, the operational numbers — service rate, replenishment hit rate, end-of-season residual — barely move.

This is not a measurement artifact. It is the symptom of a structural mismatch between what the forecasting team optimizes and what the chain actually needs to decide. The industry has spent a decade refining the accuracy of forecasts that arrive after the moment the decision had to be taken. Or at a granularity that doesn't match how the action is executed.

The right KPI was never accuracy. It was decision-readiness.

Demand sensing — short-horizon, high-recency, decision-coupled forecasting — exists for exactly this reason. Not because long-horizon ML got better. Because the premise of weekly batch forecasting is incompatible with the continuous decision loops modern retail runs on. This article makes that case, and draws the operational implications for the forecasting team.

A broader version of this argument (why ML alone isn't enough, and what decision layer needs to sit on top) is laid out in "From forecasting to decision." This article is the next layer down, specific to the forecasting team's KPI and architecture.

Why accuracy as a KPI quietly stopped making sense

The accuracy KPI was inherited from the era when forecasts were used to set seasonal plans. In that world, a forecast was produced once per season, fed an open-to-buy, drove a one-shot ordering decision. The forecast and the decision happened at roughly the same cadence — slow, batched, committee-paced. Accuracy was a reasonable proxy: a better forecast meant a better seasonal plan.

That world is gone. The dominant retail decision loop today is not seasonal — it is continuous. Replenishment runs daily or several times per day. Markdowns are re-arbitrated multiple times per season, sometimes weekly on at-risk SKUs. Inter-store transfers, e-commerce allocation, fulfillment routing all happen on horizons measured in hours, not weeks.

When the decision cadence accelerates and the forecast cadence doesn't, accuracy as a KPI starts measuring the wrong thing. A weekly batch forecast computed Sunday night, propagated Monday morning, and delivered to the planner Monday afternoon has lost three or four days of signal by the time it's used. The two percentage points of accuracy gained over the previous model are dwarfed by the staleness: the gap between the data the model saw and the reality the decision actually has to address.

This is not a hypothetical. In most chains, the dominant source of forecast error at the moment of decision is not model error. It is staleness error — the error introduced by the gap between forecast generation and decision execution. And staleness isn't on the team's dashboard.
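One way to put staleness on a dashboard is to log, for each decision, the time elapsed between the last observation the forecast saw and the moment the decision was executed. The sketch below is a minimal illustration under stated assumptions: the `staleness_hours` helper and every timestamp are hypothetical, chosen only to show the calculation, and are not measurements from a real chain.

```python
from datetime import datetime


def staleness_hours(data_cutoff: datetime, decision_time: datetime) -> float:
    """Hours between the last signal a forecast saw and the decision it feeds."""
    if decision_time < data_cutoff:
        raise ValueError("decision cannot precede the forecast's data cutoff")
    return (decision_time - data_cutoff).total_seconds() / 3600.0


# Hypothetical timestamps, chosen only for illustration.
decision_time = datetime(2026, 4, 16, 10, 0)    # Thursday replenishment run
weekly_cutoff = datetime(2026, 4, 12, 22, 0)    # weekly batch, data frozen Sunday night
coupled_cutoff = datetime(2026, 4, 16, 9, 30)   # forecast recomputed just before the decision

batch_gap = staleness_hours(weekly_cutoff, decision_time)
coupled_gap = staleness_hours(coupled_cutoff, decision_time)
print(f"weekly batch: {batch_gap:.1f} h stale, decision-coupled: {coupled_gap:.1f} h stale")
```

In this made-up case the weekly batch is three and a half days behind the decision, while the recomputed forecast is half an hour behind it. Aggregating this gap per decision loop is one plausible way to make staleness visible next to WAPE.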

Decision-readiness: a definition the industry doesn't use yet

Let's define the missing KPI cleanly. A forecast is decision-ready when three conditions hold simultaneously at the moment a decision must be taken:

The forecast exists for the relevant horizon and scope.
It is at the granularity the decision is taken at — SKU/store/day, not category/week.
It is recent enough that the most recently observed signals are incorporated.

If any of the three fails, the forecast is not decision-ready, regardless of how accurate it would be on a backtest. A SKU/store/week forecast can't drive a daily store-level replenishment decision — the granularity is wrong. A Monday-morning forecast can't drive a Tuesday-afternoon transfer arbitrage that depends on Tuesday-morning sell-through — the recency is wrong. A forecast computed only for the top 20% of the assortment can't drive a long-tail re-buy — the scope is wrong.

Decision-readiness is a binary, decision-by-decision check. Either the forecast was usable at the moment of the decision, or it wasn't. The aggregate metric — what percentage of operational decisions had a decision-ready forecast available? — is what the forecasting team should be measured on.

In most chains we've looked at, this number is shockingly low. Not because the models are bad. Because the forecast architecture was designed for a different decision cadence than the one operations now run on.
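The three-condition check and the aggregate rate can be sketched in a few lines. This is an illustrative sketch, not a reference implementation: the `Forecast` and `Decision` records, their field names, and the `max_staleness` threshold (one possible way to make "recent enough" testable) are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Forecast:
    granularity: str            # e.g. "sku/store/day"
    skus: FrozenSet[str]        # scope actually covered
    horizon_days: int
    data_cutoff: datetime       # last observation incorporated


@dataclass(frozen=True)
class Decision:
    taken_at: datetime
    granularity: str
    skus: FrozenSet[str]        # scope the decision needs
    horizon_days: int
    max_staleness: timedelta    # hypothetical threshold for "recent enough"


def is_decision_ready(forecast: Optional[Forecast], decision: Decision) -> bool:
    """Binary check: all three conditions must hold at decision time."""
    if forecast is None:
        return False
    covers_scope = (decision.skus <= forecast.skus
                    and forecast.horizon_days >= decision.horizon_days)
    right_grain = forecast.granularity == decision.granularity
    age = decision.taken_at - forecast.data_cutoff
    fresh = timedelta(0) <= age <= decision.max_staleness
    return covers_scope and right_grain and fresh


def decision_readiness_rate(
    pairs: Iterable[Tuple[Optional[Forecast], Decision]]
) -> float:
    """Fraction of decisions that had a decision-ready forecast available."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no decisions to evaluate")
    return sum(is_decision_ready(f, d) for f, d in pairs) / len(pairs)


# Hypothetical usage: one daily replenishment decision, two candidate forecasts.
now = datetime(2026, 4, 14, 15, 0)
decision = Decision(now, "sku/store/day", frozenset({"A", "B"}), 7, timedelta(hours=6))
fresh_daily = Forecast("sku/store/day", frozenset({"A", "B", "C"}), 7, now - timedelta(hours=2))
weekly_cube = Forecast("sku/store/week", frozenset({"A", "B", "C"}), 28, now - timedelta(hours=40))
print(decision_readiness_rate([(fresh_daily, decision), (weekly_cube, decision)]))
```

The check is deliberately binary per decision, matching the definition above. Whatever accuracy the weekly cube would score on a backtest, it fails here on both granularity and recency.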

Why "more frequent forecasts" is the wrong fix

The reflex, once the staleness problem is named, is to run the forecast more often. Daily instead of weekly. Twice a day instead of daily. Some chains have gone to hourly batches on high-velocity SKUs.

This helps at the margin. It also misses the point.

Frequency is a symptom-level fix to an architecture-level problem. As long as the forecast and the decision are decoupled, increasing frequency just narrows the gap statistically. It doesn't eliminate it. And it multiplies compute cost, governance burden, and the risk of presenting a planner with two contradictory recommendations from two adjacent runs.

The right architecture is not "forecast more often." It is "forecast coupled to decision": the forecast is recomputed because a decision-triggering event arrived, on the scope the decision needs, with a freshness guaranteed by the trigger itself. The forecast becomes a function call inside the decision loop, not a scheduled job in a parallel pipeline.

This is what demand sensing, when the term is used precisely, actually means. Not "short-term forecast." Not "incorporates more recent signals." A forecast architecture in which the forecast is generated, on the granularity needed, at the moment the decision is about to be taken. The "sensing" is the coupling to the decision-triggering signal, not the recency of the input data per se.

What couples a forecast to a decision

Three architectural shifts separate a decision-coupled forecast from a batch one.

The same demand forecast, delivered two ways. Batch — runs on a clock, produces a fixed SKU-by-store-by-week cube, lands in a file or dashboard that goes stale. Decision-coupled — runs on an event (a replenishment cut-off, a markdown committee), produces only the slice the decision needs, delivered into the decision context at the moment it is computed.

First, the trigger. A batch forecast runs on a clock. A decision-coupled forecast runs on an event — a stockout alert, a replenishment cut-off approaching, a transfer arbitrage request from a store, a markdown committee opening. The trigger carries the scope and the horizon. The forecasting layer responds to it, not to a cron schedule.

Second, the granularity contract. A batch forecast produces a fixed cube — say, SKU × store × week, for the whole assortment. A decision-coupled forecast produces only the slice the decision needs, at the granularity the decision is taken at.

A replenishment cut-off for store 47 needs SKU/day forecasts for the next seven days on the SKUs in scope. Nothing more. The compute is bounded, the output is tight, the propagation is fast.

Third, the delivery surface. A batch forecast is delivered into a file, a table, a dashboard. A decision-coupled forecast is delivered into the decision context — the replenishment engine, the markdown arbitration screen, the transfer recommender — at the moment the decision is being computed. The forecasting team stops shipping numbers; it ships answers to questions a downstream system is asking.

When these three are in place, the accuracy debate gets smaller. A SKU/day forecast generated three minutes before the cut-off, on the exact SKUs in scope, the morning's POS already incorporated, will almost always beat the 92% weekly batch. Even if its backtested accuracy is a few points lower on paper. Because backtested accuracy isn't what the chain pays for. Executed-decision quality is.
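To make the three shifts concrete, here is a minimal sketch of an event-triggered forecast. Everything named in it is hypothetical: `DecisionEvent`, `forecast_slice`, `handle_event`, and the constant placeholder model stand in for whatever event bus, model, and decision system a chain actually runs. The event carries scope and horizon (the trigger), the function computes only that slice (the granularity contract), and the result is handed to a callback supplied by the decision system (the delivery surface).

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Dict, Tuple

# Hypothetical model interface: (store_id, sku, day) -> forecast units.
Model = Callable[[int, str, date], float]
Slice = Dict[Tuple[str, date], float]


@dataclass(frozen=True)
class DecisionEvent:
    kind: str                 # e.g. "replenishment_cutoff"
    store_id: int
    skus: Tuple[str, ...]     # scope carried by the trigger
    start: date
    horizon_days: int         # horizon carried by the trigger


def forecast_slice(event: DecisionEvent, model: Model) -> Slice:
    """Compute SKU/day forecasts only for the slice the event asks for."""
    days = [event.start + timedelta(days=i) for i in range(event.horizon_days)]
    return {(sku, day): model(event.store_id, sku, day)
            for sku in event.skus for day in days}


def handle_event(event: DecisionEvent, model: Model,
                 deliver: Callable[[DecisionEvent, Slice], None]) -> Slice:
    """Run on an event, not a clock, and deliver into the decision context."""
    result = forecast_slice(event, model)
    deliver(event, result)
    return result


def flat_model(store_id: int, sku: str, day: date) -> float:
    """Placeholder model returning a constant; a real model would go here."""
    return 4.0


# Replenishment cut-off for store 47: SKU/day forecasts for the next seven days.
event = DecisionEvent("replenishment_cutoff", 47, ("sku-001", "sku-002"),
                      date(2026, 4, 14), 7)
inbox = []  # stands in for the replenishment engine's input
result = handle_event(event, flat_model, lambda e, s: inbox.append((e.kind, s)))
print(len(result), "forecast cells delivered for store", event.store_id)
```

The compute is bounded by the event: two SKUs over seven days yields fourteen cells, rather than a full SKU × store × week cube.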

What this means for the forecasting team

This isn't a request to throw away the existing models. The ML stack a chain has built — gradient boosting on demand, hierarchical reconciliation, intermittent-demand handling, promo lift modeling — all of that stays useful. What changes is how it's packaged and deployed.

Three concrete shifts:

The team stops shipping a weekly forecast cube and starts shipping a forecast service. It's an API the decision systems call when they need a forecast for a specific scope, horizon, and granularity. The forecast moves from a deliverable to a capability.

The team's success KPI changes. Not WAPE on the trailing eight weeks. The decision-readiness rate — the fraction of operational decisions, across replenishment, markdown, transfer, allocation, that had a decision-ready forecast available at the moment the decision was taken. This number is computed at the decision layer, not the forecasting layer. The forecasting team starts being measured on something it can only improve by working with the operations.

The team's roadmap changes. Less time on the next two points of accuracy. More time on the latency budget — how fast can a forecast for an arbitrary slice be computed and delivered into a decision context? And more time on the integration surface — how many decision systems can call the forecast service today, on what scopes, with what guarantees?
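As a rough sketch of what "forecast service" and "latency budget" could look like as a contract, consider the following. The request and response shapes, the `serve_forecast` function, the supported-granularity set, and the naive placeholder predictor are assumptions made for illustration only. They do not describe any particular product's API.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

SUPPORTED_GRANULARITIES = {"sku/store/day"}  # hypothetical service capability


@dataclass(frozen=True)
class ForecastRequest:
    store_id: int
    skus: Tuple[str, ...]
    horizon_days: int
    granularity: str
    latency_budget_ms: float   # how long the calling decision system can wait


@dataclass(frozen=True)
class ForecastResponse:
    values: Dict[str, Tuple[float, ...]]   # sku -> one value per day
    latency_ms: float
    within_budget: bool


def serve_forecast(
    request: ForecastRequest,
    predict: Callable[[int, str, int], Sequence[float]],
    clock: Callable[[], float] = time.perf_counter,
) -> ForecastResponse:
    """Answer one decision system's question and report the latency spent."""
    if request.granularity not in SUPPORTED_GRANULARITIES:
        raise ValueError(f"unsupported granularity: {request.granularity}")
    started = clock()
    values = {sku: tuple(predict(request.store_id, sku, request.horizon_days))
              for sku in request.skus}
    latency_ms = (clock() - started) * 1000.0
    return ForecastResponse(values, latency_ms,
                            latency_ms <= request.latency_budget_ms)


def naive_predict(store_id: int, sku: str, horizon_days: int) -> Sequence[float]:
    """Placeholder predictor: a flat profile, standing in for the real model."""
    return [3.0] * horizon_days


request = ForecastRequest(47, ("sku-001",), 7, "sku/store/day", latency_budget_ms=200.0)
response = serve_forecast(request, naive_predict)
print(response.within_budget, len(response.values["sku-001"]))
```

Returning the measured latency with every response lets the decision layer log both whether a forecast arrived and whether it arrived in time, which is the raw material for a decision-readiness rate.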

This is a real reorientation. It will be uncomfortable for teams whose performance reviews still cite WAPE improvements. It is also the only path that turns a decade of forecasting investment into operational performance the CFO can read on the P&L. That is what we see across the chains we work with.

The Solya angle: forecast as a decision-layer capability

This is exactly the logic Solya's decision layer is built around. The forecast isn't a separate pipeline producing files into the data warehouse. It is a capability the decision layer calls — at the moment a replenishment is being computed, a markdown arbitrated, a transfer scored. It responds on the exact scope and granularity the decision needs, with the freshest available signal incorporated.

The forecasting models — yours, ours, hybrid — sit behind that capability. They keep doing what they're good at: predicting demand. What changes is the contract they expose to the rest of the chain. A decision system asks for a forecast; the forecast layer returns one, fast enough and specific enough to be usable. The decision-readiness rate becomes measurable, and improvable, because the architecture supports it.

Chains that adopt this pattern stop debating accuracy and start debating coverage — which decision loops now have a decision-ready forecast available. And which ones still don't. That is a much more productive conversation. And it correlates with operational outcomes in a way accuracy never did.

The question to take back to the team

This reframing sits one layer below the broader closed decision-execution loop the best retailers are converging on. If you are also actively evaluating vendors, the demand planning software buyer's guide maps the same forecasting-first vs decisioning-first split onto the category.

If you lead a forecasting team, or you're the operations leader the team delivers to, ask one question at your next review. For every operational decision the chain took last week, was a decision-ready forecast available at the moment of the decision?

Not "what was the WAPE?" Not "did the model beat the baseline?" The decision-readiness rate, decision by decision, aggregated to a single percentage. If it's below 60%, the next point of accuracy is the wrong investment. The architecture is the right one.

The forecasting team has been doing skilled work in the wrong frame. Reframing isn't a critique of the team — it is a critique of the KPI the team has been handed for a decade. The teams that change frame first will be the ones whose forecasting investment finally moves the operational numbers. The teams that keep optimizing accuracy will keep wondering why their excellent models don't show up on the P&L.

Is your forecasting stack ready for the decision loop?

At Solya, we offer supply chain and merchandising leadership a personalized 30-minute diagnostic to assess, on your own assortment and decision cadence, how decision-ready your current forecasting stack is, and to identify where decoupling between forecast and decision is costing measurable margin today.

You'll walk away with:

A decision-readiness map of your top three operational loops (replenishment, markdown, transfer)
An estimate of the staleness gap between your forecast cadence and your decision cadence
The concrete shifts to make your forecast layer callable by your decision systems

Kevin Didelot, Co-founder & CTO, Solya

Operations · 2026-05-12