Operations · 2026-04-22

AI assortment planning: the highest-leverage decision in retail

Assortment is set once a season and lives 6–9 months. No in-season tuning recovers a bad pre-season call — which is exactly where most assortment AI fails.

Kevin Didelot · 11 min read

In every chain I've worked with, the same asymmetry holds. A pre-season assortment decision lives for 6 to 9 months. An in-season replenishment decision lives for two weeks. Yet most of the data investment, most of the ML talent, and most of the steering attention go into the second one. The first gets a buying committee, a spreadsheet, and a year of hindsight.

That asymmetry isn't just a misallocation of effort. It's the structural reason a lot of retailers feel like their data stack is "working" (markdowns optimized, replenishment automated, transfers fluid) and yet the season still ends with the same overstock pile and the same margin gap as the year before.

Because every in-season optimization, however good, operates inside the box that the assortment set. You can rearrange the boxes. You can't change which boxes were bought.

This article looks at pre-season assortment as it actually is: the highest-leverage decision in retail, the place where ML should produce the most value. And the place where ML most often produces a beautiful model with a 15% adoption rate. The reason isn't algorithmic. It's that the people who own assortment — merch directors, buyers, category managers — are right to ignore recommendations that don't carry their constraints.

Why assortment sits at the top of the decision hierarchy

Retail decisions form a hierarchy, and assortment sits at the top. Buying decides the universe of what a chain can sell next season. That universe has four dimensions: width (how many families, sub-families, brands), depth (how many references per cell), volume per reference, and distribution per store cluster. Every other operational decision downstream — pricing, replenishment, allocation, markdowns, transfers, end-of-life — works on the universe assortment defined.

This hierarchy has a brutal consequence. A 10% improvement in markdown logic recovers a few margin points on the products you bought wrong. A 10% improvement in pre-season assortment changes which products you bought in the first place. The second compounds across every downstream decision. The first doesn't.

And yet pre-season decisions are made in conditions that would horrify anyone used to operational analytics. A buyer commits, six to nine months in advance, to a volume on a reference whose real demand they can only guess at. That commitment lives inside the open-to-buy envelope set by the merchandise financial plan. The inputs are last season's sell-through, the supplier's collection narrative, a few trend reports, and the buying committee's collective intuition. Once committed, the volume is locked.

No agility in the world recovers a structural over-buy or under-buy. The markdown will hit the over-buy, the stockout will hit the under-buy. Both will be visible in the P&L six months later.

That's the leverage point. That's also why the ML opportunity here is enormous — and why it's almost always botched.

The pattern of failed AI assortment projects

The pattern is consistent enough that I can describe it without naming anyone. A chain invests in an AI assortment planning project. The data science team builds a demand forecasting model and layers an optimization engine on top. It produces a recommended pre-season buy: which references, how many units, distributed how across the store network.

The model is technically excellent. Forecast accuracy on the holdout set is strong. The optimization respects the global open-to-buy.

The recommendations are presented to the buying team a few weeks before the next collection commitment. And then the same scene repeats. The buying director opens the recommendation, scans it, and raises the same set of objections.

"This volume on brand X assumes a contract we don't have." "This depth on the entry-price cluster ignores that the supplier won't deliver below 5,000 units per SKU." "This recommendation cuts the brand-Y exclusivity SKUs we negotiated three months ago — those are non-negotiable." "You've over-indexed the technical sub-family because last season's sell-through was inflated by a competitor's stockout, not by real demand."

Two or three meetings later, the team agrees to "use the model as input." In practice, the buying director glances at it, takes the two or three lines that confirm their existing view, and ignores the rest. Adoption rate on the actual buy, measured a year later, lands somewhere between 12% and 25%. That's the typical range for assortment AI in chains where the model wasn't designed around buyer constraints.
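The article does not say how adoption is computed, so the following is only one plausible sketch: a recommended line counts as adopted when the committed volume lands within a tolerance of the recommended volume. The function name, the 10% tolerance, and the sample SKUs are assumptions made for illustration, not figures from any real chain.

```python
def adoption_rate(recommended, committed, tolerance=0.10):
    """Share of recommended lines whose committed units match the recommendation.

    Assumed definition: a line is 'adopted' when the committed volume is
    within `tolerance` (relative) of the recommended volume. A recommended
    zero is adopted only if nothing was committed.
    """
    if not recommended:
        return 0.0
    adopted = 0
    for sku, rec_units in recommended.items():
        actual = committed.get(sku, 0)
        if rec_units == 0:
            adopted += int(actual == 0)
        elif abs(actual - rec_units) / rec_units <= tolerance:
            adopted += 1
    return adopted / len(recommended)


# Hypothetical data: what the model proposed vs. what the buyer committed.
recommended = {"REF-A": 1000, "REF-B": 5000, "REF-C": 0, "REF-D": 2000}
committed = {"REF-A": 1050, "REF-B": 8000, "REF-C": 0, "REF-D": 0}

rate = adoption_rate(recommended, committed)
print(f"adoption rate: {rate:.0%}")
```

Other definitions are defensible (weighting by units or by open-to-buy value, for instance). What matters is that the measure is taken on the actual commitment, not on how often the dashboard was opened.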

The model still runs. The dashboard still gets shown in steering committees. The buying committee still meets in a windowless room with a spreadsheet.

The failure isn't technical. It's that the model never integrated the constraints that actually determine the decision. And those constraints aren't a small adjustment layer — they're the bulk of the decision.

What the model didn't know

When you pull apart what a buying director carries in their head during a pre-season committee, you find five categories of input the data system almost never has.

Supplier contracts and minimums

Every category is bought against a web of supplier agreements. Think minimum order quantities per SKU, minimum total volume per supplier, exclusivity windows, payment terms tied to volume, returnable percentages, late-delivery penalties. A recommendation that proposes 800 units of a reference whose MOQ is 5,000 isn't a useful recommendation — it's an unrecognized constraint violation. Multiply that across a 200-supplier portfolio and the model's output stops being a buy plan.
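To make the MOQ example concrete, here is a minimal sketch of the check a recommendation would need to pass before it reaches a buyer. The `BuyLine` structure, the function name, and the SKU and supplier identifiers are hypothetical. Only the 800-versus-5,000 case comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuyLine:
    sku: str
    supplier: str
    units: int


def moq_violations(lines, moq_by_sku):
    """Return (sku, units, moq) for every non-zero line below its MOQ.

    A line of zero units is not a violation: not buying is always allowed.
    """
    return [
        (line.sku, line.units, moq_by_sku[line.sku])
        for line in lines
        if line.sku in moq_by_sku and 0 < line.units < moq_by_sku[line.sku]
    ]


# Hypothetical plan: REF-A reproduces the 800 units vs. MOQ 5,000 case.
plan = [
    BuyLine("REF-A", "SUP-1", 800),
    BuyLine("REF-B", "SUP-1", 6000),
    BuyLine("REF-C", "SUP-2", 0),
]
moqs = {"REF-A": 5000, "REF-B": 5000, "REF-C": 2000}

violations = moq_violations(plan, moqs)
for sku, units, moq in violations:
    print(f"{sku}: {units} units recommended, supplier minimum is {moq}")
```

A check like this only detects the problem after the fact. The later sections argue that the minimum belongs inside the optimization itself.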

Vendor calendar constraints

Collections don't arrive on a flat calendar. Some suppliers ship in two drops, some in four, some have a 14-week production lead time on top of a fixed showroom window in which the commitment must be made. A recommendation that ignores when the commitment must be locked — versus when the demand signal becomes readable — is operationally meaningless. Buyers don't ignore the recommendation because they distrust ML. They ignore it because it proposes a decision they couldn't execute even if they wanted to.
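The timing argument can be sketched as a simple date comparison. The function names and all dates below are illustrative assumptions. Only the 14-week lead time and the idea of a fixed showroom window come from the text.

```python
from datetime import date, timedelta


def latest_commit_date(in_store, lead_time_weeks, showroom_close):
    """Last day a commitment can be made and still be executable.

    The commitment must respect both the production lead time and the
    supplier's showroom window, whichever closes first.
    """
    production_deadline = in_store - timedelta(weeks=lead_time_weeks)
    return min(production_deadline, showroom_close)


def is_executable(signal_readable, commit_by):
    """A recommendation is actionable only if the demand signal it relies on
    is available on or before the commitment deadline."""
    return signal_readable <= commit_by


# Hypothetical supplier: 14-week lead time, showroom closes mid-May.
commit_by = latest_commit_date(
    in_store=date(2026, 9, 1),
    lead_time_weeks=14,
    showroom_close=date(2026, 5, 15),
)
# Hypothetical signal: early sell-through becomes readable in June.
executable = is_executable(date(2026, 6, 10), commit_by)

print(commit_by, executable)
```

In this made-up case the showroom window closes before the production deadline, and the signal arrives after both, so a recommendation built on that signal could not be executed however accurate it was.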

Category role logic

Every category in the assortment plays a role: traffic-driver, margin-builder, signature, fill. A category designated as traffic-driver isn't bought on margin maximization — it's bought to defend a price point or a position. A category designated as signature isn't bought on volume — it's bought to anchor the chain's positioning. Sector specifics sharpen this further, as the case of AI assortment and allocation in beauty retail shows. A model trained to maximize a global objective function will systematically over-buy margin-builders and under-buy traffic-drivers, because its objective doesn't know that categories play different roles.
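A toy example shows why a single global objective crowds out traffic-drivers. The candidates, margins, and the minimum-width rule below are invented for illustration, and the greedy selection is a deliberately simple stand-in for a real optimizer.

```python
def pick_global(candidates, slots):
    """Fill the available slots purely by expected margin."""
    return sorted(candidates, key=lambda c: c["margin"], reverse=True)[:slots]


def pick_with_roles(candidates, slots, min_width):
    """Reserve a minimum width per role first, then fill the rest by margin."""
    chosen = []
    for role, width in min_width.items():
        in_role = sorted(
            (c for c in candidates if c["role"] == role),
            key=lambda c: c["margin"],
            reverse=True,
        )
        chosen.extend(in_role[:width])
    remaining = sorted(
        (c for c in candidates if c not in chosen),
        key=lambda c: c["margin"],
        reverse=True,
    )
    chosen.extend(remaining[: max(0, slots - len(chosen))])
    return chosen


# Hypothetical candidates: traffic-drivers carry low margin by design.
candidates = [
    {"ref": "M1", "role": "margin-builder", "margin": 40},
    {"ref": "M2", "role": "margin-builder", "margin": 35},
    {"ref": "M3", "role": "margin-builder", "margin": 30},
    {"ref": "T1", "role": "traffic-driver", "margin": 12},
    {"ref": "T2", "role": "traffic-driver", "margin": 10},
]

global_pick = [c["ref"] for c in pick_global(candidates, slots=3)]
role_pick = [c["ref"] for c in pick_with_roles(candidates, 3, {"traffic-driver": 1})]
print(global_pick, role_pick)
```

The global objective never selects a traffic-driver because it cannot see why one would be worth a slot. The role-aware version keeps one because the role is stated as a requirement rather than left for the margin figures to justify.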

Negotiated commitments

Half of a buying year is spent in negotiation. "We commit to 12,000 units on this collection in exchange for exclusivity on three references and a 3% extra discount." "We take 60% of last season's volume on this supplier to preserve the relationship, even if the model says 40%."

These commitments aren't documented in any structured system. They live in the buying director's notebook, in emails, in the relationship. A recommendation that contradicts them isn't wrong — it's just not actionable, because the actionable space was already partially locked.

Merch intuition that compresses real signal

Experienced merchandisers carry pattern recognition no model trained on five seasons of data can match. That brand X has been weakening for two seasons even though the numbers haven't fully caught up. That a sub-family's growth was inflated by a one-off campaign. That a new entrant in a category is about to change the price floor. These aren't superstitions. They're compressed signal from a much longer time horizon than the model's training window.

A model that treats these five categories as filters to apply after generating a recommendation is doomed. By the time the filters run, the optimization has already over-allocated to volumes that violate them. And the residual plan looks nothing like the recommended plan — at which point the buyer reasonably concludes the model wasn't helpful.

Constraints as first-class inputs, not after-the-fact filters

One architectural choice separates assortment AI that gets adopted from assortment AI that doesn't. It comes down to this: are business rules and merch intuition inputs the optimization sees, or filters applied to its output?

Figure: Business rules and merch intuition handled two ways in assortment AI. As after-the-fact filters, the model optimizes, then strips out whatever violates a rule, leaving an incoherent remainder that buyers archive (~15% adoption). As first-class inputs, the model knows the MOQs, exclusivities and width targets before it solves, so the plan is executable by construction, and adopted.

When constraints are inputs, the model knows that brand X has an MOQ of 5,000 before it allocates volume. It knows the supplier-Y exclusivity SKUs are locked at a specific depth before it solves for the rest. It knows the traffic-driver category has a width target independent of margin. It knows the buying director flagged a sub-family as overstated last season. The recommendation it produces is, by construction, executable.

When constraints are filters, the model produces a mathematically optimal plan and then strips out everything that violates a rule. What's left isn't a coherent plan — it's a remainder. Buyers can feel the difference instantly. The first kind of recommendation reads like a colleague who understands the trade-offs and is proposing a defensible plan. The second reads like a vendor pitch ignoring how the business actually works.
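The difference between the two approaches can be reproduced on a three-reference toy problem. Everything below is an assumption made for illustration: the demand, margin, and MOQ figures, the overstock cost, the unit-based budget, and the brute-force search, which stands in for whatever solver a real system would use.

```python
from itertools import product

# Hypothetical references: forecast demand, unit margin, supplier MOQ.
REFS = {
    "A": {"demand": 4000, "margin": 10, "moq": 5000},
    "B": {"demand": 3000, "margin": 8, "moq": 1000},
    "C": {"demand": 6000, "margin": 6, "moq": 5000},
}
BUDGET_UNITS = 10_000   # stand-in for the open-to-buy envelope
OVERSTOCK_COST = 4      # assumed loss per unit bought beyond demand
STEP = 1_000            # granularity of the brute-force search


def profit(plan):
    """Margin on units sold minus a penalty on units left over."""
    total = 0
    for ref, units in plan.items():
        sold = min(units, REFS[ref]["demand"])
        total += REFS[ref]["margin"] * sold - OVERSTOCK_COST * (units - sold)
    return total


def unconstrained_plan():
    """Optimum when MOQs are ignored: fill demand by descending margin."""
    plan, left = {}, BUDGET_UNITS
    for ref in sorted(REFS, key=lambda k: REFS[k]["margin"], reverse=True):
        plan[ref] = min(REFS[ref]["demand"], left)
        left -= plan[ref]
    return plan


def filter_after(plan):
    """Constraints as filters: drop every line that violates its MOQ."""
    return {ref: (u if u >= REFS[ref]["moq"] else 0) for ref, u in plan.items()}


def solve_with_moq():
    """Constraints as inputs: search only plans where each line is 0 or >= MOQ."""
    names = list(REFS)
    options = [[0] + list(range(REFS[n]["moq"], BUDGET_UNITS + 1, STEP)) for n in names]
    feasible = (
        dict(zip(names, combo))
        for combo in product(*options)
        if sum(combo) <= BUDGET_UNITS
    )
    return max(feasible, key=profit)


filtered = filter_after(unconstrained_plan())
aware = solve_with_moq()
print("filtered:", filtered, profit(filtered))
print("aware:   ", aware, profit(aware))
```

With these numbers the filtered plan keeps a single reference and leaves most of the budget unspent, while the constraint-aware plan spends the envelope on a different mix altogether. The second plan is not the first one with violations removed. It is a different plan, which is why the remainder of a filtered optimization tends to look arbitrary to a buyer.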

This is not a small architectural distinction. It's the difference between a model that gets used and a model that gets archived.

It also imposes specific things on the platform. The rules layer must be declarative (the merch team can express a new supplier minimum without an IT ticket). It must be contextual (a width rule that depends on cluster and brand, not a single global value). It must be hierarchical (when two rules conflict — a margin floor and a width target — the system arbitrates explicitly, not silently). And it must be observable (the buyer can see which constraints shaped a specific recommendation, and which were relaxed).
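As a rough sketch of what those four properties could look like in code, the example below expresses rules as plain data (declarative), matches them on a context such as role and cluster (contextual), resolves a conflict by explicit priority (hierarchical), and returns a trace of what was active and what was relaxed (observable). The `Rule` fields, the rule names, and the priority scheme are hypothetical and do not describe any particular product's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    name: str
    kind: str        # "min_width" or "max_width"
    value: int
    priority: int    # lower number wins a conflict
    scope: dict = field(default_factory=dict)  # context keys the rule applies to


def resolve_width(rules, context):
    """Return (min_width, max_width, trace) for one planning context."""
    # Contextual: a rule is active only if every key in its scope matches.
    active = [r for r in rules if all(context.get(k) == v for k, v in r.scope.items())]
    trace = [f"active: {r.name}" for r in active]

    low = max((r for r in active if r.kind == "min_width"),
              key=lambda r: r.value, default=None)
    high = min((r for r in active if r.kind == "max_width"),
               key=lambda r: r.value, default=None)
    lo_v = low.value if low else 0
    hi_v = high.value if high else None

    # Hierarchical: an explicit, logged arbitration instead of a silent one.
    if low and high and low.value > high.value:
        winner, loser = (low, high) if low.priority < high.priority else (high, low)
        trace.append(
            f"relaxed: {loser.name} ({loser.value} -> {winner.value}) "
            f"in favour of {winner.name}"
        )
        lo_v = hi_v = winner.value
    return lo_v, hi_v, trace


# Declarative: adding a rule is adding a row, not changing code.
rules = [
    Rule("traffic-driver width target", "min_width", 12, 1, {"role": "traffic-driver"}),
    Rule("margin floor width cap", "max_width", 8, 2, {"cluster": "small-format"}),
    Rule("flagship width target", "min_width", 20, 1, {"cluster": "flagship"}),
]
context = {"role": "traffic-driver", "cluster": "small-format", "brand": "X"}

lo, hi, trace = resolve_width(rules, context)
print(lo, hi)
for entry in trace:
    print(entry)
```

The trace is the part a buyer would actually read: it states which rules applied to this cell and which one gave way, so the conversation can move to whether the priority itself is right.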

Without those four properties, business rules degrade back into a filter — and the project degrades back into the same 15% adoption story.

What changes when assortment AI is built this way

The visible change is that buyers stop bypassing the system. The deeper change is that the buying committee shifts from a debate about volumes to a debate about constraints.

The committee stops arguing whether 8,000 units of a reference is the right call. Instead it discusses whether the supplier minimum should be renegotiated, whether the category role should shift from margin-builder to traffic-driver, whether last season's sell-through on a sub-family should be downweighted. These are the conversations that actually move the season. The volume number falls out of the constraints once they're explicit.

This is a more productive use of the buying committee's time. It's also a more durable form of organizational learning — every decision modifies the constraints, every constraint is versioned. And the next season's plan starts from a richer base than the spreadsheet that opened the previous one.
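One way to picture versioned constraints is an append-only log of committee decisions, each applied as an adjustment to the baseline forecast. This is a sketch under stated assumptions: the `Adjustment` fields, the multiplicative form, the choice to let successive adjustments compound, and all figures are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adjustment:
    version: int
    sub_family: str
    multiplier: float
    reason: str
    author: str


def adjusted_forecast(baseline, adjustments, as_of_version=None):
    """Apply logged adjustments in version order, optionally up to a version.

    Assumption: adjustments on the same sub-family compound rather than
    override each other.
    """
    out = dict(baseline)
    for adj in sorted(adjustments, key=lambda a: a.version):
        if as_of_version is not None and adj.version > as_of_version:
            break
        if adj.sub_family in out:
            out[adj.sub_family] = out[adj.sub_family] * adj.multiplier
    return out


# Hypothetical baseline forecast in units, and two committee decisions.
baseline = {"technical": 10_000, "basics": 20_000}
log = [
    Adjustment(1, "technical", 0.8,
               "last season's sell-through inflated by a competitor stockout",
               "buying director"),
    Adjustment(2, "basics", 1.1, "new entry-price range planned", "category manager"),
]

after_v1 = adjusted_forecast(baseline, log, as_of_version=1)
latest = adjusted_forecast(baseline, log)
print(after_v1)
print(latest)
```

Because each entry carries a reason and an author, next season's committee can see why the technical sub-family was downweighted and decide whether that judgment still holds.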

The other thing that changes is the relationship between pre-season planning and in-season operations. When the pre-season buy was made with the executable constraints in the model, the downstream systems — replenishment, allocation, markdown — inherit a coherent plan they can operate on.

When the pre-season buy was made by a buyer overriding a model they didn't trust, the downstream systems inherit a plan that exists nowhere structurally. It lives only in the buyer's notebook. Every downstream model then has to re-infer it from sell-through data. The cost of that re-inference, season after season, is one of the quiet drivers of in-season inefficiency.

The Solya angle

This is the logic Solya is built around. Not an assortment optimizer with a settings tab for business rules. Instead, a decision platform where the chain's constraints are first-class inputs to the optimization. Those constraints are supplier minimums, vendor calendars, category roles, negotiated commitments, and the merch team's intuition expressed as adjustments to the demand signal.

Concretely, on a specific recommendation, a buying director can see which rules are active, which constraints were binding, which were relaxed and by how much, and which merch judgments shifted the underlying forecast. A new supplier agreement signed in March is reflected in the April recommendation without an IT release. And the buying committee debates the constraints, not the model's outputs, which is the only conversation worth having at that altitude.

The result isn't a higher forecast accuracy. It's an adoption rate compatible with the leverage of the decision being made: a pre-season buy that the buying team actually owns, that the downstream systems can operate on coherently, and that compounds into the next season's plan rather than dying in a buyer's override.

The question to sit with

If you run, or sponsor, an assortment AI project: when your buying team rejects a recommendation, what reason do they give? If the answer is "it's wrong", the conversation is about model quality. If the answer is "it ignores how we actually buy", the conversation is about architecture — and no amount of model tuning will close that gap.

Most pre-season AI projects today are stuck in the second conversation while the steering committee is having the first. That misalignment is why so many of them are quietly described, after eighteen months, as "a useful input the team consults occasionally". Which is the operational definition of a project that didn't change anything.

For the adoption and KPI questions merchandising teams face, see our Merchandising Director FAQ.

Is your pre-season AI actually being used?

At Solya, we offer merch and buying leadership a personalized 30-minute diagnostic. On your own category mix and buying calendar, we assess the gap between what your assortment AI recommends and what your buying team actually commits to. And we identify the constraints that need to move from filter to first-class input before the next collection.

You'll walk away with:

A measurement of your current adoption rate on AI assortment recommendations
A map of the constraints your current model treats as filters rather than inputs
The first architectural changes to make before the next pre-season cycle

Kevin Didelot, Co-founder & CTO, Solya

Operations · 2026-05-12