Diagnostic, 2026-06-02

Generative AI in retail: where it helps, where it's hype

Generative AI is transforming retail's language tasks — search, content, copilots. It is not what decides a markdown or a replenishment. The two get conflated.

Kevin Didelot, 11 min read

"Generative AI" is the phrase that ate every retail technology conversation in the last two years. It is in every vendor deck, every board agenda, every analyst report. And like every phrase moving that fast, it has started to mean both everything and nothing — which is a problem, because the budgets attached to it are real.

This is a diagnostic, not a takedown. Generative AI is a genuinely powerful technology that is already creating value in retail. But it creates that value in specific places, for specific reasons — and it is being sold for a set of jobs it was never built to do. The gap between those two matters for where you spend, what you expect, and which problems you are actually solving.

Here is the honest state of generative AI in retail: what it is, where it helps, where it's theatre, and the one line that tells them apart.

What "generative AI" actually means (and what it doesn't)

Generative AI is, at its core, a language and content engine. A large language model predicts the next token; an image model learns to produce pixels in a similar statistical way, typically by refining noise into a picture step by step. What they are exceptional at is producing fluent, plausible, context-aware content (text, images, code, conversation) from a prompt.

That capability is real and new. For the first time, software can draft a product description, answer a free-text question, summarise a 40-page supplier contract, or translate a planogram brief into five languages. The marginal cost is near zero. None of that was practical at scale three years ago.

What generative AI is not is a decision engine. It does not natively optimise a constrained objective. Ask an LLM "how many units of SKU 4471 should I send to store 12?" and it will produce a confident, well-written number. Behind it there is no model of the demand, the constraints, the supplier minimums or the margin floor. It generates the form of an answer, not a decision you can execute.

This is the distinction the market keeps collapsing. A system that writes fluent language about retail is a different thing from a system that makes the operational calls that run retail. Both are "AI". Only one is generative. Confusing them is the root of most disappointed GenAI pilots.

Where generative AI genuinely helps in retail

Generative AI delivers where the output is language or content, where fluency, not numeric optimality, is the point. These are the generative AI use cases in retail that are real and already in production.

Product content at scale. Drafting and localising descriptions, attributes and category copy for a 50,000-SKU catalogue — work that was previously a bottleneck of human copywriting.
Search and discovery. Turning a shopper's free-text or conversational query into relevant results, where the model's job is to understand language, not to decide stock.
Merchant and planner copilots. Letting a category manager ask, in plain English, "which stores are overstocked on autumn knitwear?" and getting a sourced, readable answer back — exactly the pattern in our merchant Q&A in Slack use case.
Summarisation and synthesis. Compressing supplier emails, market reports or a week of sales commentary into a brief a human can act on.
Internal knowledge access. Q&A over policies, planograms and process docs, so the answer comes to the person instead of the person hunting the wiki.

The common thread: in every one of these, the deliverable is language a human reads or content a customer sees. The model is doing what it was built for. The value is the fluency and the speed, and it is substantial — these are not toys.

Where it's theatre

The trouble starts when generative AI is pointed at jobs whose real output is a constrained numeric decision dressed up as a conversation.

"Ask the AI what to mark down." A markdown decision is an optimisation over price elasticity, remaining stock, margin floors and a regulated calendar. An LLM can narrate a markdown plan beautifully. It cannot guarantee the depth respects the floor, or that the timing clears the season — because it is not solving that problem, it is describing one. This is the gap we draw in full in prescriptive analytics in retail.
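
To make the contrast concrete, here is a minimal sketch of what "solving" the markdown problem looks like, as opposed to describing it. Everything in it is an illustrative assumption rather than anyone's production method: the `MarkdownInput` fields, the constant-elasticity demand curve, the 5-point grid of depths and the gross-margin objective are all hypothetical, and a real engine would also handle the regulated calendar. The point is only that the margin floor is enforced by construction.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MarkdownInput:
    full_price: float         # current shelf price
    unit_cost: float          # landed cost per unit
    margin_floor: float       # minimum margin rate allowed, e.g. 0.20
    stock: int                # units left to clear
    base_weekly_units: float  # expected weekly sales at full price
    elasticity: float         # assumed constant price elasticity (negative)
    weeks_left: int           # selling weeks before the season ends


def floor_price(item: MarkdownInput) -> float:
    """Lowest price that keeps (price - cost) / price at or above the floor."""
    return item.unit_cost / (1.0 - item.margin_floor)


def projected_units(item: MarkdownInput, price: float) -> float:
    """Constant-elasticity demand over the remaining weeks, capped at stock."""
    weekly = item.base_weekly_units * (price / item.full_price) ** item.elasticity
    return min(float(item.stock), weekly * item.weeks_left)


def best_markdown(item: MarkdownInput) -> Optional[Tuple[int, float, float]]:
    """Search depths of 0%..90% in 5-point steps; return (depth_pct, price, margin).

    Prices below the margin floor are never candidates, so the floor is
    enforced by construction rather than by asking a model to respect it.
    Returns None when even the full price sits below the floor.
    """
    lowest = floor_price(item)
    best = None
    for depth_pct in range(0, 95, 5):
        price = item.full_price * (100 - depth_pct) / 100
        if price < lowest:
            break  # deeper markdowns only go lower
        margin = (price - item.unit_cost) * projected_units(item, price)
        if best is None or margin > best[2]:
            best = (depth_pct, price, margin)
    return best
```

Calling `best_markdown(MarkdownInput(100.0, 40.0, 0.20, 1000, 50.0, -2.0, 4))` searches only prices at or above 50.0, the floor price for that item. An LLM can then narrate whichever depth comes back, but it has no way to talk the result below the floor.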

"GenAI-powered replenishment." Replenishment is a numeric loop over forecast, lead time and safety stock across tens of thousands of SKU/store pairs. Wrapping it in a chat interface changes the interface, not the engine. If there is no real optimisation underneath, the chat is lipstick on a spreadsheet.
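
For reference, the numeric loop described above can be sketched in a few lines. This is a textbook order-up-to calculation under stated assumptions, not a description of any particular vendor's engine: the `SkuStore` fields, the 7-day review period, the z-value of 1.65 and the case-pack rounding are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class SkuStore:
    sku: str
    store: str
    daily_forecast: float  # forecast units per day
    daily_std: float       # standard deviation of daily demand
    lead_time_days: int    # supplier or DC lead time
    on_hand: int           # units in the store now
    on_order: int          # units already inbound
    case_pack: int         # orders must be whole cases


def safety_stock(row: SkuStore, z: float = 1.65) -> float:
    """Buffer for demand variability over the lead time (z is an assumed service factor)."""
    return z * row.daily_std * math.sqrt(row.lead_time_days)


def order_quantity(row: SkuStore, review_days: int = 7, z: float = 1.65) -> int:
    """Order up to a target covering lead time plus review period, in whole cases."""
    target = row.daily_forecast * (row.lead_time_days + review_days) + safety_stock(row, z)
    shortfall = target - row.on_hand - row.on_order
    if shortfall <= 0:
        return 0
    return math.ceil(shortfall / row.case_pack) * row.case_pack


def replenish(rows: Iterable[SkuStore]) -> Dict[Tuple[str, str], int]:
    """Run the same calculation over every SKU/store pair."""
    return {(r.sku, r.store): order_quantity(r) for r in rows}
```

A chat interface can sit in front of `replenish`, but the quantity a store receives comes from this arithmetic, or from a better version of it. If nothing like it exists underneath, there is nothing for the chat to explain.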

"Autonomous GenAI agent for supply chain." Often this is rules plus an LLM wrapper: the language model generates the explanation while a thin script does the acting. It is the exact pattern we dissect in the honest state of the autonomous supply chain. The generative layer makes it sound autonomous; the decisioning underneath may not exist.

The failure mode is consistent. The demo is fluent, the slides are convincing, and the underlying decision quality is unverified, because the part everyone watched was the language, not the call. Generative fluency is very good at hiding the absence of a decision engine. BI-style dashboards never moved a unit of stock for the same reason: a confident surface is not an executed decision.

The line: generative vs decisioning

There is one clean line that separates the real from the theatrical, and it is worth stating plainly.

Generative AI is for the language around the decision. A decision layer is for the decision itself.

A retail operation needs both, and they are complementary, not competing. The generative layer is the interface and the narration: it lets a human ask a question in English, reads back the reasoning, drafts the communication. The decision layer is the engine: it models the constrained problem, computes the action that respects the rules, and writes it back into the systems that execute it.

The strongest architectures use each for what it is good at. The decision is made by a system built to optimise under constraints — what we call operational AI. The generative layer sits on top, making that decision legible and conversational: explaining why a transfer was proposed, letting a merchant interrogate it, drafting the note to the store. The LLM is the spokesperson, not the brain. The deeper version of why the engine matters more than the narration is in from forecasting to decision: why ML isn't enough.
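
The split between engine and spokesperson can be sketched as two functions with a structured decision passed between them. This is a deliberately small illustration, not Solya's implementation: `TransferDecision`, `decide_transfer`, the four-week cover target and the single donor/receiver rule are hypothetical, and `narrate` is a plain template standing in for an LLM call so the example runs without any external service.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TransferDecision:
    sku: str
    from_store: str
    to_store: str
    units: int
    donor_surplus: int       # units above target cover at the donor
    receiver_shortfall: int  # units below target cover at the receiver


def decide_transfer(
    sku: str,
    stock: Dict[str, int],
    weekly_demand: Dict[str, int],
    cover_weeks: int = 4,
) -> Optional[TransferDecision]:
    """Decision layer: move stock from the most overstocked store to the most understocked."""
    surplus = {s: stock[s] - cover_weeks * weekly_demand[s] for s in stock}
    donor = max(surplus, key=surplus.get)
    receiver = min(surplus, key=surplus.get)
    units = min(surplus[donor], -surplus[receiver])
    if units <= 0:
        return None  # nothing to give, or nobody in need
    return TransferDecision(sku, donor, receiver, units, surplus[donor], -surplus[receiver])


def narrate(decision: TransferDecision) -> str:
    """Stand-in for the generative layer: it words the decision, it never changes it."""
    return (
        f"Proposed transfer of {decision.units} units of SKU {decision.sku} "
        f"from store {decision.from_store} to store {decision.to_store}: "
        f"{decision.from_store} holds {decision.donor_surplus} units above target cover, "
        f"{decision.to_store} is {decision.receiver_shortfall} units short."
    )
```

The design choice worth noticing is the direction of the data flow: numbers go from the engine into the language, never the other way. Swapping the template for a real model changes how the sentence reads, not how many units move.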

Used this way, generative AI makes a decision system dramatically more usable — and adoption is most of the battle. Used the other way, as a substitute for the engine, it produces confident, fluent, unaccountable guesses.

How to tell a real GenAI use case from a demo

Five questions cut through the fluency in any vendor demo.

"Is the deliverable language, or a number that executes?" If the output is copy, a search result or a written answer, generative AI is the right tool. If it is a quantity, a price or an allocation, ask what optimises it underneath.
"What makes the constraints binding?" A markdown that should never breach the floor cannot rely on the model choosing to respect it. Ask where the hard constraint is enforced — in an engine, or in the prompt and a hope.
"Where do the numbers come from?" A sourced answer cites the data it read. A hallucinated one sounds identical and can be wrong in ways you cannot see. For anything operational, demand the source.
"Does anything get written back?" A copilot that tells a planner what to do still leaves the decision and the re-keying to a human. Ask whether the action reaches the ERP, WMS or pricing engine, and which component puts it there.
"What happens on a bad input?" Generative systems fail confidently. Ask to see it answer a question it has no data for — a vendor who won't show that has not productionised the failure mode.
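
Two of these questions, where the numbers come from and what happens on a bad input, translate directly into code. The sketch below is one hedged way to picture a passing answer: the `Answer` type, the `stock_answer` lookup and the `can_write_back` gate are hypothetical names invented for illustration, and the inventory is a plain dictionary rather than a real WMS feed.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Answer:
    value: Optional[int]   # None means "no data", never a guess
    source: Optional[str]  # where the number was read from
    text: str              # what the human sees


def stock_answer(
    sku: str,
    store: str,
    inventory: Dict[Tuple[str, str], int],
    source: str,
) -> Answer:
    """Answer a stock question from data, or say plainly that there is none."""
    key = (sku, store)
    if key not in inventory:
        return Answer(None, None, f"No inventory record for SKU {sku} at store {store}.")
    units = inventory[key]
    return Answer(units, source, f"SKU {sku} at store {store}: {units} units (source: {source}).")


def can_write_back(answer: Answer) -> bool:
    """Only sourced, non-empty answers may feed a downstream system."""
    return answer.value is not None and answer.source is not None
```

Asked about a SKU/store pair that is missing from `inventory`, `stock_answer` returns an explicit "no record" message with `value=None`, and `can_write_back` refuses it. That is the behaviour to ask a vendor to demonstrate live.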

None of these disqualify generative AI. They locate it. The good deployments answer them crisply, because they know exactly which half of the problem the LLM is solving.

Where this leaves you

Generative AI is one of the most useful technologies to reach retail in a decade — for the language and content half of the work. It drafts, it search-matches, it summarises, it converses. Pointed there, it pays for itself quickly and the upside is large.

That is exactly how Solya uses it. The decision is made by the intelligence layer and executed by the orchestration layer, both built to optimise and act under your business rules. The generative layer sits on top as the conversational surface. A merchant can ask, challenge and understand a decision in plain language, while the engine underneath actually computes and commits it. The fluency drives adoption; the engine drives the margin.

The question to sit with is not "are we using generative AI?" — you will be, and you should. It is "are we using it for language, or are we quietly asking it to make decisions it was never built to make?" The first is leverage. The second is theatre with a good script.

Sorting the generative hype from the decisions that move margin?

At Solya, we offer retail data and operations leaders a 30-minute diagnostic. We map, in your own context, where generative AI is the right tool and where the real job is a decision engine wearing a chat interface.

You'll walk away with:

A split of your AI roadmap into genuine generative use cases versus decision problems in disguise
The two or three language-layer wins worth shipping first
A read on where a decision layer — not a bigger prompt — is what actually closes the gap

Kevin Didelot, Co-founder & CTO, Solya
