The simple task your AI can't learn

7 Jul 2026

Your sales model predicts almost to the dollar. It has no idea what happens if you cut the ad budget.

You hired a data science team, and they have just shipped a sales model. It's trained on three decades of clean, complete data: every ad dollar spent, every site visit, every sale closed. Backtests are spectacular, and it predicts next week's sales almost perfectly, with an R² of 0.95.

So you ask it the only question you actually care about: what happens to sales if we cut the ad budget in half?

The model's answer: nothing. Not "a little." Not "it's uncertain." A confident, precise 0.00. Move the budget up or down as far as you like, and your near-perfect model insists sales won't budge.

This isn't a bug. It isn't bad data. It's what the best predictive model in the world will do, every time, on one of the simplest cause-and-effect chains in business. Here's why, and why more data does not help.

A three-link chain

Everyone in your company knows how advertising works. You spend more, more people show up, and more of them buy. Cause, then effect, then effect. The real world is often more complicated, but the aim of this article is to show that even a simple A→B→C chain defeats standard predictive AI/ML when you ask it a what-if question. Here the chain is Ad-Spend→Traffic→Sales.

The chain every operator already understands: ad spend drives traffic drives sales.

Now here's the setup that breaks everything. Your team does the seemingly obvious and definitely standard thing: it feeds the model both ad spend and traffic and asks it to predict sales. Why wouldn't they? More features, better predictions. And it works: the predictions are stellar. But with both Ad-Spend and Traffic as inputs, your near-perfect predictive model insists sales won't budge when you change Ad-Spend.

So the model predicts well - really well. But it also thinks that Ad Spend has no effect on Sales, even when it does.

Remember, you have decades of high-quality data and the most advanced predictive model available.

The model found a shortcut. It's allowed to.

Think about what the model sees. To predict sales, traffic is all it needs: traffic sits right next to sales in the chain. And once the model knows traffic, ad spend tells it nothing new. Every scrap of information spend carries about sales already traveled through traffic and arrived first.

So the model learns the efficient thing: read traffic, ignore ad spend. And judged purely on prediction accuracy - the only thing it was ever asked to do - that's not a mistake. It's the correct answer to the question it was asked.
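The shortcut is easy to reproduce. Below is a minimal sketch, assuming a made-up linear world in which each ad dollar brings 2 visits and each visit brings 0.5 dollars of sales; the coefficients, noise levels and sample size are illustrative assumptions, not figures from any real dataset. An ordinary least-squares fit stands in for "the predictive model".

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical data-generating process (all numbers are assumptions):
# spend -> traffic -> sales, with no direct spend -> sales path.
spend = rng.uniform(1_000, 10_000, n)
traffic = 2.0 * spend + rng.normal(0, 1_500, n)
sales = 0.5 * traffic + rng.normal(0, 600, n)

# Ordinary least squares: sales ~ intercept + spend + traffic
X = np.column_stack([np.ones(n), spend, traffic])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
intercept, b_spend, b_traffic = coef

# In-sample R^2 of the fitted model
pred = X @ coef
r2 = 1 - np.sum((sales - pred) ** 2) / np.sum((sales - sales.mean()) ** 2)

# In this simulated world, one extra ad dollar truly adds 2.0 * 0.5 = 1.0 to sales.
true_total_effect = 2.0 * 0.5

print(f"R^2:                         {r2:.3f}")
print(f"learned spend coefficient:   {b_spend:+.3f}")
print(f"learned traffic coefficient: {b_traffic:+.3f}")
print(f"true effect of spend:        {true_total_effect:+.3f}")
```

Under these assumptions the fit is accurate, the traffic coefficient lands near 0.5, and the spend coefficient lands near zero even though spend truly moves sales one-for-one.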

The problem is you're about to ask it a different question.

QUESTION 1 — ACED

"What will sales be next week?"

A forecast. The model watches the world and guesses what comes next. Traffic is a legitimate clue.

QUESTION 2 — FLUNKED

"What happens if we change the budget?"

An intervention. You're not watching the world anymore; you're reaching in and moving a lever.

Same model, two questions. It can ace the first and flunk the second.

When you drag the budget slider in a "what-if" tool built on this model, you're changing an input the model learned to ignore. Spend moves, traffic stays frozen at whatever value is plugged in, and predicted sales don't twitch. In the real world, cutting the budget would cut traffic, which would cut sales. Inside the model, the middleman never gets the memo.
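The frozen-traffic what-if can be sketched in a few lines. This is an illustration under assumed numbers: the simulated world, the baseline of 8,000 in spend and 16,000 visits, and the `predict` helper are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Same hypothetical world: spend -> traffic -> sales (numbers are assumptions).
spend = rng.uniform(1_000, 10_000, n)
traffic = 2.0 * spend + rng.normal(0, 1_500, n)
sales = 0.5 * traffic + rng.normal(0, 600, n)

# The predictive model: sales ~ intercept + spend + traffic
X = np.column_stack([np.ones(n), spend, traffic])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)

def predict(spend_value, traffic_value):
    """Prediction from the fitted single-equation model."""
    return coef[0] + coef[1] * spend_value + coef[2] * traffic_value

baseline_spend, baseline_traffic = 8_000.0, 16_000.0

# The what-if tool halves spend but leaves the traffic input untouched.
before = predict(baseline_spend, baseline_traffic)
after = predict(baseline_spend / 2, baseline_traffic)
model_change = after - before

# What the simulated world would do: traffic falls with spend, sales follow.
true_change = 0.5 * 2.0 * (baseline_spend / 2 - baseline_spend)

print(f"model's predicted change in sales: {model_change:+.0f}")
print(f"change in the simulated world:     {true_change:+.0f}")
```

In this toy setup the model reports a change close to zero, while the world it was trained on would lose 4,000 in sales.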

The better the model predicts, the more confidently it tells you your levers do nothing.

More data won't save you. Better data won't save you. A bigger model won't save you.

This is the part that surprises even experienced data scientists who are used to prediction but not to reasoning about interventions. It's a question I hear often from data scientists after talks on causal inference: surely, with enough data, the model figures it out?

The answer is a resounding "no". No amount of data will teach a predictive model set up this way the answer to the intervention question.

In fact, more data makes the model more certain that ad spend doesn't matter, because in every single row, traffic already explains sales. A fancier architecture doesn't help either. You can't buy your way out of this with scale, compute, or a better vendor. The model is answering the question it was trained on, flawlessly. It was just never the question you needed answered.
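To see why more rows only harden the wrong answer, here is a rough sketch that refits the same regression at three sample sizes and reports the spend coefficient with its textbook OLS standard error. The simulated world and the helper name `fit_spend_coefficient` are assumptions made for illustration.

```python
import numpy as np

def fit_spend_coefficient(n, seed=0):
    """Fit sales ~ intercept + spend + traffic on n simulated rows.

    Returns the spend coefficient and its OLS standard error.
    The data-generating numbers are hypothetical.
    """
    rng = np.random.default_rng(seed)
    spend = rng.uniform(1_000, 10_000, n)
    traffic = 2.0 * spend + rng.normal(0, 1_500, n)
    sales = 0.5 * traffic + rng.normal(0, 600, n)

    X = np.column_stack([np.ones(n), spend, traffic])
    coef, *_ = np.linalg.lstsq(X, sales, rcond=None)

    # Classical OLS covariance: sigma^2 * (X'X)^-1
    resid = sales - X @ coef
    sigma2 = resid @ resid / (n - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[1], np.sqrt(cov[1, 1])

results = {n: fit_spend_coefficient(n) for n in (100, 10_000, 1_000_000)}
for n, (b, se) in results.items():
    print(f"n={n:>9,}  spend coefficient={b:+.4f}  standard error={se:.4f}")
```

The standard error shrinks as the sample grows, so the estimate tightens around zero: the model becomes more confident, not more correct, about the effect of spend.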

This trap isn't exotic. It's hiding in plain sight in almost any business decision-making problem.

And this is only a three-variable problem. Most real problems are more complex and contain more causal pitfalls, each of which can fool a highly accurate predictive model into giving confidently incorrect answers.

What to do instead

PREDICTIVE MODEL

Learns one shortcut

Spend plus traffic to sales, in a single equation. Accurate, but the chain is gone. Move spend and nothing propagates.

STRUCTURAL CAUSAL MODEL

Learns every link

Spend to traffic and traffic to sales, each learned separately. Move spend, and the effect flows through the chain, just like reality.

Same data. Same three columns. But now, when you drag the budget slider, the model does what the world does: traffic responds, and sales respond to traffic. It predicts about as well as before, and it can finally answer the question you actually had. Forecasts and what-ifs, from one model.
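The link-by-link idea can be illustrated with a deliberately simple sketch: two straight-line fits, one per arrow in the chain, and a what-if function that pushes a chosen spend through both. It assumes the chain structure is known, the links are linear, and nothing else influences both spend and sales. It is not a description of Zebra's actual implementation, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000

# Same hypothetical world: spend -> traffic -> sales (numbers are assumptions).
spend = rng.uniform(1_000, 10_000, n)
traffic = 2.0 * spend + rng.normal(0, 1_500, n)
sales = 0.5 * traffic + rng.normal(0, 600, n)

def fit_line(x, y):
    """Least-squares intercept and slope for y ~ x."""
    slope, intercept = np.polyfit(x, y, 1)
    return intercept, slope

# Learn each link of the chain separately.
t0, t1 = fit_line(spend, traffic)   # link 1: spend -> traffic
s0, s1 = fit_line(traffic, sales)   # link 2: traffic -> sales

def sales_if_spend_set_to(spend_value):
    """Propagate an intervention on spend through both links."""
    traffic_hat = t0 + t1 * spend_value
    return s0 + s1 * traffic_hat

# Halve the budget from 8,000 to 4,000 and let the effect flow downstream.
change = sales_if_spend_set_to(4_000.0) - sales_if_spend_set_to(8_000.0)
print(f"predicted change in sales: {change:+.0f}")
```

In the simulated world the true change is -4,000, and the two-link model recovers roughly that figure from the same three columns that left the single-equation model reporting no change.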

This is what we do at Zebra. We build structural causal models that learn how each of your variables drives the next, so the model traces a change through the chain instead of collapsing it into a single shortcut to the outcome. Your team can test a budget shift, a price change or a policy tweak before betting real money on it.

Your model isn't lying to you. It's answering a question you didn't mean to ask.


Company Details

Zealers Limited
Ltd co no: 12900099
VAT number 358772844

Registered Address

Grenville House
4 Grenville Avenue
Broxbourne
Hertfordshire
EN10 7DH