Your sales model predicts almost to the dollar. It has no idea what happens if you cut the ad budget.
You contracted a data science team, and they just shipped a sales model. It's trained on three decades of clean, complete data: every ad dollar spent, every site visit, every sale closed. Backtests are spectacular, and it predicts next week's sales almost perfectly. An R² of 0.95.
So you ask it the only question you actually care about: what happens to sales if we cut the ad budget in half?
The model's answer: nothing. Not "a little." Not "it's uncertain." A confident, precise 0.00. Move the budget up or down as far as you like, and your near-perfect model insists sales won't budge.
This isn't a bug. It isn't bad data. It's what the best predictive model in the world will do, every time, on one of the simplest cause-and-effect chains in business. Here's why, and why more data does not help.
Everyone in your company knows how advertising works. You spend more, more people show up, and more of them buy. Cause, then effect, then effect. In the real world it's often more complicated, but the aim of this article is to show that even a chain as simple as A→B→C defeats standard predictive AI/ML. In this case the chain is Ad-Spend→Traffic→Sales.

Now here's the setup that breaks everything. Your team does the seemingly obvious and definitely standard thing: it feeds the model both ad spend and traffic and asks it to predict sales. Why wouldn't they? More features, better predictions. And it works: the predictions are stellar. But with both Ad-Spend and Traffic as inputs, your near-perfect model insists sales won't budge when you change Ad-Spend.
So the model predicts well - really well. But it also thinks that Ad Spend has no effect on Sales, even when it does.
Remember: you have decades of high-quality data and the most advanced predictive model.
Think about what the model sees. To predict sales, traffic is all it needs: traffic sits right next to sales in the chain. And once the model knows traffic, ad spend tells it nothing new. Every scrap of information spend carries about sales already traveled through traffic and arrived first.
So the model learns the efficient thing: read traffic, ignore ad spend. And judged purely on prediction accuracy - the only thing it was ever asked to do - that's not a mistake. It's the correct answer to the question it was asked.
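To make this concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the coefficients (50 visits per unit of spend, 0.2 sales per visit), the noise levels, and the helper names `simulate` and `fit_two_features`. By construction, the true total effect of one unit of spend on sales is 50 × 0.2 = 10. The sketch fits an ordinary least-squares model of sales on spend and traffic together and prints the weight it assigns to each.

```python
import random

def simulate(n, seed=0):
    """Generate rows from a made-up chain: spend -> traffic -> sales."""
    rng = random.Random(seed)
    spend, traffic, sales = [], [], []
    for _ in range(n):
        s = rng.uniform(10, 100)          # ad spend, arbitrary units
        t = 50 * s + rng.gauss(0, 300)    # traffic depends on spend
        y = 0.2 * t + rng.gauss(0, 60)    # sales depend on traffic only
        spend.append(s)
        traffic.append(t)
        sales.append(y)
    return spend, traffic, sales

def fit_two_features(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via centered normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

spend, traffic, sales = simulate(5000)
b0, b_spend, b_traffic = fit_two_features(spend, traffic, sales)

# In-sample R^2 of the combined model.
preds = [b0 + b_spend * s + b_traffic * t for s, t in zip(spend, traffic)]
mean_sales = sum(sales) / len(sales)
ss_res = sum((y - p) ** 2 for y, p in zip(sales, preds))
ss_tot = sum((y - mean_sales) ** 2 for y in sales)
r2 = 1 - ss_res / ss_tot

print(f"R^2:               {r2:.3f}")
print(f"weight on traffic: {b_traffic:.3f}")
print(f"weight on spend:   {b_spend:.3f}  (true total effect is 10)")
```

In a run like this, the fit is excellent and the weight on spend lands near zero, even though spend drives every sale in the simulated world.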
The problem is you're about to ask it a different question.
The question it was trained on is a forecast. The model watches the world and guesses what comes next. Traffic is a legitimate clue.
The question you're about to ask is an intervention. You're not watching the world anymore, you're reaching in and moving a lever.
When you drag the budget slider in a "what-if" tool built on this model, you're changing an input the model learned to ignore. Spend moves, traffic stays frozen at whatever value is plugged in, and predicted sales don't twitch. In the real world, cutting the budget would cut traffic, which would cut sales. Inside the model, the middleman never gets the memo.
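The slider problem can be sketched in a few lines. The weights below are illustrative assumptions, standing in for what a spend-plus-traffic model tends to learn (roughly zero on spend, all the signal on traffic), and `world` is a hypothetical noise-free version of the real chain. Neither comes from an actual fitted model.

```python
# Illustrative weights only: the kind a spend+traffic model tends to learn.
W_SPEND, W_TRAFFIC = 0.0, 0.2

def model_predict(spend, traffic):
    """The what-if tool: a single equation over both inputs."""
    return W_SPEND * spend + W_TRAFFIC * traffic

def world(spend):
    """Hypothetical real chain: spend moves traffic, traffic moves sales."""
    traffic = 50 * spend
    sales = 0.2 * traffic
    return traffic, sales

current_spend = 80.0
current_traffic, current_sales = world(current_spend)

# Drag the slider: spend is halved, the traffic field stays as it was.
tool_before = model_predict(current_spend, current_traffic)
tool_after = model_predict(current_spend / 2, current_traffic)

# What the hypothetical world does when spend is actually halved.
_, real_after = world(current_spend / 2)

print(f"tool says sales change by: {tool_after - tool_before:.2f}")
print(f"world changes sales by:    {real_after - current_sales:.2f}")
```

The tool reports no change because the only input it listens to, traffic, was never updated. In the sketched world, halving spend halves sales.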
The better the model predicts, the more confidently it tells you your levers do nothing.
This is the part that surprises even well-trained data scientists who are used to prediction but not trained in the scientific method. It's a question I get a lot from data scientists after talks on causal inference: surely, with enough data, the model figures it out?
The answer is a resounding "No". No amount of data will teach a purely predictive model the answer to a question it was never trained to answer.
In fact, more data makes the model more certain that ad spend doesn't matter, because in every single row, traffic already explains sales. A fancier architecture doesn't help either. You can't buy your way out of this with scale, compute, or a better vendor. The model is answering the question it was trained on, flawlessly. It was just never the question you needed answered.
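A rough way to see this is to refit the same spend-plus-traffic regression on larger and larger simulated datasets. The chain coefficients, noise levels, sample sizes, and the helper name `spend_coefficient` below are all assumptions chosen for illustration. For each sample size the sketch repeats the fit on 20 independent datasets and reports how much the weight on spend varies.

```python
import random
import statistics

def spend_coefficient(n, seed):
    """Simulate n rows of a made-up chain and return the least-squares
    weight on spend when sales are regressed on spend and traffic together."""
    rng = random.Random(seed)
    spend = [rng.uniform(10, 100) for _ in range(n)]
    traffic = [50 * s + rng.gauss(0, 300) for s in spend]
    sales = [0.2 * t + rng.gauss(0, 60) for t in traffic]
    m1 = statistics.fmean(spend)
    m2 = statistics.fmean(traffic)
    my = statistics.fmean(sales)
    s11 = sum((a - m1) ** 2 for a in spend)
    s22 = sum((b - m2) ** 2 for b in traffic)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(spend, traffic))
    s1y = sum((a - m1) * (c - my) for a, c in zip(spend, sales))
    s2y = sum((b - m2) * (c - my) for b, c in zip(traffic, sales))
    return (s22 * s1y - s12 * s2y) / (s11 * s22 - s12 ** 2)

center, spread = {}, {}
for n in (100, 1000, 10000):
    estimates = [spend_coefficient(n, seed) for seed in range(20)]
    center[n] = statistics.fmean(estimates)    # average weight on spend
    spread[n] = statistics.pstdev(estimates)   # how much it varies run to run
    print(f"n={n:>6}: spend weight {center[n]:+.3f} +/- {spread[n]:.3f}")
```

As the row count grows, the estimates should cluster more tightly around zero, not around the true total effect of 10 built into the simulation. More data buys more confidence in the same wrong answer to the intervention question.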
This trap isn't exotic. It hides in plain sight in everyday business decision-making.
And this is only a simple three-variable problem. Most real problems are more complex and have more causal pitfalls that fool highly accurate predictive models into giving confidently incorrect answers.
The standard setup maps spend plus traffic to sales in a single equation. It's accurate, but the chain is gone. Move spend and nothing propagates.
The alternative learns spend to traffic and traffic to sales, each separately. Move spend, and the effect flows through the chain, just like reality.
Same data. Same three columns. But now, when you drag the budget slider, the model does what the world does: traffic responds, and sales respond to traffic. It predicts about as well as before, and it can finally answer the question you actually had. Forecasts and what-ifs, from one model.
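As a minimal sketch of the chained idea, the code below fits two simple regressions on simulated data, one for spend to traffic and one for traffic to sales, and then pushes a budget change through both. This is an illustration under assumed coefficients and noise levels, with hypothetical helper names `fit_line` and `predict_sales_from_spend`. It is not a description of how any particular product builds its models.

```python
import random

def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

# Simulated history from a made-up chain: spend -> traffic -> sales.
rng = random.Random(0)
spend = [rng.uniform(10, 100) for _ in range(5000)]
traffic = [50 * s + rng.gauss(0, 300) for s in spend]
sales = [0.2 * t + rng.gauss(0, 60) for t in traffic]

# Learn each link of the chain separately.
t_slope, t_icpt = fit_line(spend, traffic)   # spend -> traffic
s_slope, s_icpt = fit_line(traffic, sales)   # traffic -> sales

def predict_sales_from_spend(s):
    """Propagate a spend value through both learned links."""
    predicted_traffic = t_slope * s + t_icpt
    return s_slope * predicted_traffic + s_icpt

before = predict_sales_from_spend(80.0)
after = predict_sales_from_spend(40.0)
print(f"predicted sales at spend 80: {before:.1f}")
print(f"predicted sales at spend 40: {after:.1f}")
print(f"predicted drop from halving the budget: {before - after:.1f}")
```

In the simulated world, halving spend from 80 to 40 lowers expected sales by 40 × 50 × 0.2 = 400, and the chained sketch recovers a drop close to that because the change in spend is allowed to move traffic first.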
This is what we do at Zebra. We build structural causal models that learn how each of your variables drives the next, so the model traces a change through the chain instead of collapsing it into a single shortcut to the outcome. Your team can test a budget shift, a price change or a policy tweak before betting real money on it.
Your model isn't lying to you. It's answering a question you didn't mean to ask.