Observation

There is a moment in most strategy offsite workshops that nobody names but everyone recognises.

The leadership team has spent two days working through the company's Balanced Scorecard. The four perspectives are filled in. The arrows are drawn between learning and processes, between processes and customers, between customers and financial results. The strategy map is complete, coherent, and visually satisfying.

Someone takes a photograph.

And then the team goes home, and the map goes on the wall, and the measures go into the dashboard, and the company proceeds — measuring its learning and growth metrics, its internal process scores, its customer satisfaction numbers, its quarterly financials — as if the arrows on that map were laws of nature.

As if the chain had been proven.

It hasn't.

Exploration

The Balanced Scorecard is built on a specific theory of how organisations create value. The theory goes like this: invest in employee capability and organisational learning, and you will improve your internal processes. Improve your internal processes, and you will create better outcomes for customers. Create better outcomes for customers, and you will generate superior financial results.

Four perspectives. Three arrows. One causal chain.

The theory is elegant. It is also an assumption.

When Kaplan and Norton introduced the Balanced Scorecard in 1992, they were solving a real problem: organisations were drowning in financial data and ignoring everything else. The past was overrepresented in management information. The future was invisible. The scorecard was designed to make the invisible visible — to surface the leading indicators that would predict financial outcomes before they arrived.

What they did not do was prove the chain was real. They proposed it. Illustrated it with case studies. Built a consulting practice around it. But the causal relationships embedded in every strategy map — the arrows that say learning causes processes, causes customer outcomes, causes financial results — were presented as self-evident logic, not tested hypotheses.

The relationships between BSC perspectives are logical, not causal. They make sense as an argument. They have not been demonstrated as a mechanism.

Danish management scholar Hanne Nørreklit noticed this in 2000. Her distinction is precise and worth holding: there is a difference between "this follows from that" and "this causes that." In mathematics, the first is all you need. In organisations — complex, human, contextually embedded — the second requires evidence.

The empirical researchers who followed Nørreklit into the data found something more complicated still. Some studies confirmed the chain. Others found it ran backwards. In one Southeast Asian service organisation, Granger causality testing revealed that revenue predicted customer satisfaction, not the other way around: past financial results improved forecasts of satisfaction, while past satisfaction added nothing to forecasts of revenue. The organisation's financial strength was what enabled it to invest in customer experience. The arrow pointed the wrong direction.

For years, that organisation had been managed as if improving customer satisfaction would drive financial results. The data said the reverse was happening.
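
The test itself is worth seeing, because it is less exotic than it sounds. Granger causality asks a narrow question: do past values of one series improve forecasts of another, beyond what that series' own past already explains? Here is a minimal sketch in Python with statsmodels, run on deliberately synthetic data in which revenue growth leads satisfaction by two quarters; the series, the lag, and the numbers are invented for illustration, not taken from the study above.

```python
# Sketch: checking which way the arrow points between two quarterly series.
# The data are synthetic, constructed so that revenue growth leads customer
# satisfaction by two quarters (the opposite of the strategy map's arrow).
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
n = 80  # twenty years of quarterly observations

rev_growth = rng.normal(0.02, 0.01, n)   # quarterly revenue growth
sat_change = np.zeros(n)                 # quarterly change in satisfaction
sat_change[2:] = 0.8 * rev_growth[:-2] + rng.normal(0, 0.005, n - 2)

# Column convention: the test asks whether the second column helps
# forecast the first. Expected here: revenue leads, satisfaction does not.
grangercausalitytests(np.column_stack([sat_change, rev_growth]), maxlag=4)
grangercausalitytests(np.column_stack([rev_growth, sat_change]), maxlag=4)
```

What the test establishes is predictive precedence, not mechanism: even a clean result says which series leads, not why.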

The one finding that is not ambiguous

Thirty years and over a thousand published studies later, the evidence on whether BSC-using organisations outperform their peers is split roughly down the middle: positive findings and negative findings in almost equal number.

There is one finding, however, that stands apart.

Ittner and Larcker, in their landmark field research across sixty-plus companies, found that only 23% of organisations had built and verified the causal model underlying their performance measurement system. The other 77% had built a measurement system. They had not built a model.

+2.95%: return on assets advantage for organisations that built and verified causal models (Ittner & Larcker, field research across 60+ companies)
+5.14%: return on equity advantage for the same group

The organisations that did verify the model had, on average, a return on assets 2.95 percentage points higher and a return on equity 5.14 percentage points higher than those that did not. That gap is not explained by the scorecard. It is explained by the willingness to say: we believe this causes that, and we are going to test whether it does.

Most organisations never say that. They draw the arrows and move on.

Why the arrows don't get tested

The strategy map gets built in a workshop. Senior leaders — people who have spent careers in the organisation, who know it deeply, who carry strong intuitions about what drives performance — sit together and articulate what they believe. The map is a record of those beliefs. The arrows represent the causal logic the leadership team finds most compelling.

And then those beliefs get encoded as measurement architecture.

The metrics are chosen to track the assumed causal path. The dashboard reports progress along that path. The annual review checks whether performance against each perspective has improved. Nobody goes back to ask whether the path is real.

Partly because testing the causal chain is genuinely difficult. You need longitudinal data — years of it, not quarters — because the time lags between learning investment and financial outcomes are long and variable. You need to isolate the causal contribution of each perspective from the noise of everything else happening in and around the organisation.
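
To make the difficulty concrete: even the simplest version of the exercise, one arrow, one financial outcome, a handful of lags, looks something like the sketch below. The column names, the control variables, and the data file are hypothetical placeholders, not a prescription.

```python
# Sketch: one arrow, estimated. Does customer satisfaction one to four
# quarters ago carry information about return on assets today, once an
# obvious confounder and a time trend are controlled for?
# All column names and the data file are hypothetical placeholders.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("quarterly_scorecard.csv")  # hypothetical: one row per quarter

for lag in range(1, 5):
    df[f"sat_lag{lag}"] = df["satisfaction"].shift(lag)

model = smf.ols(
    "roa ~ sat_lag1 + sat_lag2 + sat_lag3 + sat_lag4"
    " + marketing_spend + quarter_index",
    data=df.dropna(),
).fit()
print(model.summary())
```

Four lags of one measure already consume a year of quarterly history before estimation can begin, and a full strategy map has several arrows, each with its own unknown lag.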

But the hardest requirement is the one nobody mentions: the intellectual honesty to accept findings that contradict what the leadership team believed when they drew the arrows.

Discovering that the arrows are wrong is discovering that the understanding is wrong. That the decisions made on the basis of that understanding may have been wrong. That the measures tracked for years may have been tracking the right things for the wrong reasons, or the wrong things entirely.

Most organisations find it easier not to check.

And so the map stays on the wall. The metrics keep moving. The strategy review keeps reviewing. And the causal theory embedded in the original workshop — the one that was never a proven hypothesis, only a plausible argument — continues to govern the organisation's choices, unchallenged.

An organisation running a Balanced Scorecard without a tested causal model is not managing strategically. It is managing by convention. The four-perspective structure is real. The metrics are real. The review cadence is real. But the animating logic — the claim that these things cause those things, which cause those other things, which cause the financial outcomes we want — is, in most organisations, an article of faith.

The scorecard gave the faith a form. It did not make the faith true.

What would it mean to treat strategy differently? To hold the arrows on the map not as architecture but as hypotheses — provisional, testable, subject to revision when the evidence pushes back?

It would mean something uncomfortable: the strategy you built might be wrong. The capabilities you invested in might not connect to the customer outcomes you assumed. The customer outcomes you have been tracking might not precede the financial results you care about. The causal theory governing your organisation's choices might need to change.

It would also mean something valuable: you would actually know. Not believe. Know.
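
What would that look like in practice? One unglamorous answer: give every arrow an explicit evidential status, so that "untested" is visible rather than implicit. A toy sketch follows; the structure, the statuses, and the field names are illustrative, not a convention from the scorecard literature.

```python
# Sketch: a strategy map whose arrows carry explicit evidential status.
# The structure, statuses, and field names are illustrative only.
from dataclasses import dataclass, field
from enum import Enum

class Status(Enum):
    UNTESTED = "untested"          # drawn in a workshop, never checked
    SUPPORTED = "supported"        # evidence consistent with the arrow
    CONTRADICTED = "contradicted"  # evidence points the other way

@dataclass
class Arrow:
    cause: str
    effect: str
    status: Status = Status.UNTESTED
    evidence: list[str] = field(default_factory=list)  # studies, tests, data

strategy_map = [
    Arrow("employee capability", "process quality"),
    Arrow("process quality", "customer satisfaction"),
    Arrow("customer satisfaction", "financial results"),
]

# The honest default state of most scorecards:
untested = [a for a in strategy_map if a.status is Status.UNTESTED]
print(f"{len(untested)} of {len(strategy_map)} arrows are untested")
```

The point of the structure is its default: every arrow starts as an article of faith, and stays one until someone does the work.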

The organisations that escape the measurement trap are not the ones with the best dashboards. They are the ones willing to hold their own strategy up to the light and ask, with genuine curiosity: is this actually how it works?

That question is harder than drawing an arrow. It is also the only question that closes the gap between strategy and understanding.

What are the arrows on your map?
And when did you last check whether they were real?