Data First, AI Second

A few years back I watched a manufacturing company spend more than $1.3 million on an AI initiative while their data sat in twelve different systems that didn’t talk to each other. The pitch decks were beautiful. The data underneath was a mess, fragmented across legacy systems, half of it stale, none of it trustworthy. They wanted the AI transformation everyone was talking about, and they wanted to skip the part where you fix the foundation it has to stand on.

It didn’t work, and it was never going to. You can’t build a reliable model on data you can’t reach, can’t trust, and can’t reconcile across systems.

This is the least exciting thing I say to executives, and it’s the most important. Before the AI, you need clean, accessible data and some form of governance that keeps it clean. Both are boring. Both are the difference between a project that returns something and one that turns into an expensive science experiment.

It’s not a niche problem either. McKinsey, looking specifically at manufacturing, found that poor data quality is the consistent roadblock holding back the highest-value AI use cases. Legacy systems, weak governance, and data scattered across teams leave companies sitting on silos no model can use. The factory floor makes it vivid: missing events, sensor errors, inconsistent labeling. Feed that to a model and you get confident answers built on garbage.

Governance is the word that makes people’s eyes glaze over, so let me be plain about what I mean. It’s deciding who is accountable for whether a number is right. It’s knowing where a piece of data came from and whether you’re allowed to use it the way you’re about to. It’s running quality checks continuously instead of discovering the problem after the model already shipped a bad decision. None of that is glamorous. All of it is what separates data you can build on from data that only looks like a foundation.
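If it helps to see how small the starting point can be, here is a rough sketch in Python of the three ideas in that paragraph: a named owner, a recorded source with permitted uses, and a quality check that runs before a model ever sees the data. Everything in it is an assumption made up for illustration (the Dataset fields, the check_readings and may_use helpers, the labels and the temperature range); it is not any particular tool's API or the setup from the story above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical governance record for one data feed."""
    name: str
    owner: str               # who is accountable for whether a number is right
    source: str              # where the data came from
    allowed_uses: frozenset  # what you are allowed to do with it


def may_use(dataset, purpose):
    """True if this purpose is on the dataset's list of permitted uses."""
    return purpose in dataset.allowed_uses


def check_readings(rows, valid_labels, low, high):
    """Return (row_index, problem) pairs for basic quality failures."""
    problems = []
    for i, row in enumerate(rows):
        value = row.get("value")
        if value is None:
            problems.append((i, "missing value"))
        elif not (low <= value <= high):
            problems.append((i, "out of range"))
        if row.get("label") not in valid_labels:
            problems.append((i, "unknown label"))
    return problems


# Illustrative data only: one good row, one missing value,
# one sensor error with an inconsistently typed label.
line_temps = Dataset(
    name="line_3_temperature",
    owner="plant-data-team",
    source="legacy historian export",
    allowed_uses=frozenset({"maintenance_forecast"}),
)
rows = [
    {"value": 20.0, "label": "ok"},
    {"value": None, "label": "ok"},
    {"value": 900.0, "label": "OK "},
]

if may_use(line_temps, "maintenance_forecast"):
    for index, problem in check_readings(rows, {"ok", "fault"}, 0.0, 150.0):
        print(f"{line_temps.name} row {index}: {problem} (owner: {line_temps.owner})")
```

The point is not the code. It is that every failed check arrives with a name attached, so someone is on the hook for fixing it before the model ships a bad decision instead of after.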

So before chasing the next AI breakthrough, I’d ask a few unglamorous questions. Is your data clean, accessible, and actually integrated, or does it live in twelve systems like that manufacturer’s did? Do you have a governance setup that keeps it usable without locking it away? Can your teams keep the quality up, or does it decay the moment nobody is looking?

The unsexy work of getting your data in order pays the highest dividends of anything you’ll do with AI. The flashy tools only matter once there’s something solid underneath them. Get the foundation right and the AI part gets a lot easier. Skip it and no amount of model is going to save you.
