Can we talk about "data"?

"Information", "files", "databases", "documents", "work": whatever you call it, "data" is the catch-all term. It's overused, oversimplified and largely not well understood.
Most of us think and talk about it like it’s a neat, tidy thing we’ve got boxed up somewhere... it's not!
Over the past decade or so there’s been a real push for proactive data strategies. Not just to tick compliance boxes, but to defend against cyber threats, improve governance, and give businesses a clearer picture of what they’ve got and where it lives. It wasn’t perfect, but in the era of "big data" and "business information dashboards" it felt like we were moving in the right direction.
Well… we were.
The ‘Old World’ of Data
For most of the last 10 years or so, data has lived in the usual mix:
- File shares, tucked away in business repositories, whether that's on-premises or in Google / 365 / SharePoint / Dropbox.
- E-mails / comms in a combination of 365 / Google / Slack / Teams / Skype :) etc.
- SaaS apps, each holding their own slice of the pie.
It was fragmented, yes, but the picture was mostly understood. Businesses knew, at least at a functional level, where their data was, who owned it, and (roughly) how it was being handled. Governance and strategy were catching up. Not the glamorous or sexy side of technology, but important progress.
Then AI Walked In
Strategic AI projects recognise that without clean, well-governed data, it’s garbage in, garbage out. With the right architecture, security, and processes, AI can be a genuine competitive advantage.
The trouble is, that’s the strategic use.
The tactical reality (GenAI tools, shadow AI, rushed or poorly planned implementations) is that these tools are forced to work with what’s already there: fragmented, segregated, duplicated data. And when that happens, we start to lose governance, control and ownership, and visibility.
Getting to the bottom of what's happening with our data in these scenarios would require an advanced degree in computer science, and that's after you'd found the relevant information in the Ts & Cs using your law degree.
The Transparency & Consensus Gap
If we had a global consensus on data regulation this wouldn't be nearly as much of an issue, but countries are moving in different directions, with different priorities, and little sign of alignment.
In the absence of that consensus, transparency around data handling and flexibility in deployment are key. Some providers are getting this right, offering clear explanations of where data resides, processing models you can actually understand, and options to regionalise hosting. But they’re the minority. Most are still opaque at best.
Shifting the Onus
Vendors should be making this simple:
- Where is your data stored?
- Where is it processed?
- Who can access it?
- Here are location options that might suit you better
Let's be realistic though: it's not in their interests. It's complex, potentially damaging to the bottom line, and can limit their ability to benefit from aggregating that data.
It's up to us as consumers to shift from being passive subscribers to asking awkward questions, not only as responsible custodians of our data but to make sure we can benefit strategically from what's emerging in the new AI world.
The Radical Option – Back to the Future
One way to get there is for the ownership and control of the data never to leave you in the first place! It is your IP after all!
Instead of sending your data into someone else’s system, services would integrate into your environment. You’d know exactly where the data lived, how it was governed, and what was being done with it. You’d have a single source of truth, the dream!
The benefits are obvious, but there are very significant challenges which in practice make this incredibly difficult to realise: cost, data compatibility / interoperability, and expertise, to name a few.
The thing is, this should sound familiar. It's the world we left behind with the advent of SaaS: the "software-only" model we all used to operate under and have spent 15-20 years moving away from.
Food for Thought
Making the most of "AI" is going to be vital for all of us, but we're right at the start of the journey. There are plenty of gains to be had now, but if we're not careful about the data side of all this, not only will those gains fail to materialise but, worse, we could be steered in the wrong direction by "AI", irrevocably losing control of our data and exposing it in the process.
We really do need to talk about data.
Thoughts? Let me know.