MYOD Dataset: Building a DAM

Kshira Saagar

Kshira Saagar (the K is silent as in a Knight) is currently the Group Director of Data Science for Global Fashion Group (parent company of The Iconic) and has spent 19.7 per cent of his lifetime helping key decision makers and CxOs of Fortune 100 companies make smarter decisions using data. He strongly believes that every organisation can become truly data-driven, irrespective of its size, domain or systems.

In my first article in this MYOD [Make Your Organisation Data-Driven] series, I articulated a one-line approach to successfully injecting data into your organisation’s DNA: Using a Dataset -> Skillset -> Mindset framework. This will take your people and processes on a journey to data actualisation.  

The first and most critical stage of the framework is the Dataset stage. The term Dataset used here is a synecdoche or figure of speech standing for all aspects and processes for making data available, reliable and credible.  

And it is Data Availability Maps (DAMs) that will help you better understand the lay of the data land and resolve information conflicts to supercharge data-driven decisions.

Why Dataset is the first stage  

Have you ever been in a meeting where the simplest of questions seem to be the toughest to answer? A good one is, “What’s the number of new customers we have acquired in the last three months?”  

If you ask the tech team that manages the shop/point-of-sale database, you’d get one number, say 1000. When you ask the marketing team using a different set of tools, you’d get a slightly different number, say 980. Finally, when you ask the team responsible for customer experience, you’d get 920. Which one is correct? And how much time have you spent debating these numbers in crucial meetings?

This ‘which-metric-to-trust’ debate is a key component of all big and small meetings alike and is better known as data confusion. This confusion happens in organisations of all sizes, due to many diverse systems capturing the same information in many different ways. It also leads to the famous “guess we can’t ever know the truth and can’t trust the data” resignation, which is the biggest and most dangerous bottleneck to data-driven decision making in an organisation.  

A second big hurdle is GIGO (Garbage in, garbage out). This means that if the data is unreliable, then any super-smart insight or algorithm built on top of it will be unreliable by extension. Even the most expensive artificial intelligence (AI) tools or software on the market cannot compensate: if the internal data landscape isn’t properly mapped out and governed, the whole process is rendered null and void.

GIGO, along with data confusion, makes it extremely important to first understand the data landscape before going any further on the data journey.

Exploring the data seas  

Anyone who has ever used Marco and Polo in Age of Empires 2 will have seen the fog being lifted and the entire lay of the land showing up. For those who haven’t, imagine the old explorers mapping out our oceans and producing the first atlas to get a sense of the world. Either way, these actions shine a light on the unknown unknowns and reveal a reality that was previously unseen.

The same exercise first needs to be done on the wider data and system landscape to get a sense of the data reality. Enter a quick-win solution: Data Availability Maps (DAMs).

The good news is that DAMs don’t need expensive tools or software. Just like any other sensible exercise, a DAM starts at a very high level and iteratively dives deeper into the finer aspects of each data source.

While there are a lot of really good tools to do the finer details, no ‘one tool’ can claim to do a super high-level overview of all your systems. The good fortune to do that still lies with the organisation.  

To build a DAM, all you need is a digital spreadsheet that can address the following questions:  

  1. What is the ‘name’ of the data source?
  2. Where does the source store this data in the end? (Hint: Cloud is not an answer)
  3. What aspects and features does it track? Is it manual entries or automatic?
  4. If the accuracy could be benchmarked, how reliable are these tracked metrics?
  5. What business questions can these metrics and data sources answer?
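
As a rough illustration of how the five questions above could map to spreadsheet columns, here is a minimal Python sketch that writes a DAM as CSV text, which any spreadsheet tool can open. The column names, the reliability labels and every value in the sample rows are assumptions made for illustration only; the two source names are borrowed from later in this article.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class DamEntry:
    """One row of a Data Availability Map (column names are illustrative)."""
    source_name: str         # Q1: the agreed name of the data source
    storage_location: str    # Q2: where the data is stored in the end
    aspects_tracked: str     # Q3: aspects and features the source tracks
    capture_method: str      # Q3: "manual" or "automatic"
    reliability: str         # Q4: e.g. "high", "medium", "low", "unknown"
    questions_answered: str  # Q5: business questions the source can answer


def dam_to_csv(entries):
    """Return the DAM as CSV text with one header row and one row per source."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(DamEntry)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


# Hypothetical sample rows; all values are made up for illustration.
sample_dam = [
    DamEntry("Shop Transaction Database", "on-premises SQL server",
             "customer details; orders", "automatic", "high",
             "How many new customers did we acquire?"),
    DamEntry("Marketing DB #2", "vendor-hosted database",
             "email subscriptions", "automatic", "high",
             "How many customers opted in to emails?"),
]

csv_text = dam_to_csv(sample_dam)
print(csv_text)
```

A first pass would fill in only the first two columns and leave the rest as "unknown" until later iterations.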

The first pass should answer the first two questions: what the data source is ‘called’ and where it stores its data. Apart from noting what the data source does, it is imperative to also record its pet name, like ‘Alpha’ or, typically, the name of an animal or a mythical hero. So much confusion and so many arguments at cross-purposes would be resolved if we just called each data source by one agreed name first.

First draft of a DAM  

A fully-formed first draft of the DAM should look something like this:

 

 

 

A few things become apparent from this exercise:  

  • Which aspects are tracked more than once across various systems. It’s good to know, for instance, if customer details are already tracked in three different systems
  • Which system is the most reliable for which kind of metric. From the table, we can see customer details are more reliable in the ‘Shop Transaction Database’, whereas email subscriptions are more reliable in ‘Marketing DB #2’
  • Which aspects are trackable but NOT tracked fully, or not tracked at all, in these systems. These are the white spaces in the systems that can be exploited.
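
The three observations above can also be derived mechanically once the DAM is in tabular form. The following is a minimal Python sketch, not a prescribed method: the 1-to-5 reliability scores, the source ‘Marketing DB #1’, the aspect ‘product returns’ and the function name are all hypothetical, chosen only so that the output mirrors the examples in the list.

```python
from collections import defaultdict

# Hypothetical DAM rows: (source name, aspect tracked, reliability score 1-5).
dam_rows = [
    ("Shop Transaction Database", "customer details", 5),
    ("Marketing DB #1", "customer details", 3),
    ("Marketing DB #2", "customer details", 2),
    ("Marketing DB #1", "email subscriptions", 3),
    ("Marketing DB #2", "email subscriptions", 5),
]

# Hypothetical list of aspects the business wants to have tracked.
wanted_aspects = {"customer details", "email subscriptions", "product returns"}


def summarise_dam(rows, wanted):
    """Return (duplicated aspects, most reliable source per aspect, gaps)."""
    by_aspect = defaultdict(list)
    for source, aspect, score in rows:
        by_aspect[aspect].append((source, score))

    # Aspects tracked in more than one system, with the systems involved.
    duplicated = {
        aspect: sorted(source for source, _ in pairs)
        for aspect, pairs in by_aspect.items()
        if len(pairs) > 1
    }
    # The highest-scoring system for each aspect.
    most_reliable = {
        aspect: max(pairs, key=lambda pair: pair[1])[0]
        for aspect, pairs in by_aspect.items()
    }
    # Wanted aspects that no system tracks at all.
    gaps = sorted(wanted - set(by_aspect))
    return duplicated, most_reliable, gaps


duplicated, most_reliable, gaps = summarise_dam(dam_rows, wanted_aspects)
print("Tracked in several systems:", duplicated)
print("Most reliable source:", most_reliable)
print("Not tracked anywhere:", gaps)
```

With real data, the scores would come from whatever benchmark the organisation agrees on for question 4; the sketch only shows that the comparison itself is simple once every source sits in one table.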

Benefits of DAM

Aspects tracked in a DAM are still high-level features and not exact metrics in themselves. Think of these aspects as fundamental building blocks that, when put together, provide a complete picture of what’s going on with the organisation.

Apart from making the available data sources visible, this exercise also highlights the glaring gaps in the data landscape: the questions the business wants answered but the data cannot yet sufficiently answer.

The DAM exercise doesn’t have to be a technical team exercise. Anybody and everybody who cares to make data-driven decisions can build one on their own and merge their findings with those of another interested data soul. The technical aspect of where the data is stored becomes crucial only in conversations with vendors and third parties, where it greatly minimises time spent going back and forth with IT/tech teams and gives everyone involved a quick picture of what is doable.

A fully formed DAM  

The DAM above is only a first draft that can be done by anyone in the organisation, irrespective of technical skills. However, to put this into action, more specific tools are needed. Known in the market as ‘Data Catalog’ tools, they come in both open-source and commercial versions. These Data Catalog tools can take DAMs to a different level by providing visibility on what is available and how accurate a specific metric is within a feature.

For example, a Data Catalog can say which system captures the age and gender of a customer most accurately and comprehensively, while at the same time ensuring the right privacy and protection of this sensitive data is in place.  

A good first step towards having a better hold on your data governance and privacy framework is to ensure a quick DAM is built, as this helps answer business questions and also can morph into a sophisticated data catalog when the time is right.  

A DAM is still just the starting point for sorting out the data confusion and GIGO issues. Once a DAM is built, the next big issues with your dataset are making the right data available to the right people, planning how to fill in data gaps and, more importantly, finding a smarter way to ask tougher questions of your data. We’ll cover these three aspects in upcoming articles in the MYOD series.

Until then, for those who’d like to create a DAM of their own, here’s a template to get you started on your MYOD journey.
