There is nothing peaceful about garbage. It stinks, it can be poisonous, and worst of all…it grows if ignored. If we hate garbage so much, why do we put up with garbage data?
Let me explain.
There are lots of reasons why data turns into garbage, but the most common is that it was either garbage to begin with or it rotted into garbage over time. We never like to think we make decisions based on garbage, but the adage ‘garbage in, garbage out’ really does apply.
Sometimes data lives in different places. This is a natural phenomenon, as every company has its own CRM, ERPs, and other applications to manage. Even when your systems are configured correctly (and that is a BIG if), garbage can collect when your data is separated into silos.
This one really gets me. So many teams get their data and processes right, but fail when it comes to interpreting and disseminating them. Key Performance Indicators (KPIs) aren’t simply a matter of convenience; they are business knowledge integrated with business operations. Only when you nail down your basic definitions (like ‘what is revenue’) do other things make sense (like ‘what is profit’).
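To make that concrete, here is a minimal sketch of the idea that derived metrics should build on one shared base definition rather than reinventing it. All field names (`unit_price`, `quantity`, `status`) and the revenue rule itself are illustrative assumptions, not anyone’s official KPI definitions:

```python
# Hypothetical sketch: shared KPI definitions, so 'profit' is built on the
# one agreed-upon 'revenue' definition instead of a rival calculation.
# Field names and the "completed orders only" rule are assumptions.

def revenue(orders):
    """Revenue = unit_price * quantity, summed over completed orders only."""
    return sum(o["unit_price"] * o["quantity"]
               for o in orders if o["status"] == "completed")

def profit(orders, total_costs):
    """Profit reuses the shared revenue definition, then subtracts costs."""
    return revenue(orders) - total_costs

orders = [
    {"unit_price": 10.0, "quantity": 3, "status": "completed"},
    {"unit_price": 25.0, "quantity": 1, "status": "cancelled"},
]
print(revenue(orders))       # 30.0 (cancelled order excluded)
print(profit(orders, 12.5))  # 17.5
```

The point is less the arithmetic than the dependency: when ‘profit’ calls the same `revenue` function everyone else uses, two dashboards can’t quietly disagree about what revenue means.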
Having a data analytics solution helps with disseminating information as well. Interactive tools like Power BI, Tableau, and Looker are staples for large organizations, but smaller or more technically inclined ones might be better served by one of the many other solutions out there.
If you can relate to these challenges, you’re not alone. Data doesn’t clean itself, you know! Below are some helpful ideas on what to do when your data is garbage.
Data pipelines are essential for moving data from one application to another, and Extract and Load (EL) procedures help manage that movement. How can you measure the cleanliness of your data pipelines? Check the logs of the EL tool your team uses (Informatica, Fivetran, AWS Glue, etc.), or get one if you don’t have it. Keeping these logs might just save your bacon!
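As a rough illustration of what “check the logs” can look like in practice, here is a small sketch that scans pipeline run logs for failed loads. The pipe-delimited log format here is invented for the example; real EL tools (Fivetran, Informatica, AWS Glue) each expose their own log schema and APIs:

```python
# Hypothetical sketch: scan EL run logs for failed loads.
# The log line format (timestamp | pipeline | status | rows) is an
# assumption for illustration, not any vendor's actual format.

LOG_LINES = [
    "2024-05-01T02:00:00 | crm_contacts | SUCCESS | 1250",
    "2024-05-01T02:05:00 | erp_invoices | FAILED | 0",
    "2024-05-01T02:10:00 | web_events | SUCCESS | 98000",
]

def failed_pipelines(lines):
    """Return the names of pipelines whose most recent run failed."""
    failures = []
    for line in lines:
        timestamp, pipeline, status, rows = [p.strip() for p in line.split("|")]
        if status == "FAILED":
            failures.append(pipeline)
    return failures

print(failed_pipelines(LOG_LINES))  # ['erp_invoices']
```

Even a crude check like this turns the logs from something you only read after a disaster into a daily cleanliness metric.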
Data lakes are commonly referred to as ‘the source of truth’, but truth can be a subjective thing. A team really needs the data to be ‘the source’ with the correct ‘knowledge’ applied to it. Data dictionaries are a perfect example of curated, indexed knowledge that can minimize knowledge gaps as your business grows.
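A data dictionary doesn’t have to start as anything fancy. Here is a minimal sketch, with invented field names, of one that records each field’s owner, definition, and expected type, and flags records that drift away from it:

```python
# Hypothetical sketch: a tiny data dictionary for lake fields, plus a
# check that flags undocumented fields or wrong types. Field names,
# owners, and definitions here are illustrative assumptions.

DATA_DICTIONARY = {
    "customer_id": {"type": int, "owner": "sales_ops",
                    "definition": "Unique CRM customer key"},
    "revenue": {"type": float, "owner": "finance",
                "definition": "Sum of completed order amounts"},
}

def validate_record(record, dictionary=DATA_DICTIONARY):
    """Flag fields that are undocumented or don't match the expected type."""
    issues = []
    for field, value in record.items():
        entry = dictionary.get(field)
        if entry is None:
            issues.append(f"{field}: not in data dictionary")
        elif not isinstance(value, entry["type"]):
            issues.append(f"{field}: expected {entry['type'].__name__}")
    return issues

print(validate_record({"customer_id": 42, "revenue": "1,000"}))
# ['revenue: expected float']
```

The dictionary itself could live in a spreadsheet or a catalog tool just as well; what matters is that the ‘knowledge’ is written down once and checked against the data, not carried around in people’s heads.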
You also might need to revamp your data lake entirely if its integrations are predicated on vendor exclusivity rather than a more open framework (I’m looking at you, Oracle).
None of this can be done by a single person, but a team can do it! Make sure you get the right people with the right skills to identify what the actual problem is, so you don’t waste money on unnecessary or counterproductive measures to fix a singular problem. We at Fischer Analytics deal with these challenges all the time. With experience across a variety of subject areas in business operations, we would be happy to provide a free consult.