Why Your Data is Garbage

March 19, 2024


There is nothing peaceful about garbage. It stinks, it can be poisonous, and worst of all…it grows if ignored. If we hate garbage so much, why do we put up with garbage data?

Let me explain.

Why your data is garbage

There are lots of reasons why data turns into garbage, but the two most common are that it was garbage to begin with or that it rotted into garbage. We never like to think we make decisions based on garbage, but the adage ‘garbage in, garbage out’ really does apply.

Silos

Sometimes data lives in different places. This is quite a natural phenomenon, as every company has its own CRM, ERPs, and other applications to manage. Even when your systems are configured correctly (and that is a BIG if), garbage can collect when your data is separated into silos:

Missing integrations: connections between applications might be poorly conceived or simply not present
No data lake or a simple data drought: data might not be stored in a data lake at all. It could be an infinite web of SPREADSHEETS saved on the thumb drives of hundreds of people (shudder). Even if a data lake is set up, the data from the source may not be reflected there in the same way it appears in the UI of the source application.
Knowledge gaps: often underrated, people’s knowledge about data varies. You wouldn’t expect a marketer to understand the intricate debits and credits of the accounting world, would you? But this issue can be hidden in tribal knowledge of the past, technical prowess, or simply a lack of shared experience between team members.

Definitional Disparities

This one really gets me. So many teams get their data and processes right, but fail when it comes to interpreting and disseminating them. Key Performance Indicators (KPIs) aren’t simply a function of convenience; they are business knowledge integrated with business operations. Only when you build your basic definitions (like ‘what is revenue’) do other things make sense (like ‘what is profit’).
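To make that concrete, here is a minimal Python sketch of building ‘profit’ on top of a single shared definition of ‘revenue’. The Order fields and both formulas are hypothetical placeholders, not a recommended accounting treatment; your own definitions will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    # Hypothetical fields, chosen only for illustration.
    gross_amount: float  # amount invoiced to the customer
    refunds: float       # amount returned to the customer
    cost: float          # direct cost of fulfilling the order


def revenue(orders):
    """Assumed definition: invoiced amount net of refunds."""
    return sum(o.gross_amount - o.refunds for o in orders)


def profit(orders):
    """Built on revenue(), so both KPIs agree on what revenue means."""
    return revenue(orders) - sum(o.cost for o in orders)


orders = [Order(100.0, 0.0, 60.0), Order(50.0, 10.0, 20.0)]
print(revenue(orders))  # 140.0
print(profit(orders))   # 60.0
```

Because profit calls revenue rather than recomputing it, a change to the revenue definition flows through to every KPI that depends on it.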

Having a data analytics solution is helpful for disseminating information as well. Interactive tools like Power BI, Tableau, and Looker are staples for large organizations. However, smaller or more technically sound ones might be better off with some of the many other solutions out there.

Wait, these sound like me…

If you can relate to these challenges, you’re not alone. Data doesn’t clean itself, you know! Below are some helpful ideas on what to do when your data is garbage.

Oscar the Grouch is a beloved Muppet from Sesame Street. He also loves garbage!

When your data is garbage

Clean out the pipes

Data pipelines are essential in moving data from one application to another. Extract and Load (EL) procedures are helpful in managing this movement of data. How can you measure the cleanliness of your data pipelines? Check the logs of the EL tool your team uses (Informatica, Fivetran, AWS Glue, etc.), or get such a tool if you don’t have one. Keeping these logs might just save your bacon! Look for three things:

Frequent: data is moving as designed, on the expected cadence
Failure free: notifications are set up for loading errors
Completeness: ensure ALL of the desired data is being loaded. This can be checked quickly by analysts and engineers working together
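The three checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the LoadRun record and its fields are hypothetical and do not match the log format of Informatica, Fivetran, AWS Glue, or any other tool, so you would need to map your own tool’s logs onto something similar.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class LoadRun:
    # Hypothetical log record; real EL tools expose different fields.
    finished_at: datetime
    succeeded: bool
    source_rows: int  # rows available in the source application
    loaded_rows: int  # rows that actually landed in the destination


def check_pipeline(runs, expected_every, now):
    """Return a list of human-readable problems found in the load log."""
    if not runs:
        return ["no runs recorded"]
    problems = []
    latest = max(runs, key=lambda r: r.finished_at)
    # Frequent: the latest load should be no older than the expected cadence.
    if now - latest.finished_at > expected_every:
        problems.append("stale: last load finished at " + latest.finished_at.isoformat())
    # Failure free: any failed run should be surfaced to someone.
    failed = [r for r in runs if not r.succeeded]
    if failed:
        problems.append(f"{len(failed)} failed run(s)")
    # Completeness: the latest successful load should match the source row count.
    if latest.succeeded and latest.loaded_rows != latest.source_rows:
        problems.append(
            f"incomplete: loaded {latest.loaded_rows} of {latest.source_rows} rows"
        )
    return problems


now = datetime(2024, 3, 19, 12, 0)
runs = [
    LoadRun(datetime(2024, 3, 18, 6, 0), True, 1000, 1000),
    LoadRun(datetime(2024, 3, 19, 6, 0), True, 1200, 1150),
]
for problem in check_pipeline(runs, timedelta(hours=24), now):
    print(problem)  # incomplete: loaded 1150 of 1200 rows
```

Comparing row counts is a deliberately crude completeness check. It catches dropped records but not wrong values, which is where analysts and engineers working together come in.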

Get yourself a source of knowledge and truth

Data lakes are commonly referred to as ‘the source of truth’, but truth can be a subjective thing sometimes. What a team really needs is for its data to be ‘the source’ with the correct ‘knowledge’ applied to it. Data dictionaries are a perfect example of curated and indexed knowledge that can minimize knowledge gaps while growing your business.
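A data dictionary does not have to start as anything fancy. Here is a minimal Python sketch, assuming a dictionary is just a list of documented columns; the table names, column names, definitions, and owners are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    table: str
    column: str
    definition: str  # plain-language meaning agreed on by the business
    owner: str       # team to ask when the definition is unclear


# Hypothetical entries; replace with your own tables and definitions.
DATA_DICTIONARY = [
    DictionaryEntry("orders", "gross_amount", "Amount invoiced, before refunds", "Finance"),
    DictionaryEntry("orders", "channel", "Marketing channel credited with the sale", "Marketing"),
]


def describe(table, column):
    """Look up the agreed definition for a column, or None if undocumented."""
    for entry in DATA_DICTIONARY:
        if (entry.table, entry.column) == (table, column):
            return entry
    return None


def undocumented(columns):
    """Return the (table, column) pairs that have no dictionary entry yet."""
    documented = {(e.table, e.column) for e in DATA_DICTIONARY}
    return [c for c in columns if c not in documented]


print(describe("orders", "channel").owner)  # Marketing
print(undocumented([("orders", "gross_amount"), ("orders", "region")]))
# [('orders', 'region')]
```

Running something like undocumented against the columns in your data lake gives you a to-do list of knowledge gaps rather than a vague feeling that they exist.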

You also might need to revamp your data lake entirely if the integrations are predicated on exclusivity and not a more open source framework (I’m looking at you Oracle).

Get help

All of this can’t be done by a single person, but a team can do it! Make sure you get the right people with the right skills to help you identify what the actual problem is. That can save you a lot of money otherwise spent on unnecessary or counterproductive measures to fix a single problem. We at Fischer Analytics deal with these challenges all the time. We work across a variety of subject areas in business operations, and we would be happy to provide a free consult.
