
What Is Dark Data and Orphaned Data? A World of Light and Shadow

Boost · 6 min read

If you have ever wondered what dark data is, you have come to the right place. Data is a source of information (and light) that guides companies of all sizes towards better decisions and results.

But data also has a flip side. If not used correctly, it can end up costing any company dearly. In fact, poor data quality can mean a 25% loss in profits. Yes, a quarter of what you earn.

Part of this less friendly side of data hides precisely in the shadows: in the data points that go unnoticed by most businesses and that, in general, companies fail to put to use. Data that moves in the shadows, always lying in wait.

This article is about precisely that data: a type that looks innocuous and sits hidden among all the information a company has at its disposal. It goes by the names dark data and orphaned data.

What is dark data and why should you pay attention to it

You have probably heard of low-quality data. But understanding what dark data is goes beyond that. We are not only talking about visible errors, duplicates or inaccuracies — we are talking about data that hides among the information you already have and that you are not using or analysing.

What are "dark" and "orphaned" data?

We already know that many types of low-quality data exist. But in general, we tend to mean data that stands out and represents an obvious problem for companies. Inaccurate, incorrect, duplicated data... Data that is visible and gets in the way.

But "dark" and "orphaned" data are different. We could consider them low-quality data, but in reality they are data with "low visibility". Data that, for whatever reason, flies under the radar in your company's analytics. Here is what each one means:

Dark data

Dark data is simply all the data collected from different sources but never used. If you are still wondering what dark data is, think of it as all the potential knowledge that sits in your system but does not form part of your analyses.

It may be data you actively ignore or data nobody realises is being collected. Either way, it may have the potential to contribute to the company's overall strategy, or it may simply be consuming resources and storage space unnecessarily.

An example: imagine your business is running a CRM strategy and sending various communications by email. As you know, CRM tools collect multiple data points, but it is up to you to integrate them into your analytics. If you decide not to use data like the open rate, you may be letting a few opportunities slip by.
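
A rough way to spot this kind of dark data is to compare the fields you collect with the fields your reports actually use. The sketch below assumes you can list both; every field name in it is hypothetical and not taken from any particular CRM tool.

```python
# Hypothetical fields collected by an email/CRM tool (illustrative names only).
collected_fields = {"email", "open_rate", "click_rate", "unsubscribe_date", "last_login"}

# Hypothetical fields referenced by the reports the team actually runs.
fields_used_in_reports = {"email", "click_rate"}


def find_dark_fields(collected, used):
    """Return collected fields that no report references, sorted for stable output."""
    return sorted(collected - used)


dark_fields = find_dark_fields(collected_fields, fields_used_in_reports)
print(dark_fields)
```

Anything this returns is a candidate for a decision: either bring it into your analytics or stop collecting it.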

Orphaned data

Orphaned data is somewhat different. It shares a visibility problem with dark data, but here the data is stored and even counted in analytics, yet it has lost its reference within the system and is no longer connected to the rest.

In other words: orphaned data is data that, for reasons such as a database format update or the deletion of certain information, has become completely disconnected from everything else. It is still there, but nobody knows where it came from or how it relates to the rest.

An example: imagine your company collects customer information from several data sources. If you have been storing data from one of those sources and, overnight, you decide to remove that source, some of that data may lose its link to the rest and become useless.
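
To make the idea concrete, here is a minimal sketch of how orphaned rows could be detected, assuming each stored row carries a reference to the source it came from. The `source_id` key, the source names and the sample rows are all hypothetical.

```python
def find_orphaned_rows(rows, known_source_ids, key="source_id"):
    """Return the rows whose source reference is missing or no longer exists."""
    return [row for row in rows if row.get(key) not in known_source_ids]


# Hypothetical sources still active after one of them ("loyalty_app") was removed.
active_sources = {"web_form", "crm"}

# Hypothetical stored customer rows.
customer_rows = [
    {"customer": "A", "source_id": "web_form"},
    {"customer": "B", "source_id": "loyalty_app"},  # its source was removed
    {"customer": "C", "source_id": "crm"},
    {"customer": "D", "source_id": None},           # the reference was lost
]

orphans = find_orphaned_rows(customer_rows, active_sources)
print([row["customer"] for row in orphans])
```

In a relational database the same check would typically be a join that looks for rows without a matching parent record. The list version above only illustrates the logic.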

Most common problems with dark and orphaned data

Having this data lurking in the shadows among all the information you have may seem harmless. If it does not mix with the rest and does not cause trouble, what is the problem? There can be several:

  • Unnecessary data storage – Storing and processing your data has a cost, not only in money but also in storage space. Storing this shadow data means spending resources on something you do not use (and may not even know is there).

  • Time wasted searching for this data – Sometimes shadow data is genuinely useful for your business and your team needs it to make decisions. But if that data is disconnected from the rest of the system, you will need to devote a great deal of time and effort to finding it and re-integrating it.

  • Incorrect reports or altered values – Possibly the most serious problem. Even though they go unnoticed, dark and orphaned data are there. And your analyses may factor them in when performing certain calculations. This can lead to wrong or imprecise results.

  • Regulatory compliance issues – You know that regulations are becoming increasingly strict about data storage and processing. Shadow data can lead to non-compliance with these regulations and result in legal and economic problems.
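
The point about altered values is easy to see with numbers. The sketch below uses invented order values and hypothetical source names to show how a single orphaned row can shift an average when a report includes it without anyone noticing.

```python
from statistics import mean

# Hypothetical orders; the last one comes from a data source that no longer exists.
orders = [
    {"order_value": 100.0, "source_id": "web_form"},
    {"order_value": 120.0, "source_id": "crm"},
    {"order_value": 20.0, "source_id": "removed_feed"},  # orphaned row
]
active_sources = {"web_form", "crm"}

# Average as a report would compute it if it reads every stored row.
avg_all = mean(o["order_value"] for o in orders)

# Average over rows that are still connected to a known source.
avg_connected = mean(
    o["order_value"] for o in orders if o["source_id"] in active_sources
)

print(avg_all, avg_connected)
```

Here the orphaned row pulls the average order value from 110.0 down to 80.0. Whether that row should count is a business decision, but it should be a deliberate one.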

The opportunities your business is missing because of poor data quality

You are probably wondering: am I really missing out on something important by ignoring this shadow data? Yes, you are. In addition to the problems above, this type of data can also mean you are not making all the decisions you should be.

  • Better data performance – Too much shadow data in your system is likely to slow down loading and processing, so your system may not be running as well as it could.

  • Ignored purchase signals – You know that a large part of your business's success depends on detecting purchase signals. If the data you store is not visible, you may be missing key conversion opportunities.

  • More precise decisions – Opportunities are in the details. And if part of your data is in the shadows, you may be overlooking personalisation, optimisation or error-correction opportunities.

  • A more reliable strategy – If your data does not reflect the reality of your business, neither will your strategy. The presence of dark and orphaned data can distort the overall picture of your business.

Bring your data to light and make the most of all its opportunities with Boost

The shadows of low-quality data are there, and we need to accept that. We also need to accept that we are very likely storing dark data and orphaned data within our system. That is fine; it is just the reality.

The good news is there is a solution. At Boost we are experts at shining a light on those corners where abandoned data hides. We know how to navigate the dark corners of your analytics with ease.

Now that you know what dark data is, here is what we propose: we handle a thorough audit of your current analytics, identify all those data points slipping through the gaps in your available information and design a more precise and secure data collection system. Exactly what you need.

If that sounds like a good plan, write to us and let's get to work.
