“It is a capital mistake to theorize before one has data.” Sherlock Holmes
This is a repost from my recent article for the Forbes Technology Council.
Data is one of the most valuable resources that can be acquired today. You’ve probably heard the phrase “Data is the new oil.” In fact, data has the potential to be much more valuable, and more versatile, than oil could ever be.
The most fundamental use for data is to inform decision making. From a carpenter using measurements to decide where to make a cut to major corporations using complex market analysis to route millions of dollars of investment, data helps us understand
the world around us so we can make better choices and achieve our goals.
Today, thanks to the internet of things, there’s more data available than ever before, and businesses are able to do some incredible things with it. From implementing hyper-targeted and personalized marketing campaigns and improving customer retention to
streamlining supply chains and developing new products and services, big data enables organizations to grow and expand in many exciting ways.
How We Get Data
Data comes to us in many forms and from many sources, such as web crawlers, cookies, customer relationship management platforms, surveys and internal reports. However, data is only as valuable as what can be done with it. Like oil, data must be refined before
it can be used. The challenge is organizing it and making sense of it so that it can inform action.
The data from the sources mentioned above all has one thing in common — it’s structured. While there is a great deal of useful structured data available, the real trick is tapping into the potential of unstructured data.
As the name suggests, structured data has an inherent form and logic to it. This makes it easy to capture, organize in a relational database and analyze. Anything that is quantifiable and readily processed by a machine is most likely a form of structured
data. Dates, sales numbers, stock prices and geolocation data are all examples of discrete, organized, structured data points.
Capturing structured data is relatively simple, especially once a data model has been created. A data model determines the data fields and types of data (e.g., numeric, alphabetic, date, address, etc.), as well as any input restrictions such as character
limits, codes or abbreviations. With the model in place, structured data can be automatically captured by machines using sensors like barcode scanners, RFID tags, or digital bots and web crawlers.
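To make the idea of a data model concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the SaleRecord fields, the 12-character limit, the two-letter state code and the comma-separated scanner format are hypothetical, not taken from any particular system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SaleRecord:
    """Hypothetical data model: field names and limits are illustrative."""
    sku: str        # alphanumeric product code, at most 12 characters
    quantity: int   # numeric, must be positive
    sold_on: date   # date field
    state: str      # two-letter uppercase abbreviation, e.g. "NY"

    def __post_init__(self):
        # Input restrictions: reject values that break the model's rules.
        if not (self.sku.isalnum() and len(self.sku) <= 12):
            raise ValueError(f"invalid sku: {self.sku!r}")
        if self.quantity <= 0:
            raise ValueError(f"quantity must be positive: {self.quantity}")
        if not (len(self.state) == 2 and self.state.isalpha() and self.state.isupper()):
            raise ValueError(f"invalid state abbreviation: {self.state!r}")


def parse_scan(line: str) -> SaleRecord:
    """Turn one comma-separated scanner line into a validated record."""
    sku, quantity, sold_on, state = (part.strip() for part in line.split(","))
    return SaleRecord(sku, int(quantity), date.fromisoformat(sold_on), state)


record = parse_scan("AB1234, 3, 2024-05-01, NY")
```

Once the fields, types and restrictions are written down like this, a machine can accept or reject each incoming reading without a person having to look at it.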
Thanks to the internet of things, by 2025 there may be as many as 80 billion internet-connected devices collecting structured data, generating as much as 180 trillion gigabytes of it. That’s no small amount, yet it’s just a drop in the bucket
compared to the potential of the other kind of data.
The internet of things has led to an explosion of structured data in recent years, yet the vast majority of data cannot be easily stored and manipulated in relational databases. In fact, experts say that anywhere from 80% to 90% of all data is unstructured,
and the trick is making sense of it.
Unlike structured data, where the fields and types of data are all predetermined, unstructured data lacks a clear, defined format. It’s disorganized and difficult to collect, organize and analyze using conventional data tools. It may be paper-based, such
as letters or documents, or it could be digital, such as the contents of emails, video and audio files, and open-ended survey responses.
The key problem with unstructured data is that it lacks a defined model of interpretation and can’t be easily organized into related tables the way a SQL database organizes structured data. This makes it much more difficult to extract meaningful information
from unstructured data.
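The contrast can be sketched in a few lines of Python using the standard sqlite3 module. The sales table, its rows and the sample email are invented for illustration only.

```python
import sqlite3

# Structured data: rows with fixed fields stored in a relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 120.0), ("West", 80.0), ("East", 30.0)],
)

# One SQL query answers "what are total sales by region?"
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)

# Unstructured data: a similar fact is buried in free text. There is no
# column to sum, so the region and the amount must first be interpreted.
email = "Hi team, the East office closed another deal today, roughly 150 I think."
```

The query works because every row follows the same model. The email carries comparable information, but nothing in it marks which words are the region and which number is the amount.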
However, technology has now advanced to the point where we can begin to harness this previously untapped resource. High-definition scanners using optical character recognition (OCR) and intelligent character recognition (ICR) technology can
accurately capture and digitize physical documents, and speech-to-text tools can transcribe video or audio recordings into text. Natural language processing systems using machine learning and cognitive automation are now capable of extracting meaning from a
much wider variety of sources.
This is a major opportunity for businesses to develop a competitive advantage. Not only is there significantly more unstructured data out there, but in many cases, it can provide a deeper understanding than the surface-level view offered by structured data.
The business world has once again struck oil. Now it’s just a matter of effectively refining it and putting it to work. Here’s how to get started:
1. Identify opportunities to integrate data.
If you’re not already leveraging data to direct decision making in certain areas of your business, now is the time to start. Consider how data could improve outcomes not only for sales and marketing teams, but for employee retention and productivity, facilities
management, risk assessments, and so many other aspects of your business.
2. Start with structured data.
As discussed above, structured data is easier to collect, analyze and understand. But being easy to work with doesn’t make it any less valuable. Develop a strategic approach to collecting and leveraging internal and external structured data sources,
and determine how the data will be stored, analyzed and put to use. Again, data is only as valuable as what you can do with it. Collect what you can, and share it as widely as is prudent so that internal data silos don’t stifle innovation.
3. Invest in tools to manage unstructured data.
Companies hoping to thrive in the new data-rich environment may want to start investing in the technologies being developed to capture and capitalize on the ocean of unstructured data that’s out there. Tools such as text-mining software
or customer service interaction analytics are a good place to start.
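As a rough illustration of what text mining involves at its simplest, the Python sketch below counts the most frequent meaningful words across a few open-ended survey responses. The responses, the stop-word list and the top_terms function are all made up for this example, and real text-mining tools go far beyond word counts.

```python
import re
from collections import Counter

# Hypothetical list of common words to ignore; real tools use larger lists.
STOP_WORDS = {
    "the", "a", "an", "and", "is", "was", "to", "of",
    "it", "i", "my", "but", "very", "for", "too",
}


def top_terms(responses, n=2):
    """Return the n most frequent non-stop-words across all responses."""
    counts = Counter()
    for text in responses:
        # Lowercase the text and split it into simple word tokens.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


responses = [
    "Shipping was slow and the packaging was damaged.",
    "Great product, but shipping took too long.",
    "Slow shipping. Support was helpful.",
]

terms = top_terms(responses)
print(terms)  # [('shipping', 3), ('slow', 2)]
```

Even this crude count turns free-form comments into something a team can act on: here, shipping speed stands out as the recurring complaint.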