Organizations of all kinds are pursuing big data for one big reason: If they can extract the right data at the right time, they can use it for insights that open new ways to make money or ways to save money, like reducing their operating costs.
But organizations can’t just dive into the data. Data is everywhere today, but rarely in a format suitable for analytics tools. Big data in the wild doesn’t do anyone any good.
To make big data cough up its secrets, organizations must move it into their data environment and convert it into a usable format. This process of cleaning and converting data into a format that enables actionable insights can be called “data optimization.”
Data optimization often has four steps: data virtualization, preparation, exploration, and prediction. These steps are how organizations start collecting the actionable intelligence and insights from big data that lead to more sales and lower expenses.
This first in a series of four blog posts explains how each of these steps relates to the next one. It also discusses why it’s crucial to learn the right sets of skills and competencies to enable data monetization.
So, let’s start by talking about data virtualization.
Using Data Without Knowing Where It Is
The first step toward making the most of big data is data virtualization. Data virtualization lets an application retrieve and use data even if that application doesn’t know the data’s format or physical location.
Here’s an example of how that might work. Consider a major global architectural engineering firm with three decades of onsite photos, taken by project engineers to document how work has progressed in the field. Those photos are scattered across employees’ smartphones, laptops, desktop computers, and digital cameras. To build slideshows for potential customers, the firm needs to gather all of these images in the cloud.
In today’s world, it is not unusual for a large corporation’s data to reside in many locations, from on-premises systems to the cloud. Sales data might live on employee desktops, in a cloud CRM system, and on partner networks. Anyone assembling a presentation that draws on all of these sources will find it tough going.
Data virtualization connects to all of these data sources through adapters and uses them to pull the data together automatically, in real time, and on a continuous basis. Instead of forcing you to deal with each source separately, it automates the data interfaces at all of these different layers. All of that data now sits in a single virtual data bin where you can work with it.
This is data virtualization. It lets you retrieve data without knowing where each file now lives, because an abstraction layer hides those technical details.
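The adapter idea above can be sketched in a few lines of code. This is a minimal, hypothetical illustration of the pattern, not a real data virtualization product’s API: every class and source name here is invented for the example. Each source gets an adapter with a common interface, and a virtualization layer fans a query out to all of them, so the caller never deals with file paths or physical locations.

```python
class SourceAdapter:
    """Common interface that every data source adapter implements."""
    def fetch(self, entity):
        raise NotImplementedError


class DesktopFileAdapter(SourceAdapter):
    """Pretend adapter for records sitting in files on employee desktops."""
    def __init__(self, rows):
        self._rows = rows

    def fetch(self, entity):
        return [r for r in self._rows if r["entity"] == entity]


class CRMCloudAdapter(SourceAdapter):
    """Pretend adapter for records held in a cloud CRM system."""
    def __init__(self, records):
        self._records = records

    def fetch(self, entity):
        return [r for r in self._records if r["entity"] == entity]


class VirtualDataLayer:
    """The abstraction layer: callers ask for data by name,
    never by format, file path, or physical location."""
    def __init__(self, adapters):
        self._adapters = adapters

    def query(self, entity):
        # Fan the request out to every registered adapter and
        # merge the results into one logical data set.
        results = []
        for adapter in self._adapters:
            results.extend(adapter.fetch(entity))
        return results


# Usage: the caller works with one "virtual data bin" and never
# knows which physical source each record came from.
layer = VirtualDataLayer([
    DesktopFileAdapter([{"entity": "sales", "amount": 100}]),
    CRMCloudAdapter([{"entity": "sales", "amount": 250}]),
])
print(layer.query("sales"))
```

Adding a new source, such as a partner network, would mean writing one more adapter; nothing on the caller’s side changes, which is what makes the abstraction valuable.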
Data Virtualization Offers Advantages
Data virtualization has a lot to offer. It cuts the cost of data replication: because data doesn’t have to be moved, storage and networking expenses drop, and organizations get more use out of existing servers and networks.
Data virtualization also makes organizations more agile. When they can get to their data instantly no matter where it is, they can respond much faster to constantly changing conditions.
Data virtualization also makes data more visible, which can speed up decision making by a factor of five to ten. Better visibility leads to better decisions, too, because everyone in the organization who needs data to make those decisions can access it.
To improve IT’s effectiveness, it is critical to learn how to implement and use data virtualization technologies. As the first step in data management, data virtualization is the foundation for the entire process.
Without data virtualization, the simple act of finding all of an organization’s data is difficult if not impossible. And if you can’t find your data, you can’t take the next step: data preparation, the topic of the second post in this series on making the most of big data.
For more information, please visit the Cisco Data and Analytics page.
Learning@Cisco product manager Neeraj Chadha has more than 20 years of experience in the networking industry. Over that time, he has functioned as a software developer and network engineer, and in various aspects of product management. Currently, he guides the overall product strategy and evolution of Cisco courseware and certifications around Wireless, Collaboration, and Big Data and Analytics. Neeraj's primary areas of focus include technology trends, digital transformation, continuing education, and product strategy.