by Bob Violino

6 tips for avoiding data analytics disaster

May 04, 2020

Given its promise in driving business value, it’s no surprise that data analytics remains a top IT investment — but success is no guarantee.

Data analytics can be enormously valuable for companies, providing deep insights into data that might not otherwise be surfaced.

Because of this, data analytics continues to eat up a significant portion of IT budgets. Thirty-seven percent of IT leaders said analytics will drive the most IT investment at their companies this year, the highest single category, according to the 2020 State of the CIO survey.

But there are no guarantees that analytics investments will pay off. In fact, the discipline can be fraught with problems that can temporarily derail these projects or doom them to failure.

Avoiding negative outcomes is within the grasp of any company that wants to exploit analytics — it just requires putting in the necessary preparation and work. Here are some steps that organizations can take to avoid data analytics disasters and disappointments.

Have an overall data management strategy in place

One of the first steps a company should take is to build an overall data management strategy that defines the collection, processing, and analysis of data, says Seth Robinson, senior director for technology analysis at IT professional organization CompTIA.

“Companies have already taken similar steps with cyber security as a business-critical component of IT, and data management should follow the same path, since data has become so important to business operations,” Robinson says.

CompTIA recently released a report, “Trends in Data Management,” based on an online survey of 400 IT professionals in the U.S. conducted in December 2019, and it shows that many businesses are in the early stages of building their data management strategies.

Only 25 percent of the organizations surveyed feel that they are exactly where they want to be with their corporate data management. Although digital data has long been a part of IT operations, the report says, there has not been much focus in terms of job roles or defined components.

A big component of the strategy is having the right data analytics skills to meet the company’s needs.

“Data-related skill gaps are the [third biggest] challenge companies cite in building data management plans, and there are a range of different data skills needed,” Robinson says. These include database administration, data analysis, and data visualization. “Some of these skills could be taught to existing staff, while other skills may require new hiring or partnering,” he says.

Only 44 percent of companies say that they have internal IT employees who are dedicated to data management or data analysis, according to CompTIA. While there has been a focus on newer job titles such as data scientist, there is also opportunity around more traditional roles, including database administrators.

“You must consult with and/or train your business employees to become data literate, or no one on your team will know how to begin the conversation around analytics,” adds Jeremy Wortz, a senior architect in the technology practice at consulting firm West Monroe.

“Not everyone needs to be a data scientist, but all business leaders need to have a basic understanding of how analytics can drive value.”

Make data integration a priority

The most common problems related to data analytics actually come early in the overall dataflow process, with a lack of data integration, Robinson says. “Without all corporate data linked together, analytics will be limited in finding connections and insights,” he says.

The CompTIA study found that integration of data is the No. 2 challenge companies cite in their data management strategy. Only speeding up data analysis ranked higher among challenges.

For several years, CompTIA research has found that when business units work independently on technology initiatives, integration challenges eventually follow. As a result, organizations are trying to avoid shadow IT in favor of collaborative approaches that still give business units some freedom while maintaining an inclusive view of all business systems.

Gathering data into a single repository is part of this approach, the CompTIA report says, and is also critical for artificial intelligence (AI) initiatives that operate on the broadest possible datasets. The study notes that data silos are not widely considered to be a problem among those surveyed, even though data integration is a top challenge.

Considering that 82 percent of companies say they have a high or moderate degree of data silos, “there is a clear disconnect on how problematic data silos are and how exactly they should be integrated into a common dataset,” according to the report.
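To make the integration problem concrete, here is a minimal Python sketch of linking records from two siloed systems on a shared key and reporting what fails to match. It is an illustration only, not something drawn from the CompTIA report: the "CRM" and "billing" sources, the `customer_id` field, and the `link_by_customer` function are all hypothetical.

```python
def link_by_customer(crm_rows, billing_rows, key="customer_id"):
    """Attach billing rows to CRM rows that share the same key.

    Returns (linked, unmatched). The field names are hypothetical.
    """
    # Index the billing rows by key so each CRM row is a single lookup.
    billing_by_id = {}
    for row in billing_rows:
        billing_by_id.setdefault(row[key], []).append(row)

    linked, unmatched = [], []
    for customer in crm_rows:
        invoices = billing_by_id.get(customer[key])
        if invoices:
            linked.append({**customer, "invoices": invoices})
        else:
            # Records with no counterpart are where silos hide insights.
            unmatched.append(customer)
    return linked, unmatched


crm = [{"customer_id": 1, "name": "Acme"}, {"customer_id": 2, "name": "Globex"}]
billing = [{"customer_id": 1, "amount": 250.0}]
linked, unmatched = link_by_customer(crm, billing)
```

The unmatched list is often the more useful output at first, since it shows how much of the data cannot yet be connected across systems.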

In addition to technical integration of data sources, enterprises need to establish data sharing processes between the various business units and the IT function.

“As with many other parts of IT, there is a growing need for collaboration between these groups,” Robinson says. “The business units bring the knowledge of which insights will be most helpful, and the IT team has the expertise in delivering the technical solution. Regular communication will build the proper feedback loops for refining data analysis to best serve the business.”

Practice effective DataOps

DataOps (data operations) is an automated, process-oriented methodology that can be used by data analytics teams to improve the quality and reduce the cycle time of analytics. It began as a set of best practices and has matured to become a new and independent approach to data analytics.

The method applies to the entire data lifecycle, from data preparation to reporting, and acknowledges the interconnected nature of the data analytics team and IT operations.

Similar to DevOps, DataOps incorporates the agile methodology to shorten the cycle time of analytics development in alignment with business goals. While DevOps focuses on continuous delivery of quality software by leveraging IT resources and automating testing and deployment, DataOps aims to bring the same improvements to data analytics.

“It’s essential for companies to implement DataOps fully” if they want to improve the results of analytics, says James Royster, senior director of data strategy and operations at multinational biopharmaceutical company Amgen.

The company has deployed a DataOps platform from DataKitchen and has had “great success,” Royster says. “DataOps involves designing analytics with built-in error handling,” he says. “Data analytics needs automated ways to test and control the quality of data, in order to reduce errors and avoid data integrity problems.”
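As a rough illustration of what automated data quality testing can look like, the Python sketch below runs a set of named checks against each incoming record and separates clean rows from rejects. It does not represent Amgen's setup or the DataKitchen platform; the `Check` class, the rules, and the "orders" fields are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    name: str
    passes: Callable[[dict], bool]


# Hypothetical rules for a hypothetical "orders" feed.
CHECKS = [
    Check("order_id present", lambda r: bool(r.get("order_id"))),
    Check(
        "amount is non-negative number",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
    Check("region is known", lambda r: r.get("region") in {"NA", "EMEA", "APAC"}),
]


def validate(rows, checks=CHECKS):
    """Split rows into clean rows and (row, failed check names) rejects."""
    clean, rejects = [], []
    for row in rows:
        failed = [c.name for c in checks if not c.passes(row)]
        if failed:
            # Keep the reasons so errors can be reported instead of silently dropped.
            rejects.append((row, failed))
        else:
            clean.append(row)
    return clean, rejects


rows = [
    {"order_id": "A1", "amount": 10.0, "region": "NA"},
    {"order_id": "", "amount": -5, "region": "MARS"},
]
clean, rejects = validate(rows)
```

Run on every load rather than on request, checks like these let a pipeline stop or quarantine bad records before they reach a report.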

Organizations regularly encounter data error issues that can jeopardize projects, Royster says. These include errors in the underlying data set. “Raw data must be cleaned and pre-processed,” he says. “Errors are common in any large data sets.”

In addition, sourcing the same data from different locations with different business rules can create errors. “Different organizations within the same enterprise may work with the same data using different algorithms, workflows, or assumptions,” Royster says.

Many organizations are also unable to quickly connect and transform data to meet immediate needs. “The market evolves rapidly and business requirements change,” Royster says. “The data team must be able to update data transformations to keep up with the requests of users and stakeholders.”

Ask the right analytics questions

Organizations need to stay tenaciously focused on the key questions that can deliver value through data analytics, West Monroe’s Wortz says.

“The truth is that no matter how advanced your tools and technology are, your data will not deliver value unless you extract insights that drive strategic results,” Wortz says. All analytics, including AI and machine learning (ML), should generate insights, he adds.

The key to achieving this is asking impactful questions that easily tie to value creation, Wortz says. “How long does it take prospects to become customers? Why do customers churn? When do they churn?” he says. “Once you have baseline answers, you can craft hypotheses related to the business and begin the process again with new and more narrow questions.”
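A baseline answer to the first of those questions can be quite simple to compute. The Python sketch below is one possible approach, not West Monroe's method: it assumes each prospect record carries a hypothetical `first_contact` date and an optional `converted_on` date, and it reports the conversion rate and the median number of days to convert.

```python
from datetime import date
from statistics import median


def days_to_convert(prospects):
    """Days from first contact to conversion, for converted prospects only."""
    return [
        (p["converted_on"] - p["first_contact"]).days
        for p in prospects
        if p.get("converted_on") is not None
    ]


def conversion_baseline(prospects):
    """Summarize how many prospects convert and how long it typically takes."""
    durations = days_to_convert(prospects)
    return {
        "conversion_rate": len(durations) / len(prospects) if prospects else 0.0,
        "median_days_to_convert": median(durations) if durations else None,
    }


# Hypothetical sample data.
prospects = [
    {"first_contact": date(2020, 1, 1), "converted_on": date(2020, 1, 11)},
    {"first_contact": date(2020, 1, 5), "converted_on": date(2020, 2, 4)},
    {"first_contact": date(2020, 2, 1), "converted_on": None},
]
baseline = conversion_baseline(prospects)
```

With a baseline like this in hand, narrower follow-up questions (by region, product, or sales channel) amount to running the same calculation on a filtered subset.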

West Monroe worked with a client recently on a sales-focused AI and machine learning initiative.

“The machine learning model generated a significant amount of revenue for the organization, but we kept a pulse on insights from the dataset throughout the preparation for the algorithm,” Wortz says. “Many of us agreed that the ML work generated just as much value by finding generally applicable insights in the data, such as a specific customer problem in a certain region based on a specific product, as it did feeding data into the AI algorithm.”

This gave the organization a quick value-add opportunity as West Monroe built out the long-term value of the ML system, “all while feeding a higher quality dataset to the algorithm,” Wortz says.

Analyze clean, accurate data only

This practice could come under the heading of building and executing an overall data management strategy. But it deserves mention as a best practice on its own. If the data being analyzed is not accurate, the results and insights will be tainted.

“The most important [step] in my mind is that the data must be defensible, understood, and accepted prior to delivering any insights,” says Kathy Rudy, chief data and analytics officer at ISG, a technology research and advisory firm.

“This means that data is clean, current, validated, and from believable systems of record,” Rudy says. “Clean data means that you have spent time reviewing and cleaning up the data prior to any analysis. This can often take a considerable amount of time, especially if you are working across databases to deliver reporting.”

But it’s a critical step, Rudy says, and is often referred to as master data management.

“Management must buy off on the sources, currency, and accuracy of data or they will not buy the results and you will spend more time defending the numbers than delivering value,” Rudy says. “It will also create unnecessary cycles for the data team and potentially cause you to lose credibility.”

Having a solid technical foundation is important, “especially as it relates to data readiness,” says Pratyush Rai, CIO at Kaplan Higher Education, a provider of online student services. “In many organizations, insufficient attention is paid to the underlying architecture. This leads to duplicate records and dirty data, often making analysis challenging.”
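Duplicate records of the kind Rai describes often differ only in spacing or capitalization. As a minimal Python sketch, assuming hypothetical `name` and `email` fields and not reflecting any system in use at Kaplan, records can be grouped on a normalized key to surface likely duplicates for review.

```python
from collections import defaultdict


def normalize_key(record):
    """Build a comparison key; the fields used here are hypothetical."""
    # Collapse repeated whitespace and ignore case in the name.
    name = " ".join(record.get("name", "").lower().split())
    email = record.get("email", "").strip().lower()
    return (name, email)


def find_duplicates(records):
    """Return groups of records that share the same normalized key."""
    groups = defaultdict(list)
    for record in records:
        groups[normalize_key(record)].append(record)
    return [group for group in groups.values() if len(group) > 1]


records = [
    {"name": "Ada  Lovelace", "email": "ADA@example.com "},
    {"name": "ada lovelace", "email": "ada@example.com"},
    {"name": "Bob Smith", "email": "bob@example.com"},
]
duplicates = find_duplicates(records)
```

Exact matching on a normalized key only catches the simplest cases. Misspellings and changed addresses need fuzzier matching, which is beyond this sketch.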

Create a cohesive, collaborative analytics team

Being successful at analytics and avoiding disappointments takes teamwork, and this often means eliminating department silos.

“Organizations struggle to create and share data experiences because data is stored across multiple silos and lacks tooling for governance, data discovery, cataloging, and collaboration across engineering, analytics and business teams,” says Maksym Schipka, CTO at Vortexa, a company that provides analytics services for the energy industry.

“Structure your teams as multi-functional ones, balancing business analysts, data engineers, data scientists, software engineers, and quality assurance in one team,” Schipka says. “Avoid the pitfalls of having a separate ‘data science team.’ That’s a sure recipe for a failed project.”

Vortexa ensures that the analytics teams are fully on board with the choice of analytics tools it uses, such as a data operations platform from Lenses.io and cloud services from Amazon Web Services.

But regardless of the analytics tools in use, organizations should expect to have a combination of data scientists and data engineers in the data analytics team, Schipka says. “The exact ratio will depend on the complexity of the business questions that need answering, and the complexity of the necessary technologies to achieve that,” he says.