I am not a data scientist, but I find myself working extensively with data in my work as a management consultant. Data science can be a valuable tool for solving business problems, but the companies that I work with make little use of data science methods. There is a sense that the discipline is a black box to be left to the experts. That is a missed opportunity for creating business value.
The reported statistics suggest that most data science projects fail to deliver tangible business results. While the interest in data science and machine learning has been growing since 2012, only 20% of analytics insights are expected to deliver business value through 2022, ten years later. So the problem is widespread, and not improving.
Senior leaders are missing the opportunity to create value from data for two reasons. One is unfamiliarity, since many of the techniques are evolving rapidly, and non-experts cannot keep up. Another is a belief that data science is so complex that business leaders cannot hope to understand it. Both of these reasons need to be tackled, since they lead to a hands-off approach where a team of experts is brought in to figure things out. Meanwhile the executives are not clear on what’s going on, and are not sure if anything will come out of it.
Data science initiatives need to be managed like other business improvement initiatives. This requires business leaders to engage with the topic of data science rather than treating it as a black box. Investing time in understanding how data science works, working with the data science team, and managing the data science initiatives will enable executives to get better results.
A basic explanation of how data science works
From the view of a non-expert, here are some basic things that can help a business leader to start understanding what data science is, how it works, and what it can do. Obviously, there is a lot more detail, which is often covered in “data science for executives” courses, and business leaders would do well to educate themselves on the subject.
There are different kinds of data scientists
The field of data science is very broad, and not every data scientist has the same skills. Some of them are specialised in data analytics or business intelligence, so they can segment and visualise the data. Some are specialised in algorithms and modelling, so they can build models that make predictions from the data. Then there are statisticians, who can run experiments and see if there are correlations in the data. Data engineers are specialised in defining, storing and managing the data in a form that can be used by other data scientists. The key thing to realise is that there is no single person called “data scientist” who has all these skills. If you’re hiring a data scientist, make sure that you understand what expertise (s)he brings.
Data models make predictions from patterns
Data science models work by recognising patterns in data and dividing data points into groups based on how similar they are to other members of the group. This is called classification: it is a prediction that a data point will be a member of a group. It works on the assumption that data points with similar input properties are likely to give the same outcomes.
For example, if data show that people living in cities (input) are more likely to buy an electric car (outcome) than those living in villages (input), then a data model will predict that a new person living in a village (input) is not likely to buy an electric car (outcome). This classification via pattern recognition is used for speech-to-text transcription, language translation, image recognition, churn prediction and so on. It’s important to realise that the classification is an estimate with a probability level, not an absolute yes/no prediction.
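The electric car example can be sketched in a few lines of Python. This is a minimal illustration and not a production model: the records in `history`, the function names and the 0.5 threshold are all assumptions made up for the example. It shows that the model first learns a probability per group and only then turns it into a yes/no label.

```python
from collections import defaultdict

# Hypothetical toy records: (where the person lives, whether they bought an electric car)
history = [
    ("city", True), ("city", True), ("city", False), ("city", True),
    ("village", False), ("village", False), ("village", True), ("village", False),
]

def fit_purchase_rates(records):
    """Learn the share of buyers for each input group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [buyers, total]
    for location, bought in records:
        counts[location][0] += int(bought)
        counts[location][1] += 1
    return {group: buyers / total for group, (buyers, total) in counts.items()}

def predict(rates, location, threshold=0.5):
    """Return the estimated probability and the yes/no label derived from it."""
    probability = rates[location]
    return probability, probability >= threshold

rates = fit_purchase_rates(history)
village_probability, village_label = predict(rates, "village")
print(f"Estimated chance that a villager buys: {village_probability:.0%}")
```

The label depends on the chosen threshold, which is a business decision rather than a property of the data.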
There are different ways to do the classification, which have different levels of complexity. Some methods will provide a “logic” that is understandable (e.g. a weighting of factors or a logic tree) for why the classification came out the way that it did. Some others are a black box without a human-understandable rationale for the outcome. This applies most famously to artificial neural networks, the method underlying deep learning and much of what is popularly called AI.
There is a popular belief that many problems can be solved by throwing AI at them. However, to be able to use AI well requires a clear understanding of the process in which the AI will function, a large amount of data of high quality to train the model, and a good way of testing the model in the real operating environment. The AI will provide an outcome without a rationale, and it will reflect the biases and shortcomings of the training data. If the rationale is important (e.g. which factors were most important for the outcome) then AI is not a suitable tool.
Data models need specific questions
Since data models work by classifying data points into groups, a business question needs to be translated into a classification question. The question cannot remain formulated in a vague or open manner, such as “How do we grow sales?” A more suitable version of the question is “Which of our customers are more likely to buy an additional product from us?” and a data model could classify customers by comparing them to other customers who bought additional products. Then the sales team can approach those customers with offers and expect a higher level of success based on the prediction from the data model.
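As a rough sketch of how such a question becomes a classification task, the Python below scores prospects by looking at the most similar past customers (a simple nearest-neighbour approach, chosen here only for illustration). The customer features, the data and the function name are hypothetical.

```python
import math

# Hypothetical past customers:
# (orders_per_year, years_as_customer) -> bought an additional product?
past_customers = [
    ((12, 5), True),
    ((10, 4), True),
    ((9, 6), True),
    ((2, 1), False),
    ((3, 2), False),
    ((1, 3), False),
]

def likelihood_of_additional_purchase(customer, labelled, k=3):
    """Share of the k most similar past customers who bought an additional product."""
    nearest = sorted(labelled, key=lambda item: math.dist(item[0], customer))[:k]
    return sum(bought for _, bought in nearest) / k

# Hypothetical prospects the sales team is considering
prospects = {"A": (11, 5), "B": (2, 2), "C": (6, 3)}

scores = {
    name: likelihood_of_additional_purchase(features, past_customers)
    for name, features in prospects.items()
}

# Prospects ordered from most to least likely to buy
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In a real project the features would normally be rescaled so that no single one dominates the similarity measure, and the model would be tested on customers it has not seen before.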
Imprecise questions are a common problem. Thirty percent of data scientists cite the lack of clear questions to answer as a barrier to their effectiveness. Specify business questions precisely enough that a data model can be built to answer them.
Analytics can cover the past and the future
There are four types of data analytics: descriptive, diagnostic, predictive and prescriptive. The first two answer questions related to the past (what happened, why do we think it happened) and are the domain of business intelligence and data analysts. The last two answer questions related to the future (what could happen, how could we intervene) and are the domain of modellers.
When answering a business question, all four types can be used. For example, if the question is “How can we sell more electric cars?” then a descriptive analysis can show who has bought electric cars, of what types, at what prices, and so on. A diagnostic model can find the factors that are correlated with buying electric cars. This model can then be used to predict the buying behaviour of future customers, and a prescriptive model can recommend actions that increase the chance of a purchase, e.g. offering the car at the right price or with certain in-car options.
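To make the four types concrete, here is a deliberately small Python sketch that runs all four on the same made-up data. The `offers` records and the price bands are assumptions for illustration only; a real analysis would use far more data and more than one factor.

```python
# Hypothetical past offers: (offer price band, whether the customer bought)
offers = [
    ("low", True), ("low", True), ("low", False),
    ("high", False), ("high", True), ("high", False), ("high", False),
]

# Descriptive: what happened?
total_sales = sum(bought for _, bought in offers)

# Diagnostic: which factor is associated with buying?
def conversion_rate(band):
    """Share of offers in this price band that led to a purchase."""
    outcomes = [bought for price, bought in offers if price == band]
    return sum(outcomes) / len(outcomes)

rates = {band: conversion_rate(band) for band in ("low", "high")}

# Predictive: estimated chance that a future customer buys at the high price band
predicted_high = rates["high"]

# Prescriptive: which price band gives the best chance of a sale?
recommended_band = max(rates, key=rates.get)

print(total_sales, rates, recommended_band)
```

The prescriptive step here only maximises the chance of a sale. It ignores margin, and the underlying pattern is a correlation rather than proof that price caused the purchase.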
Managing a data science initiative
Business leaders already have extensive experience with managing business improvement initiatives. The same principles apply to managing data science initiatives. Leaders need to be hands-on to ensure that projects have clear business objectives, are well-scoped and managed, and are followed through to implementation.
Start with business objectives
Many data science projects, when run in a silo, fall into the trap of thinking only about collecting data or implementing a tool, e.g. “What can we do with our customer care data?” or “What can we do with machine learning?” While those are fine as exploratory questions, they’re not the best way to scope a good data science project. How can you measure success with such an open project definition? What’s the ROI of the project? Always start with a business objective such as “How can we increase customer conversion in our sales funnel?” If the project lacks clear business goals, it’s more likely to get cancelled than to deliver value.
Make a data strategy
When you start a data science initiative, you will undoubtedly want to create a data or analytics strategy. Just like with other functional strategies, it must clearly follow from and contribute to the company-wide or business unit strategy. Data scientists are not trained in business or strategy. Don’t hire a data scientist and ask them to tell you what to do with data. Bring together a team including the business leader, strategist, IT platform specialist, data privacy specialist and data scientist to make the plan together. Then check that the plan supports your business objectives and will provide an acceptable ROI.
Allow time and effort to get good data
The biggest barrier to success in a project is missing or poor quality data. Data scientists can spend 80% of their time on finding, cleaning and reorganising the data before the analysis can begin. Data are often inconsistent at the source, for example when there are no agreed definitions of what a data field means, data have been entered with errors, or data from disparate systems need to be reconciled. The data ownership can be unclear, or data can be hidden in a different silo within the company that others are not aware of. Be aware that it can take a lot of time to get the data collected, cleaned up, put in a consistent format, and in a single database ready to be analysed. The insights from this process may also lead to a larger initiative for data quality improvement at the source, and a change in way of working in the company.
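The kind of reconciliation work described above can be illustrated with a short Python sketch. The two source systems, the field names, the alias table and the date formats are all hypothetical. The point is how much handling is needed before two records can even be recognised as the same customer.

```python
from datetime import datetime

# Hypothetical records for the same customers held in two separate systems
crm_rows = [
    {"customer_id": " 001", "country": "NL", "signup": "2019-03-01"},
]
billing_rows = [
    {"customer_id": "1", "country": "Netherlands", "signup": "01/03/2019"},
    {"customer_id": "2", "country": "netherlands ", "signup": ""},
]

# Assumed agreed definitions: one code per country, two known date formats
COUNTRY_ALIASES = {"nl": "NL", "netherlands": "NL"}
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")

def parse_date(text):
    """Try each known format; return None when the value is missing or unreadable."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None

def clean(row):
    """Bring one record into the agreed, consistent format."""
    return {
        "customer_id": int(row["customer_id"].strip()),
        "country": COUNTRY_ALIASES.get(row["country"].strip().lower()),
        "signup": parse_date(row["signup"]),
    }

# Keep the first cleaned record seen for each customer
merged = {}
for row in crm_rows + billing_rows:
    cleaned = clean(row)
    merged.setdefault(cleaned["customer_id"], cleaned)

print(merged)
```

Even in this tiny example, one signup date is missing and cannot be recovered by cleaning alone, which is the sort of finding that can trigger a data quality initiative at the source.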
Manage the project to avoid disappointments
Data scientists join a company because they want to solve complex problems that have a huge impact on the business. They end up disappointed when reality does not match up: data are missing or of poor quality, the IT infrastructure is lacking, or the results of their work are not being used in decision making. Some may leave their jobs for other companies where they hope things will be better.
Business leaders can become disappointed when they don’t see tangible outcomes of the work being done. If the results of the analysis are not explained in language that they can understand, they are less likely to act on the recommendations. When they invest a lot in data science without seeing the return, they may cancel the project or disband the team.
A good project manager or team leader is essential to ensure elimination of bottlenecks, cross-functional collaboration, good communication, and translation of the results into business language. A business leader should act as the sponsor to ensure that the project is well-supported and connect it with other disciplines to avoid working in a silo.
Follow through to implementation
The data model is not the end point of the project. After the team delivers the model, it is important to bring the results into the business teams: how do new insights from data change the teams’ objectives, processes, decision-making or metrics? Who needs to do things differently and how does that create value for the company?
People with change management skills are needed to implement these changes into daily work. Make the data science team part of the change implementation process, so that they can see the impact of their work. This gives everyone the chance to learn and improve for the next project.
Data, analytics and AI are changing how companies compete, so it’s becoming increasingly difficult for companies to ignore them. However, most companies are still at the beginning of the learning curve in implementing data analytics well. The unfamiliarity and complexity of data science form a barrier to adoption, but it is one that business leaders would do well to overcome. By understanding how data science works, engaging with the data science teams, and taking a more cross-functional approach, executives can manage data science initiatives to get business results.
© 2019 Veridia Consulting