Big Analytics, Data Analytics with the world of driven automation

Data and analytics are transformational, yet many companies are capturing only a fraction of their value
Data and analytics have been changing the basis of competition in the years since our first report on big data in 2011. Leading companies are using their capabilities not only to improve their core operations but also to launch entirely new business models. The network effects of digital platforms are creating a winner-take-most dynamic in some markets. Yet while the volume of available data has grown exponentially in recent years, most companies are capturing only a fraction of the potential value in terms of revenue and profit gains.

Effective data and analytics transformations have several components:

Asking fundamental questions to shape the strategic vision: What will data and analytics be used for? How will the insights drive value? Which data sets are most useful for the insights needed?
Solving for the problems in the way data is generated, collected, and organized. Many incumbents struggle to switch from legacy data systems to a more nimble and flexible architecture that can get the most out of big data and analytics. They may also need to digitize their operations more fully in order to capture more data from their customer interactions, supply chains, equipment, and internal processes.
Acquiring the skills needed to derive insights from data; organizations may choose to add in-house capabilities or outsource to specialists.
Changing business processes to incorporate data insights into the actual workflow. This is a common stumbling block. It requires getting the right data insights into the hands of decision makers—and making sure that these executives and mid-level managers understand how to use data-driven insights.
Putting all these components in place is not easy. In a recent McKinsey survey of more than 500 executives representing companies across the spectrum of industries, regions, and sizes, more than 85% acknowledged that they were only somewhat effective at meeting goals they set for their data and analytics initiatives.

Data and analytics are disrupting business models and bringing performance benefits
Disruptive data-driven models and capabilities are reshaping some industries, and could transform many more. Certain characteristics of a given market open the door to disruption by those using new data-driven approaches, including:

inefficient matching of supply and demand
prevalence of underutilized assets
dependence on large amounts of demographic data when behavioral data is now available
human biases and errors in a data-rich environment 
In industries where most incumbents have become used to relying on a certain kind of standardized data to make decisions, bringing in fresh types of data sets (“orthogonal data”) to supplement those already in use can change the basis of competition. We see this playing out for example in property and casualty insurance, where new companies have entered the marketplace with telematics data that provides insight into driving behavior, beyond the demographic data that had previously been used for underwriting.

One of the most powerful uses is micro-segmentation based on behavioral characteristics of individuals. This is changing the fundamentals of competition in many sectors, including education, travel and leisure, media, retail, and advertising.

Digitization, more broadly, is also progressing unevenly among companies, sectors, and economies
The corporate world’s broader embrace of digitization is similarly uneven. Our use of the term digitization, and our measurement of it, encompasses:

Assets, including infrastructure, connected machines, data, and data platforms
Operations, including processes, payments and business models, and customer and supply chain interactions
The workforce, including worker use of digital tools, digitally skilled workers, and new digital jobs and roles

In measuring each of these aspects of digitization, we find relatively large disparities even among big companies (Exhibit 1).
Exhibit 1

Our research finds that companies with advanced digital capabilities across assets, operations, and workforces grow revenue and market shares faster than peers. They improve profit margins three times more rapidly than average and, more often than not, have been the fastest innovators and the disruptors in their sectors—and in some cases beyond them.

Many of these top performers were “born digital,” but perhaps more impressive are the smaller set of incumbent companies that have actively transformed themselves into digital leaders and benefit doubly from their traditional strengths and their new digital capabilities.

There are also disparities between sectors in terms of degree of digitization:

In the United States, the information and communications technology (ICT) sector, media, financial services, and professional services are surging ahead, while utilities, mining, and manufacturing, among others, are in the early stages of digitizing. In labor-intensive industries such as retail and health care, substantial parts of their large workforces do not use technology extensively.
This unevenness can also be observed across countries; all have significant room to increase their digitization:
The US economy as a whole is reaching only 18% of its digital potential;
France has achieved 12% of its digital potential, the European Union average, while Germany and Italy are at 10%;
Emerging economies are even further behind, with countries in the Middle East and Brazil capturing less than 10% of their digital potential.
Digitization is transforming globalization, creating opportunities now for companies and economies
The world is more connected than ever, but the nature of its connections has changed in a fundamental way. The amount of cross-border data flows has grown 45 times larger since just 2005. It is projected to increase by an additional nine times over the next five years as flows of information, searches, communication, video, transactions, and intracompany traffic continue to surge.

In addition to transmitting valuable streams of information and ideas in their own right, data flows enable the movement of goods, services, finance, and people. Virtually every type of cross-border transaction now has a digital component.

Approximately 12% of the global goods trade is conducted via international e-commerce, with much of it driven by platforms such as Alibaba, Amazon, eBay, Flipkart, and Rakuten. Beyond e-commerce, digital platforms for both traditional employment and freelance assignments are beginning to create a more global labor market. Some 50% of the world’s traded services are already digitized. These transformations enable small and medium-sized enterprises around the world to compete head to head with larger industry incumbents.

In my work as a data scientist, I have noticed that many tasks that used to be difficult keep getting easier because of automation.
For example, AutoML promises to automate the entire model-building process.
While that is amazing, the work of a data scientist is much more than just implementing a machine learning model.
As it turns out, the aspects of data science that sound the sexiest will be the first to be automated, and the ones that are the hardest to automate are the ones you would least expect.
What most people focus on when you talk about data science is AI and machine learning. But data scientists actually spend most of their time on very different kinds of work.

This article will attempt to list all the types of work that a good data scientist should be able to do. For each of them, I will investigate how well it can be automated. Where appropriate, I will list some tools that can help with automation.
If you think there is a (non-commercial) product that works well to automate something, or if you think I missed an important aspect of data science work, then just send me a message and I will include it in the list.

1. Talking to the client

Status of automation: Impossible
The first and arguably most important step in the work of a data scientist is talking to the client.
This does not just mean “ask the client what the problem is”. There is a vast gulf in understanding between a business person and a data scientist.
The client generally does not know what the data scientist can do, and the data scientist generally does not know what the client wants.
It is extremely important to be able to bridge this gap in understanding.

Many companies actually employ managers to act as intermediaries between data scientists and clients. This is better than letting a purely technical data scientist try to figure out the client’s needs. But a data scientist who understands the business context of the client is much better still, because it cuts out the middleman and reduces the risk that something important will get lost in translation.
Talking to the client will not improve the performance of the machine learning model (and like that, about half the technical readership of this article has lost interest). Instead, understanding the client’s pain points ensures that you work on building the right kind of model in the first place. The most accurate and best-performing model in the world is useless if the client can’t actually use it to drive his profits.

A one-hour discussion with the client can completely redefine the project, and increase the project’s monetary worth tenfold.
I will never forget the look on the face of one of my clients when I told him that I could easily extend my model to break down the data by a dozen different categories, analyze all of them, and then only report on the ones that had an anomaly. It turned out that the client had been performing this exact task manually for years, and it was costing him an enormous amount of time. He didn’t ask us to automate this, because he didn’t know it was possible at all until I brought it up. Without that discussion, we would have wasted months of work on less profitable tasks.
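To make the idea of "analyze every category, report only the anomalies" concrete, here is a minimal sketch. It is not the model from that project. The function name, the sample data, and the z-score rule with a threshold of 3 are all assumptions chosen for illustration, and a real project would pick a detection rule that fits the client's data.

```python
from statistics import mean, stdev


def anomalous_categories(series_by_category, z_threshold=3.0):
    """Return {category: z_score} for categories whose latest value is unusual.

    Hypothetical helper: the latest value of each series is compared with the
    mean and standard deviation of the values that came before it.
    """
    flagged = {}
    for category, values in series_by_category.items():
        history, latest = values[:-1], values[-1]
        if len(history) < 2:
            continue  # not enough history to estimate a spread
        spread = stdev(history)
        if spread == 0:
            continue  # simplification: skip series with a constant history
        z_score = (latest - mean(history)) / spread
        if abs(z_score) >= z_threshold:
            flagged[category] = z_score
    return flagged


# Made-up weekly figures for three categories; only "south" jumps at the end.
weekly_sales = {
    "north": [100, 102, 98, 101, 99, 100],
    "south": [80, 82, 79, 81, 80, 140],
    "online": [50, 51, 49, 50, 52, 51],
}

report = anomalous_categories(weekly_sales)
for category, z_score in report.items():
    print(f"{category}: latest value is {z_score:.1f} standard deviations from its history")
```

The point is the shape of the output: every category is checked, but the client only sees the ones that need attention.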

On a large scale, talking to clients ensures that you build the right kind of model and is critical to making sure the project ends up profitable.

On a smaller scale, talking to clients also has some immediate benefits that are no less important. One example is picking the most useful metric to train your model on. Many data scientists don’t think about this at all and simply go with accuracy, or L2 loss, or whatever they were taught to use in university. A five-minute discussion with the client might show that their profit actually stems only from the top-5 results, or something like that. If you don’t account for that by altering the metric you use, you optimize your model in the wrong direction.
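A small sketch of why the metric matters, using made-up scores for two hypothetical models. The function names, the data, and the choice of k = 3 (instead of the top-5 mentioned above, to keep the toy data short) are assumptions for illustration only. Model A looks better on plain accuracy, while model B is better when only the top-ranked results count.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of items classified correctly at a fixed score threshold."""
    predictions = [1 if score >= threshold else 0 for score in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def precision_at_k(labels, scores, k):
    """Fraction of the k highest-scored items that are truly positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k


# Ten items: the first five are relevant (1), the last five are not (0).
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

# Model A: almost everything lands on the right side of 0.5,
# but its single highest score goes to an irrelevant item.
scores_a = [0.6, 0.6, 0.6, 0.6, 0.6, 0.9, 0.1, 0.1, 0.1, 0.1]

# Model B: more mistakes around the threshold,
# but its three highest scores are all relevant items.
scores_b = [0.9, 0.8, 0.7, 0.4, 0.4, 0.6, 0.6, 0.1, 0.1, 0.1]

for name, scores in [("A", scores_a), ("B", scores_b)]:
    print(
        f"model {name}: accuracy={accuracy(labels, scores):.2f}, "
        f"precision@3={precision_at_k(labels, scores, 3):.2f}"
    )
```

If the client only earns money on the top few results, selecting by accuracy would pick model A here, which is the wrong choice for their business.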

Virtually none of this can be effectively automated.
Talking to people is AI-complete. If anyone can actually figure out how to automate this task, then the robot rebellion will be just a few days behind.
