Doing Analytics vs Scaling Analytics
Analytics, Data Science and Artificial Intelligence have made it into the mainstream vocabulary of every senior executive, and becoming “data-driven” is among the most commonly stated goals in the slide decks of most organizations. Yet organizations constantly struggle to succeed in these initiatives, and Gartner has predicted that nearly 80% of these projects will never reach deployment. If you ask yourself “Why?”, one of the most commonly cited reasons is “Culture” (HBR article cited below), which is true, but other factors play a role as well.
As a Data Science practitioner, I feel one reason these initiatives fail is that “Doing Analytics” and “Scaling Analytics” are different ball games. If you ask about the origin of the “Analytics team” in any big organization, you’ll hear a familiar story of 3–4 data scientists/data engineers working in a siloed team, trying to build a data lake, exploring use-cases for data science and developing Proofs-of-Concept (PoCs) to show the business the value of data. While this exploratory way of working is a great way to start, most teams falter when it comes to embedding these algorithms into the workflow of the organization. This is because Doing Analytics is largely an analytics problem that a Data Scientist can solve by building models/algorithms, whereas Scaling Analytics needs a lot more than a team of Data Scientists, as I’ll elaborate below.
AI/ML can indeed do remarkable things with a lot of data, but to scale analytics the first question is, ‘Do you have the “right” data?’
A common acronym you’ll hear among Data Scientists when you start talking about “right” data is GIGO (Garbage In, Garbage Out). Most organizations are sitting on top of a multitude of data coming from various sources, and they usually rave about how many bytes of data they collect.
However, petabytes of data do not translate to success in Data Science, and most companies only realize this after they start investigating how to scale analytics. The problem occurs when you try to combine data across sources, say sales data with marketing, or supply chain data with sales; that’s when the cat comes out of the bag. Although the IT systems for Sales, Marketing, etc. work like a charm in isolation, when you try to build pipelines to combine them, Data Quality/Integrity issues hamper your progress in scaling analytics.
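To make the point about combining sources concrete, here is a minimal Python sketch of the kind of check that tends to surface these issues before a pipeline is built. It is an illustration under assumptions, not a prescribed method: the function name join_quality_report, the customer_id key and the sample sales and marketing records are all hypothetical and invented for this example.

```python
from collections import Counter


def join_quality_report(left_rows, right_rows, key):
    """Summarize key-level problems that would surface when joining two sources."""

    def normalize(value):
        # Treat None and blank strings as missing; trim and lowercase the rest.
        if value is None:
            return None
        text = str(value).strip().lower()
        return text or None

    left_keys = [normalize(row.get(key)) for row in left_rows]
    right_keys = [normalize(row.get(key)) for row in right_rows]

    # Count how often each non-missing key appears in each source.
    left_present = Counter(k for k in left_keys if k is not None)
    right_present = Counter(k for k in right_keys if k is not None)

    return {
        "missing_key_left": left_keys.count(None),
        "missing_key_right": right_keys.count(None),
        "duplicate_keys_left": sorted(k for k, n in left_present.items() if n > 1),
        "duplicate_keys_right": sorted(k for k, n in right_present.items() if n > 1),
        "only_in_left": sorted(set(left_present) - set(right_present)),
        "only_in_right": sorted(set(right_present) - set(left_present)),
    }


# Hypothetical sample records, invented for illustration only.
sales = [
    {"customer_id": "C001", "revenue": 120.0},
    {"customer_id": "c002 ", "revenue": 80.0},  # inconsistent casing and spacing
    {"customer_id": None, "revenue": 45.0},     # key never captured
    {"customer_id": "C004", "revenue": 60.0},
]
marketing = [
    {"customer_id": "C001", "campaign": "spring"},
    {"customer_id": "C002", "campaign": "spring"},
    {"customer_id": "C002", "campaign": "summer"},  # same customer listed twice
    {"customer_id": "C003", "campaign": "summer"},
]

report = join_quality_report(sales, marketing, "customer_id")
for name, value in report.items():
    print(f"{name}: {value}")
```

Each system looks fine on its own here; the missing, duplicated and unmatched keys only show up once the two are put side by side, which is the situation described above.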
When it comes to problems with data quality, a frequently suggested solution is to have a “Data Governance” model in the organization. While having that is important, you have to ask yourself two more questions:
How “Digital” is your business and how much of your data collection process is automated?
If you take businesses like Uber, Facebook, Amazon and Airbnb, their entire business is digitized and in some way trackable. Scaling Analytics in companies like these is easier than doing it in more traditional businesses such as manufacturing, automotive and aerospace. The reason is that traditional businesses still have elements in the business process that aren’t completely digitized, or that rely on manual data input. Hence, when such firms want to jump on the bandwagon of using Data Science, the results are not great, since the data collection process isn’t standardized or streamlined yet.
Do you involve your Data Scientists from the very start of your project?
In most firms, Data Scientists are involved only when it comes to developing models/algorithms. The Data Scientist should ideally be involved from the business requirement/understanding phase, where you ideate the problem statement and start looking at the data needed to solve it.
If a Data Scientist cannot be involved, it is critical to involve someone equipped with analytics knowledge, so they can ensure the “right data” is available for building models. A lot of companies hesitate to have someone “technical” in business discussions and end up taking on projects that, analytically, have a higher chance of failure. McKinsey has identified this gap and written about the intermediary role of an “Analytics Translator”.
Assuming you have good quality data with which you can develop ground-breaking ML solutions, the next question of scaling analytics is, ‘Do you have the right infrastructure?’
Dabbling around with Python/R on powerful laptops and churning out predictions is great for Doing Analytics, but when it comes to scaling, Cloud is the way. While Cloud Computing has been around for over a decade and most organizations are embracing it, the conversation about Cloud is usually limited to the IT department. It is not seen as a strategic advantage and rarely finds its way into boardroom conversations. Gone are the days when IT was only an enabler for your business; in this post-pandemic, digital age, IT/Analytics/Tech is pretty much your business. As Goldman Sachs CEO Lloyd Blankfein said in early 2017, “We are a technology firm. We are a platform.” This platform mindset is critical for scaling analytics, as the right cloud infrastructure helps you achieve three things quickly, apart from other advantages:
i) Integration of data across the IT systems that enable your business, and embedding of analytics models into them
ii) Orchestration of various tasks/models using cloud automation tools to track your business workflow
iii) Model Management and Maintenance, which has recently given rise to the field of MLOps and helps you fine-tune your models over time
Hoping you tick the boxes on data and infrastructure, the next question to ask yourself for scaling analytics is, “Do you have an operating model tailored to use analytics across your value chain?”
Every organization has a decision framework that it relies on to run its business operations. While data is used in some part of decision-making in most firms, if you want to tap into the compounded benefit of scaling analytics across your value chain, the way you think and operate must change. Industry experts call this the “last-mile challenge”, which is defined as the final stage of analytics in which insights are translated into changes or outcomes that drive value. I’ve personally had experiences where we had amazing models built with great accuracy/metrics, yet the business was just not ready to use the model. This ties back to the “Culture” aspect we discussed earlier, and this is where you need a fundamental shift in 3Ps: People, Process and Purpose.
i) People - the analytics industry has grappled with this challenge for many years and the solution is straightforward: upskilling in data science and analytics. When I talk to people who want to use AI, they see models as something mystical that can solve anything, but forget that in actuality they are mathematical agents working in tandem. This shift in people’s mindset will not happen overnight, and hence organizations must prioritize data literacy and ensure everyone learns enough about it to use these models.
ii) Process - the way you function as a business needs to change, and this switch in mindset works better if it comes “top-down” in an organization. The leadership team must be aligned on having a data & cloud strategy across regions, markets and all functional areas. Another important piece of the process is the “Explainability” of your analytics models (a big trend these days), which needs to be integrated into the very fabric of your business process so people can trust these models better.
iii) Purpose - this touches on the philosophical side of why you should use analytics at all. The answer is “Efficiency and Optimization”. When you have a small firm, you may operate on expert opinion/gut-feeling and still stay competitive. But as your business volume grows, it becomes significantly harder to make the right decisions, and every incorrect decision creates waste. So, the ultimate purpose of deploying analytics across the company is to function more efficiently and to find areas to optimize even further.
In summary, if you want to succeed at scaling analytics in your organization, you will need these key ingredients:
1. Do not jump the gun on Data Science. Try to fix your data first across your IT landscape and then move on to the science part.
2. Embrace the cloud and actively use it across your business value chain.
3. Think first about where you will be using analytics models to drive your business and then make changes in people & processes accordingly.