The recent wave of new opportunities to transform enterprise productivity, innovation and success leveraging artificial intelligence (AI) is driven by the tremendous advances in our ability to collect, operate on and extract knowledge from extremely large volumes of data. AI, machine learning (ML) and other aspects that constitute the digital transformation have passed the tipping point and is becoming pervasive part of the society and is an absolute need for most enterprises. If not done with adequate care and knowledge this rapidly evolving opportunity currently comes with catastrophic implications for the enterprise that will be hard to course correct by throwing resources at.
In the first of this two part post on adoption of artificial intelligence in enterprises, we dealt with various aspects of data quality effects on digital transformation in the context of the popular adage garbage-in-garbage-out (GIGO). In this second installment (Part 2) the theme centers on the power and pitfalls in the use of metrics from and for AI and machine learning (ML) based solutions. I highlight common problems in use of metrics and possible solutions that span beyond technical aspects to include enterprise architecture, culture and education in leveraging these multi-disciplinary advances appropriate to different organizational roles.
Significant portion of what I state here is guided by experiences, observations and inferences often through my lens as the leader of the outcomes intelligence technology company ReSurfX, with a primary focus on the healthcare sector. The powerful prediction and analytics solutions offered by ReSurfX as an outcomes intelligence suite to improve enterprise innovation and productivity from data-intensive operations through our SaaS product ReSurfX::vysen provide incredible robustness of outcome and value to our customers. ReSurfX does this in significant part by leveraging a novel data-source agnostic machine learning approach Adaptive Hypersurface Technology (AHT) that we invented to overcome many challenges posed by GIGO and use of metrics that are the central themes in these two posts of this series and many others needs for effective utilization of large volumes of data. An extended version of this article ‘AI for Enterprises: Metrics for Matriculates? [Extended Version]’ is on the ReSurfx website referencing a powerful use case derived using AHT based ReSurfX solutions leveraging AHT and delivered using the enterprise SaaS product ReSurfX::vysen in the healthcare space. I take this opportunity to thank the ReSurfX team and variety of experts and leaders who contribute directly or from the sidelines shaping growth of ReSurfX and me.
AI and metrics: The silver bullet for enterprise transformation can reap enormous value that is also an extreme hazard that can fail the best
Most of us have heard the phrase in some form that ‘any decision that is not supported by data is a guess or opinion’, so in data-driven decisions using that phrase as a norm is a great goal to strive for. While the day we can completely support that phrase will be awesome, the phrase currently preaches an extreme that undermines the power of human brain in contextualization as well as using experiential and accumulated knowledge. At this time the insights generated from digital advances significantly trail the power of human brain, knowledge and experiential power (that cannot be replaced in the near future). However, the ability of digital advances to gather and crunch enormous amounts of information and infer complex insights, beyond the capabilities of the most humans can uniquely enhance brain power and experiential knowledge. The previous fact on the complex interactions on so many parameters that these advances can infer from makes it a double-edged sword – i.e., difficult to parse and achieve otherwise, but often difficult to validate or reliably select the right approaches to predict outcome so can lead to expensive failures and/or significant loss in effort and time.
Having highlighted reasons for and against absolute reliance on need for specific data to support a decision, let us look at how we tap the digital advances to our advantage rather than let them be become metaphorical a ‘red herring’ sending teams and enterprise chasing wrong leads.
Emerging experts in ML and AI or quantitative disciplines in general are trained to evaluate approaches using metrics, many of these metrics have been around for a long time.
The key tool (if you allow me to trivialize that need) by which we can communicate the power and reliability of what we get from data-driven insights is through metrics. Metrics in this context could be from and for AI and ML based solutions that you develop, provide or ingest as business and technical leaders and/or experts for effectively achieving enterprise goals and expansion of digital transformation. We highlight opportunities and challenges spanning beyond technical aspects to include enterprise architecture, culture, and education on this multi-disciplinary advance appropriate to different roles.
We have always relied on metrics to help us have standards to measure our progress or lack thereof. In the same vein they can help us leverage the advances in data-driven ML and AI solutions to catapult us forward, however in this case they can be equally misleading and perilous. Some of the reasons for that are: the multidisciplinary nature, continuing rapid evolution and the need for an educative phase tailored for each role to maximize business value, the fact that we emphasized multiple times in the previous post.
The approach we take at ReSurfX to overcome these problems are:
- Reducing or understanding excess reliance on classical quantitative metrics as a measure of functional value.
- Incorporating the concepts on which classical metrics are built on to build our own metrics that are more appropriate for functional end-use.
- Building processes and value differentiation approaches driven by need to solve the problems we observed to be prevalent and failure of classical approaches to solve them.
- As we highlighted in previous post of this series and many other prior opportunities including the articled published in conjunction with being recognized by the CIO Review Magazine as 20 Most Promising Analytics Solution Providers in 2017. We posited that: “Dramatic improvements in accuracy and novel insights can only happen through innovation outside the mainstream framework – given that error properties are often non-uniform in big data, and most analytic shortcomings result from model assumptions not robust enough to handle that.” [CIO Review, 2017].
The novel machine learning approach Adaptive Hypersurface Technology ReSurfX leverages in solutions and value we provide to our stakeholders was invented by us, with motivations that include those mentioned above.
The success of digital transformation critically depends on the invisible innovation killer layer of enterprise structure
The cost of reaping the advantages from data, ML and AI that is evolving rapidly to become trivial currently seems far away as is the ability to pick approaches that perform uniformly well with reliability.
Naturally, not everyone would, should or could want to or need to become an expert in ML or AI – but the need to be reasonably informed in aspects of this advance that matter to their role is becoming a requirement.
Given these, most enterprises by construct, size, architecture or culture often ends up with what I describe below as an invisible ‘Innovation Killer Layer’. Guarding against this layer (which is no easy feat) by itself can significantly insulate an enterprise against these pitfalls and play a key role in utilizing the potential of digital advances. Many transitive and pseudo-AI product and service companies (that we alluded to in the previous post) use flaws in this layer typical of most target enterprises constructs to design their offerings, as well as to design their sales strategy – i.e., articulate validation, extensibility and value of their offering.
I try to give an outline of the Innovation Killer Layer which is a complex enterprise construct using a far simplified structure. All enterprises are supported by a variety of the allied divisions besides those that are formed with expertise that represent their mainstream focus. The former are typically divisions based on well-established expertise/disciplines (IT layer is an excellent example of former). Now management decides to double down and rapidly build up capabilities for digital transformation, establish AI/ML division as an essential part of that operation not wanting to be late or leave the opportunities behind. Often this division is setup with lot more resources and power. Needless to say, funding to a division and power to the members is considered proportional to priorities of the business leadership team. Unlike other allied sectors this division is anchored on a relatively new rapidly evolving advance with mixed bag of success. If we assume AI/ML is not part of the primary focus sector of the enterprise, this division has to understand the needs and integrate with core function divisions among others. The leaders of this new division have to prove themselves while adapting to rapid evolution to neither under-equip nor build up capabilities that will become obsolete at the next stage of evolution that is not far away. Very few people are highly experienced in leading this class of division and also have sufficient sector specific knowledge or prior success relevant to most sectors that could be reliably be translated into the success and integration. Their team members, however talented, often use approach of matriculates – adhering to established principles – we use metrics to highlight the effect of that approach in this post.
Though all divisions are equally important from operation and efficiency perspectives of the enterprise, sector specific enterprises divisions based on core knowledge of the sector are the back bone of the business. These sector subject matter experts (SMEs) and the core divisions of the enterprise like any other members or divisions are also pressed for results and need to contribute rapidly to the ROI. However, unlike the newly formed AI/ML division, they do not get the slack period.
The new AI/ML division with high matriculate like approach in order to develop and evaluate solutions now reaches out to a core needs division for ground truth. In reality such ‘ground truth’ do not exist in easy to ingest form or are quite nuanced. The leaders or team members of that division having their own targets to accomplish provide data from the past in their specialty for use as ‘ground truth’, stressing that many caveats come with that. The team members of the data science and ML bucket take the ‘ground truth’ (sic) and develop solutions and evaluate their model or algorithmic performance based on that. The real value of those metrics of model or algorithmic performance including due to above reasons significantly differs from those of true functional and business need. Results based on these metrics get insufficiently discussed, as few team members want to create a ruckus, which it will become if they elaborate in the time bound meetings of fast-paced enterprises, and those results in turn are now evaluated as metric for success in gearing for the digital transformation effectively. The whole enterprise often becomes embedded with this additional layer of compromise and the cycle continues. The impact of this will take some time to become evident, by which time most enterprises have paid the price in many ways including of culture, and often the plausible solution might be limited to get to another compromise solution.
Another manifestation of this construct that I am aware of is the mid-level leadership and members in the new interdisciplinary divisions of the enterprise having to establish their need, contribution and value explicitly saying or implicitly practicing ‘what you do/offer is competitive to what I do so I won’t collaborate with you’ to resources with overlapping expertise. In reality the person, entity, division or vendor on the receiving side of this behavioral response strengthens both this new division and the enterprise. The ones on the receiving end could never openly express their perception of insufficient value or contribution of the new division.
After all, most enterprises are not fond of teams or team members who would ‘rock the boat’ by complaining or ‘seemingly questioning’ an initiative highly coveted by the top management. We have recently seen this to be true by even in enterprise like Google, that tend to hire incredibly talented people with an expected trait of hires to be willing to speak openly to management of problems, getting into problems multiple times. The latest is when their CEO had to ‘kind of’ apologize by communicating ‘I accept the responsibility of working to restore your trust’ to his staff after the recent incident with their ML team member.
So the mid-level leaders that the enterprise depend on, who in reality are also the innovators with immense functional knowledge and know how to apply that knowledge towards the goals of the enterprise, form the Innovation Killer Layer in this scenario due to inadequate education, by necessity of governance, enterprise construct or circumstance.
Both examples above are results of insufficient thought into enterprise architecture, messaging about the new division, and time it takes to educate the new division and existing divisions (and probably the rest of the enterprise, starting with leadership) to collaborate on such multi-disciplinary and evolving opportunity. It also ties back to the fact that huge amounts of resources invested in that division is usually perceived as a signal of high priority to the management.
I am aware of at least one other extremely different approach a subset of enterprises take where the end effect is equally inefficient. However, discussing other scenarios or the solution to this problem of enterprise architecture is beyond the scope of this post.
In contrast to title of this section, enterprises should avoid the Innovation Killer Layer in order to be successful in setting up or doubling down on expansion of the AI efforts part of digital transformation. While many aspects lead to this effect, improper use of classical metrics and the mindset that are focus of this post serves as a powerful example. Further, this effect and need to avoid it are not restricted to digital transformation, but to most new multi-disciplinary initiatives that are significantly outside the core discipline that drives the enterprise offering(s). This is also a really hard enterprise structure design and maintenance challenges. Even if an enterprise is not resource limited, the cultural effects of getting this wrong cannot be easily corrected post hoc.
How can metrics that by definition are the core of oracle for performance and value become a cause for pitfall?
This topic and need again involves a complex mélange of people with different expertise in technical, functional, business and leadership treading across the boundary of disciplines and sensibilities. Here I am taking the approach of enumerating some of those factors followed by use case example that should help take these points in perspective.
- Multi-disciplinary nature of digital transformation (as we repeatedly highlighted in different context) and insufficient preparation of teams across organizational hierarchy and functional roles.
- No absolute ‘ground truth’ (or an oracle we can rely on) for most complex real-life problems
- Not using the right metrics appropriate to an enterprise need – often due to lack of knowledge spanning more than one discipline required for that enterprise need.
- Lack of classical metrics designed for such complexity.
This particular aspect became starkly obvious from fields that entered the digital wave later (e.g., life sciences and healthcare) – where one can see many thousands (if not tens of thousands) articles on approaches to reduce errors and adopting approaches and metrics that do not translate to true value they indicate and often are not robust enough across datasets even from the same platform they are developed for. I believe these aspects are not unique for those fields but became obvious as because they became digitized later and the field has a culture that encourages (often mandates) publications at far higher rate than most other disciplines. To this day this problem is not solved and the culture inherently prevents acceptance of new metrics. One extensive analysis where we documented this property extensively even in a simple case is in a whitepaper in ReSurfX website: ‘Many Roads to Reproducibly Wrong Results – Distinguishing the Truth Using ReSurfX::vysen and Other Analytic Workflows’.
- Problems with data violating the basis of those metrics. Examples include effect of bias embedded in data or non-uniform properties in large data volumes this advance (AI/ML) is built on.
- Definition of outcomes. Often the improvement in AI/ML is defined by metrics that are more theoretical or on assumptions of ground truth for measurement of outcome – so there is a mismatch between what the metric reports versus the practical business goals.
- Generation approach of results also leads to incorrect metrics that misrepresent or not rigorous enough and more importantly not robust enough. This aspect needs particular attention, as this is part of the trick used by whom I referred to as transitive vendors earlier.
- We referred to cases where significant assumptions and solutions with errors can still lead to great business outcomes in the “glass less than half full” section of the previous post as has been the case with great innovations that we rely on in our daily lives. So using approaches with reliance of practical value in validation than matriculate approach as we have been stressing in this post is more useful. This takes effort on the part of the end users and their leaders – there is difference in scope between the two sets of people even with the same enterprise goal.
- Related to the previous point, the value of improvement from new correct predictions/insights should far exceed the error rate.
- Metrics not sufficiently evaluated in practical business context – e.g., bigger decisions (as in enterprise value vs cumulative effect for errors with lower cost. Conversely (to latter part) – small, but reliable and robust improvements repeatedly applied process can lead to large savings in many different ways.
A key take-away from enormous amount of work done at ReSurfX evaluating information in the world is that in the context of practical use: well-designed validations highlighting reliability, robustness and extensibility for real-world applications that the enterprise is engaged in are far more valuable than relying on most classical metrics. Before we get to an example derived using novel advances in technology, functional solutions and products to deliver those solutions built by ReSurfX, I summarize the breadth we covered aimed at the various levels of enterprise hierarchy and knowledge differences of this multidisciplinary advance.
Concluding thoughts on successfully leveraging Data, AI and ML as part of digital transformation and a powerful example of result involving themes discussed in this post
In conclusion, between the two posts we covered several aspects affecting enterprise adoption of digital transformation. We particularly focused on the recent wave and rapidly evolving theme of AI solutions that primarily resulted from advances in our ability to collect and organize very large volumes of data and utilize them to collect insights and build business value that are becoming pervasive in our everyday life. We focused equally on the potential and immense risk that comes with the current state of these solutions in the context of effective adoption and deployment decisions while staying protected from disastrous consequences. Reminding oneself of the state of AI is as simple as recollecting our experience in engaging with voice or chat bots that are the first line of responders when you contact many vendors with some non-trivial need; or recalling how your pictures are identified or sorted by the apps from top enterprises. The two uses in the previous statement are built on the most advanced aspects of this AI wave – image analysis, voice and natural language processing.
I want to share with you one mindset that helps in effective use of Big Data initiatives and these advances centered on them using two statements that we often use at ReSurfX:
- An error rate of 1/1000 of data or decisions introduces a terabyte of errors with a petabyte of data.
- In large and complex enterprise operations the effect of errors are cumulative in the workflow and errors from early operations often cost you more downstream and could even balloon into a catastrophic event.
The farther along development of a product or offering that needs a long development process or insight generation phase, the larger are the impact of such errors. Not to say that it could be of any less consequence when unreliable (or improper metric of reliability) insights are used in bread and butter steps of each phase or division resulting in wasted resources and time that gets cumulative.
At ReSurfX we build solutions for both classes of needs mentioned in the previous paragraph with practicality of real-life needs often generating our own measure of confidence than simply using classical metrics. When building functionalities or broadly applicable core modules leveraging differentiated approaches such as the novel Adaptive Hypersurface Technology we invented, our customer value equation integrates these two components (GIGO, metrics) as well as aspects I covered in different contexts as key needs for reliable, accurate and robust AI functionalities. We deliver these as outcomes intelligence solutions through the SaaS platform product ReSurfX::vysen. Recently we have developed an application configuration ‘SyRTOP’ (Systems Response Based Triggers and Outcomes Predictor) by combining different stages of assembling and processing these datasets with a combination of AHT based analytics and prediction solutions we developed to get to outcomes and end-points of interest.
A powerful and exemplary result achieved by paying attention to the principles discussed in these posts and approaches developed by ReSurfX implemented in the SaaS product ReSurfX:vysen and SyRTOP in the last section of the extended version of this post ‘AI for Enterprises. Part 2: Metrics for Matriculates? [Extended version]‘. The ReSurfX website vysdom section, also has many find many relevant and useful posts and whitepapers.
Observations, inferences and thoughts shared here may not be those of ReSurfX and can sometimes be context dependent.
This post was written by Suresh Gopalan, Ph.D; CEO & Cofounder; ReSurfX.