Overview

We turn to data analytics when we need insight into the metrics that will best deliver our end goal, whether that is for marketing purposes or to model a commercial engineering system that runs our organization. There are a number of ways to go about solving our data problems, and they all start with setting up a model. An analytical approach is a more static method that builds an exact solution, as opposed to a numerical approach, where we are essentially guessing and checking until we have met our accuracy requirements.

In order to get a model to predict something for us, the model must first help us discover and take advantage of the patterns found in our historical data, transactional data, and organizational processes so that we can identify risks and opportunities. We can then choose either an analytical approach (linear regression solved in closed form, for example) or a numerical approach (gradient descent, for example) to predict an outcome.
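To make the analytical versus numerical distinction concrete, here is a minimal sketch that fits the same straight line two ways: once with the closed-form least-squares formulas, and once by gradient descent, which repeatedly nudges its guess until the updates become negligible. The data points, function names, learning rate and tolerance are all hypothetical values chosen for illustration, not taken from any real project.

```python
# Toy data (hypothetical): y is roughly 2*x + 1 with small deviations.
XS = [0.0, 1.0, 2.0, 3.0, 4.0]
YS = [1.1, 2.9, 5.2, 6.8, 9.1]


def fit_analytical(xs, ys):
    """Closed-form least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_gradient_descent(xs, ys, learning_rate=0.05, tolerance=1e-10,
                         max_steps=100_000):
    """Guess and check: step downhill on mean squared error until the
    gradient is smaller than the tolerance (our accuracy requirement)."""
    n = len(xs)
    slope, intercept = 0.0, 0.0  # arbitrary starting guess
    for _ in range(max_steps):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
        if abs(grad_slope) < tolerance and abs(grad_intercept) < tolerance:
            break
    return slope, intercept


if __name__ == "__main__":
    print("analytical:      ", fit_analytical(XS, YS))
    print("gradient descent:", fit_gradient_descent(XS, YS))
```

On a problem this small both routes should land on essentially the same line. The trade-off only starts to matter when no closed-form solution exists or when the data is too large to solve in one shot.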

But won’t AI/ML always be better in the end?

No, and in fact it might be more harmful than helpful. If you feed a model low-quality data, you may end up with results that tell you nothing about what you are actually trying to find. And you may not even have the infrastructure ready to build an AI solution.

But say we have good data and the infrastructure. When do we want ML versus a more static predictive approach?

Christi Luciani

Predictive models are typically very structured, built on lots of hard rules, and almost always rely on historical data. Problems better suited to ML, by contrast, may or may not come with much historical data, and we may not know many of the factors, from historical data or otherwise, needed to come up with a solution.

Tl;dr

The benefits of ML are more pronounced when the problem being solved is vague, complex or chaotic.

 

 

An Example: COVID-19

I like the coronavirus as an example. You may have seen some random LinkedIn “machine learning expert’s” TensorFlow model of COVID-19; these are often unnecessarily overcomplicated, and some have been far off from what actually happened. At the same time, a predictive model adapted from SARS and fed up-to-date information has been spot on.

The key differentiator here is that if we didn’t have data from SARS (a similar scenario), the ML models could have outperformed the predictive ones.

 

When exactly is AI/ML worth the investment?

If you do find a good use case for ML and the results are evidently more accurate, a cost-benefit analysis can reveal the hidden costs of the hurdles that remain and help determine where the feasibility threshold lies.

To understand the experiences a company might face when working with these technologies, Dimensional Research surveyed a total of 227 data scientists, AI experts and business stakeholders from 5 continents.

According to the study [1], underlying training data issues constrain a project’s success. Here are some of the key findings:

 

  • 72% report that production-level model confidence will require more than 100,000 labeled data items
  • 78% of AI/ML projects stall at some stage before deployment:
      • 96% of enterprises encounter data quality and labeling challenges
      • 63% have tried to build their own technology solutions
  • 71% of teams ultimately outsource some portions of, or all, ML project activities:
      • Teams that outsource data labeling get projects into production faster
      • 82% of participants have been satisfied with the ML results

 

Conclusion

If you don’t have the time or human resources to budget for overcoming the challenges of a young AI industry, you likely will not reap the benefits of ML. Most organizations would produce more accurate results with a predictive algorithm for less money and time.

If you are still considering taking on this endeavor despite the costs, and you do successfully come out of development with a working model, you will likely find that your investment was well spent.

Outsource your data science work to a team of specialists who have overcome these challenges before in order to get higher-quality models into production more quickly.

 

Citations

[1] – Dimensional Research. (2019). Artificial Intelligence and Machine Learning Projects Are Obstructed by Data Issues: Global Survey of Data Scientists, AI Experts and Stakeholders. https://cdn2.hubspot.net/hubfs/3971219/Survey%20Assets%201905/Dimensional%20Research%20Machine%20Learning%20PPT%20Report%20FINAL.pdf