Accuracy and Trust in Machine Learning

A few weeks ago, I wrote about machine learning risks where I described four ‘buckets’ of risk that needed to be understood and mitigated when you have machine learning initiatives. One major risk that I *should* have mentioned explicitly is the risk of accuracy and trust in machine learning. While I tend to throw this risk into the “Lack of model variability” bucket, it probably deserves to be in a bucket all its own or, at the very least, it needs to be discussed.

Accuracy in any type of modeling process is a very nebulous term. You can only build a model to be as accurate as the training data that the model sees. I can over-optimize a model and generate an MAE (Mean Absolute Error) that is outstanding for the model/data. I can then use that outstanding MAE to communicate the impressive accuracy of my model. Based on my impressively accurate model, my company then changes processes to make this model the cornerstone of their strategy…and then everyone realizes the model is almost worthless when ‘real-world’ data is used.

New (and experienced) data scientists need to truly understand what it means to have an accurate model If you go out there and surf around the web you’ll see a lot of people that are new to the machine learning / deep learning world who have taken a few courses and thrown up a few projects on their github repository and call themselves a ‘data scientist’. Nothing wrong with that – everyone has to start somewhere – but the people that tend to do well as data scientists understand the theory, process and mathematics of modeling just as much as (or more than) the ability to code up a few machine learning models.

Modeling (which is really what you are doing with machine learning / deep learning) is much more difficult than many people realize. Sometimes, building a model that delivers 55% accuracy can deliver much more value to an person/organization that one that has been over-optimized to deliver 90% accuracy.

As an example, look at the world of investing. There are very famous traders and investors who have models that are ‘accurate’ less than half the time yet they make millions (and billions) off of those models (namely because risk management is a large part of their approach to the markets). This may not be a good analogy to use for a manufacturing company trying to use machine learning to forecast demand over the next quarter but the process these investors take in building their models are absolutely the same as those steps needed to build accurate and trustworthy models.

If you’ve built models in the past, do you absolutely trust that they will perform in the future as well as they’ve performed when trained using your historical data?

Accuracy and trust in machine learning should go hand in hand. If you tell me your model has ‘good’ MAE (or RMSE or MAPE or whatever measure you use), then I need you to also tell me why you chose that measure and what variances you’ve seen in errors. Additionally, I’d want you to tell me how you built that model. How big was your training dataset? Did you do any type of walk-forward testing (in the case of time series modeling)? What have you done about bias in your data?

The real issue in the accuracy and trust debate isn’t with the technical skills of the data scientist to be honest. A good data scientist will know this stuff inside and out from a technical standpoint. The real issue is in the communication ability of the data scientist and the people she is talking to. An MAE Of 3.5 might be good or it might be bad and the non-technical / non-data scientists would have no clue in how to interpret that value. The data scientist will need to be vary specific about what that value means from an accuracy standpoint and what that might mean when this model is put into production.

Accuracy and trust in machine learning / modeling has been – without question – the biggest challenge that i’ve run across in my career. I can find really good data scientists and coders to build really cool machine learning models. I can find a lot of data to throw at those models. But what I’ve found hardest is helping non-data folks understand the outputs and what those outputs mean(which touches on the Output Interpretation risk I mentioned when I wrote about machine leaning risks ).

I’ve found a good portion of my time spent while working with companies on modeling / machine learning is spent on analyzing model outputs and helping the business owners understand the accuracy / trust issues.

How do you (or your company) deal with the accuracy vs trust issue in machine learning / modeling?

Accuracy and Trust in Machine Learning

Get weekly insights on technology leadership