Explainable AI and Black Box AI: What’s Best for Your Data?
Machine learning gives us an opportunity to untangle massive, complex data sets by identifying dependencies between different parameters. In this way, we receive valuable insights based on the decisions made by a machine.
In the past, the decision-making process was crystal clear because ML models were quite simple. But over time, the complexity of machine learning models has grown dramatically, and we now face an issue called "black-box AI," or unexplainable AI. Companies that wish to use ML models for predictions therefore face a dilemma: should they choose simplicity with lower accuracy, or high accuracy that cannot be explained? This article aims to clarify how black-box AI can be explained and why explainable AI models don't work for everyone.
Glass-box AI: simple and transparent
It all started with relatively simple models, such as linear and logistic regression and decision trees. These models have a relatively small number of parameters, so their results are easy to explain: we receive prediction Y because it follows from a simple computation over coefficients A, B, and C and the independent variable X.
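As a minimal sketch of that idea, the snippet below fits an ordinary least-squares linear model with NumPy on synthetic data generated from a known rule. The fitted coefficients play the role of parameters A, B, and C: every prediction is just a weighted sum you can read off directly, which is exactly what makes the model a glass box. The data and the generating rule here are invented for illustration.

```python
# Glass-box sketch: a linear model whose parameters are fully readable.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                              # features A, B, C
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + 3.0   # known rule, no noise

# Fit y ≈ X @ coef + intercept via least squares (normal equations).
X1 = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)

print("coefficients:", coef[:3].round(2))  # ≈ [ 2.  -1.   0.5]
print("intercept:", coef[3].round(2))      # ≈ 3.0
```

Because the data are noise-free, the fit recovers the generating rule exactly, and anyone can verify each prediction by hand: multiply, add, done.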
Because of its simplicity and 100% transparency, this kind of explainable AI model earned the name glass-box AI. Think of it as looking through glass and seeing exactly how the parameters fit together. However, this simplicity also brings particular challenges.
The main challenges of explainable AI
Because a glass-box model uses a limited number of parameters, its first challenge is that it cannot capture complex dependencies involving many parameters. Hence, it is also not the best choice for large enterprises that need to process massive datasets, since the model simply lacks the required capacity.
The second challenge is accuracy. Unfortunately, the simplicity of explainable AI models comes at a cost. Due to the nature of the underlying modeling techniques, a glass-box model cannot deliver highly accurate results, so you cannot rely on it for important business decisions. This inconvenience leads us to the second option: black-box AI.
Black-box AI: the mystery under the cover
As the number of complex interdependencies in data grew, so did the computing power that models required. The glass box was no longer enough, and black-box models appeared.
As the name suggests, a black-box model is one that is not transparent and does not explain how it arrived at a result. In fairness, black-box models can have billions of parameters, so an average human being would understandably have a hard time tracing how all these parameters interact and form relationships.
Because of the sheer number of parameters and their architectural peculiarities, black-box models are highly accurate. These models are designed with accuracy in mind, but they turn out to be practically uninterpretable.
The big problem of black-box AI
Since black-box models process a great number of parameters in record time, even a skilled data scientist cannot really explain where a certain decision came from. While that may not sound too bad at first, consider the following.
Suppose a financial organization (a lending company, for example) assesses a person's creditworthiness and denies the request. The applicant or a responsible manager may question why the company made that decision; the applicant might even take the issue to court, and the finance company will not be able to explain its own decision. Or, if the company makes a particular decision and later loses money because of it, there is technically no one to blame but the machine.
Moreover, the inability to explain the black box may lead to compliance issues. Many black-box models are difficult to reconcile with EU data protection rules, such as the GDPR's transparency requirements, and therefore have to be redesigned. So how can you tackle the problem without spending too much money and time? Luckily, there is a good option.
Explaining the inexplicable: from the black box back to the glass box
The solution to the black-box problem is elegant and straightforward: we build a glass-box model on the same input data that we feed to the black box, using the black box's predictions as its target. The glass box will, of course, use fewer parameters, but the principle remains the same: we obtain its results and compare them with the black box's. This explainable AI model is known as a surrogate model. Its main goal is to explain why the black-box results are trustworthy (or not) and how the model arrived at them.
Some may ask: why not build a glass-box model from the start? As mentioned earlier, a glass box cannot process massive volumes of data, and some businesses do not need to double-check the results delivered by the black box. The surrogate model's job is simply to explain how the black box works.
A case from SoftTeco: management of traffic rates with the help of ML predictions
A client approached SoftTeco with a request to design an ML-powered solution that would analyze data and provide forecasts and insights for traffic rate management. Previously, this was the responsibility of data analysts, but the client wanted to automate and speed up the process.
However, our team faced the same challenge discussed above: the client needed an explanation of the obtained results in order to compare them with the results delivered by the analysts. Therefore, we built an explainable AI model (a surrogate model) to show that our black-box model was accurate and met the client's requirements.
It is challenging to make critical business decisions related to price-setting without a transparent, explainable justification. At the same time, it is very hard to obtain highly accurate results from an explainable AI model alone. This is the controversy around using machine learning to draw predictions.
Hence, when you plan to implement a black-box AI model in your business, we highly recommend supporting its results with a surrogate model. In this way, you will safeguard yourself from possible legal consequences and will be able to explain your business decisions at any time, upon request.