Thiago Bianchi, Principal Data Scientist at Itau Unibanco, joined the Worldwide AI Webinar to deliver a keynote on The Trustworthiness of ML Models in Banking. Here are a few highlights from his talk.
The path to data-driven ML models
Thiago Bianchi opened his presentation with the Data-Business Index quadrant, which helps enterprises position themselves and their data. The quadrant has four parts, with Business Index on the x-axis and Data Index on the y-axis.
Companies that are starting out with building businesses and obtaining data simultaneously are data-informed. Those that have already obtained a large amount of data but haven’t developed their business are data-centric.
Enterprises that are advanced business-wise but only have basic studies, dashboards, and insights based on data are considered data-savvy. To reach the data-driven quadrant, these companies must adopt ML models.
According to Bousdekis et al. (2021), the data-driven concept is related to automating the decision process. Machine learning models are the tools that provide enterprises with such automation. Whether we can trust ML models, however, is another matter.
ML model evaluation methods
To assess a model's bias, Thiago suggested a few methods:
- Comparing a linear function against a polynomial function and seeing which one generalizes better to new data. In other words, you want to assess whether the model is overfitting or underfitting. In statistics, this is known as the bias-variance dilemma.
- Using a confusion matrix, which compares the algorithm's predictions with the expected values
- Quantifying uncertainty (or certainty) using conformity scores
- Other well-known metrics
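The confusion-matrix method above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the labels and predictions are made-up toy data, and the helper names `confusion_matrix` and `summary_metrics` are hypothetical, not something shown in the talk.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return {key: counts[key] for key in ("tp", "fp", "fn", "tn")}


def summary_metrics(cm):
    """Derive accuracy, precision, recall, and F1 from the four counts."""
    tp, fp, fn, tn = cm["tp"], cm["fp"], cm["fn"], cm["tn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data (assumed for illustration): expected values vs. model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)
metrics = summary_metrics(cm)
print(cm)
print(metrics)
```

The four counts make it visible which kind of error the model makes, something a single accuracy figure does not reveal.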
Thiago also shared that at Itau Unibanco, every model in the credit recovery area is delivered with:
- The traditional metrics on all new data: average precision, average recall, accuracy, F1 score, and others
- The conformity scores for different levels of significance
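To make "conformity scores for different levels of significance" more concrete, here is a minimal sketch of one common approach, split (inductive) conformal prediction for classification. The talk summary does not say which variant Itau Unibanco uses, so the choice of method, the nonconformity score (one minus the predicted probability of a label), the class names, and all numbers below are assumptions made for illustration only.

```python
def nonconformity(probs, label):
    """Nonconformity score: 1 minus the probability the model gives `label`."""
    return 1.0 - probs[label]


def p_value(calibration_scores, candidate_score):
    """Share of calibration scores at least as nonconforming as the candidate."""
    at_least = sum(1 for s in calibration_scores if s >= candidate_score)
    return (at_least + 1) / (len(calibration_scores) + 1)


def prediction_set(calibration_scores, probs, significance):
    """Keep every label whose p-value exceeds the significance level."""
    return {
        label
        for label in probs
        if p_value(calibration_scores, nonconformity(probs, label)) > significance
    }


# Hypothetical held-out calibration data: (predicted probabilities, true label).
calibration = [
    ({"recovered": 0.95, "not_recovered": 0.05}, "recovered"),
    ({"recovered": 0.10, "not_recovered": 0.90}, "not_recovered"),
    ({"recovered": 0.85, "not_recovered": 0.15}, "recovered"),
    ({"recovered": 0.20, "not_recovered": 0.80}, "not_recovered"),
    ({"recovered": 0.70, "not_recovered": 0.30}, "recovered"),
    ({"recovered": 0.40, "not_recovered": 0.60}, "not_recovered"),
    ({"recovered": 0.55, "not_recovered": 0.45}, "recovered"),
    ({"recovered": 0.60, "not_recovered": 0.40}, "not_recovered"),
    ({"recovered": 0.30, "not_recovered": 0.70}, "recovered"),
]
calibration_scores = [nonconformity(probs, label) for probs, label in calibration]

# Hypothetical new case scored by the same model.
new_probs = {"recovered": 0.75, "not_recovered": 0.25}

# One prediction set per significance level.
sets = {eps: prediction_set(calibration_scores, new_probs, eps)
        for eps in (0.05, 0.2)}
for eps, labels in sets.items():
    print(f"significance={eps}: {sorted(labels)}")
```

With this toy data, the stricter level (0.05) keeps both labels, signalling that the model cannot rule either one out, while the looser level (0.2) narrows the set to a single label. The size of the set at each level is what conveys how much the individual prediction can be trusted.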
To sum up, here are a few takeaways to chew on:
- Traditional measurements on their own can hide how trustworthy an ML model really is
- Conformity scores work for all supervised learning scenarios and should be used alongside the traditional measurements