EURO 2020 - Hybrid Machine Learning

A Hybrid Machine Learning Approach for the Modeling and Prediction of the UEFA EURO2020

Conventional approaches to analyzing and predicting the results of international football matches are mostly based on the framework of generalized linear models (GLMs). The regression model used most frequently in the literature is the Poisson model. The predictive performance of such models has been shown to improve when they are combined with regularization methods such as penalization.
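
As a rough illustration of what a penalized Poisson model for match scores can look like, the block below gives a generic formulation. The notation (goals $Y_{ij}$ scored by team $i$ against team $j$, covariate vector $x_{ij}$, tuning parameter $\xi$) and the choice of a lasso-type penalty are assumptions made for this sketch, not details taken from the abstract.

```latex
% Goals of team i against team j follow a Poisson distribution
% whose rate depends log-linearly on covariates x_{ij} (assumed notation).
\begin{align}
  Y_{ij} \mid x_{ij} &\sim \mathrm{Pois}(\lambda_{ij}), \\
  \log \lambda_{ij}  &= \beta_0 + x_{ij}^{\top} \beta .
\end{align}
% Penalized estimation: maximize the log-likelihood \ell minus a penalty term.
% A lasso-type (L1) penalty is shown as one possible choice.
\begin{equation}
  (\hat{\beta}_0, \hat{\beta})
  = \operatorname*{arg\,max}_{\beta_0,\, \beta}
    \Bigl\{ \ell(\beta_0, \beta) - \xi \sum_{k=1}^{p} \lvert \beta_k \rvert \Bigr\},
  \qquad \xi \ge 0 .
\end{equation}
```

With $\xi = 0$ this reduces to ordinary maximum likelihood estimation; larger values of $\xi$ shrink the coefficients towards zero.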

More recently, methods from machine learning such as boosting and random forests have also proved very powerful for predicting football match outcomes. Here, we analyze two approaches to modeling football matches: a hybrid random forest extension based on conditional inference trees and a hybrid boosting extension based on extreme gradient boosting. The models are fitted to match data from previous UEFA European Championships (EUROs). Based on the resulting estimates, all match outcomes of the EURO 2020 are simulated repeatedly (100,000 times), yielding winning probabilities for all participating national teams.
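
The repeated-simulation step can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the models or the tournament format from the talk: the four team labels, their strength values, the formula linking strengths to expected goals, the coin flip for drawn matches, and the plain knockout bracket are all hypothetical. In the approach described above, the expected goals would come from the fitted hybrid models instead.

```python
import math
import random
from collections import Counter


def sample_poisson(rate, rng):
    """Draw one Poisson variate using Knuth's method (adequate for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def play_knockout_match(team_a, team_b, strength, rng):
    """Simulate one knockout match and return the winner."""
    # Hypothetical link between team strengths and expected goals.
    rate_a = math.exp(0.2 + strength[team_a] - strength[team_b])
    rate_b = math.exp(0.2 + strength[team_b] - strength[team_a])
    goals_a = sample_poisson(rate_a, rng)
    goals_b = sample_poisson(rate_b, rng)
    if goals_a == goals_b:
        # Simplification: a draw is settled by a fair coin flip.
        return team_a if rng.random() < 0.5 else team_b
    return team_a if goals_a > goals_b else team_b


def simulate_tournament(teams, strength, n_runs, seed=0):
    """Estimate winning probabilities by replaying a knockout bracket n_runs times.

    The number of teams is assumed to be a power of two.
    """
    rng = random.Random(seed)
    wins = Counter()
    for _ in range(n_runs):
        remaining = list(teams)
        while len(remaining) > 1:
            # Neighbouring teams in the list meet; winners advance.
            remaining = [
                play_knockout_match(remaining[i], remaining[i + 1], strength, rng)
                for i in range(0, len(remaining), 2)
            ]
        wins[remaining[0]] += 1
    return {team: wins[team] / n_runs for team in teams}


# Made-up teams and strengths, for illustration only.
TEAMS = ["Team A", "Team B", "Team C", "Team D"]
STRENGTH = {"Team A": 0.6, "Team B": 0.2, "Team C": 0.0, "Team D": -0.4}

if __name__ == "__main__":
    for team, prob in simulate_tournament(TEAMS, STRENGTH, n_runs=10_000).items():
        print(f"{team}: {prob:.3f}")
```

The winning probability of each team is simply the share of simulated tournaments it wins, so the estimates become more stable as the number of runs grows.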

Video_10_09_2021.mp4