Last week I was playing around with some things and decided to look at something that I have always wondered about. I have long suspected that there is a very high correlation between turnovers (miscontrolled and dispossessed) and attempted dribbles.
This is really clear when you start making radars of attacking players: the ones that dribble a lot almost always have the worst turnover ratings. It makes a lot of sense, because the more you try to move past players with the ball at your feet, the more often you will lose it.
One thing I had never done, however, was check what the correlation between the two actually is, so here it is: 0.769. While I had the data out I thought it would be interesting to check a few other things as well: fouls suffered has a 0.64 correlation with turnovers, and fouls and dribbles have a 0.558 correlation.
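For anyone who wants to run the same kind of check, here is a minimal sketch of a Pearson correlation in Python with NumPy. The numbers are made up for five hypothetical players and are only there so the snippet runs; they are not my data and will not reproduce the figures above.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.corrcoef returns the 2x2 correlation matrix; the off-diagonal is r.
    return float(np.corrcoef(x, y)[0, 1])

# Made-up per-90 values for five hypothetical players (NOT real data).
dribbles_p90 = [0.5, 1.2, 2.8, 4.1, 5.5]
turnovers_p90 = [1.0, 1.9, 2.7, 4.4, 4.9]

r = pearson(dribbles_p90, turnovers_p90)
print(f"dribbles vs turnovers: r = {r:.3f}")
```

The same function can be pointed at any pair of columns, for example fouls suffered against turnovers.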
This made me want to try building a crude expected turnovers measure. The reason is that I really like that SmarterScout has a measure for ball retention skill, and I wanted to take a shot at building something like that myself. This won't be a perfect measure, because I can already imagine things that would improve it, but maybe further iterations can get there.
So back to xTurnovers: I started out with a bunch of variables and ran a regression on them.
First, the population: players who played at least 900 minutes in the big five European leagues plus the Eredivisie from 2017-18 to 2019-20. I also forced the intercept to be 0, the reasoning being that each action you take should either increase or decrease the probability that you lose the ball, but everything should start from 0.
For the most part I think all of these make sense. Fouls and dribbles are basically direct representations of players getting into situations where they are directly confronting the defense, and the more of these there are, the more often you are likely to lose the ball. The other thing that shows up is that the closer you get to goal, the more you lose the ball.
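To make the setup concrete, here is a rough sketch of the 900-minute cutoff and the zero-intercept regression described above, using NumPy least squares. Everything in it is an assumption for illustration: the data are randomly generated, the two predictors (`dribbles` and `fouled`) are stand-ins rather than my actual variable list, and the coefficients 0.6 and 0.4 are invented so there is something to recover.

```python
import numpy as np

def fit_through_origin(X, y):
    """Least-squares coefficients with no intercept term.

    Leaving out a column of ones is what forces the fit through 0.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs

# Synthetic player seasons (NOT real data), generated with a fixed seed.
rng = np.random.default_rng(0)
n = 200
minutes = rng.integers(200, 3000, size=n)
dribbles = rng.uniform(0, 6, size=n)   # hypothetical per-90 attempted dribbles
fouled = rng.uniform(0, 4, size=n)     # hypothetical per-90 fouls suffered
turnovers = 0.6 * dribbles + 0.4 * fouled + rng.normal(0, 0.2, size=n)

# Population filter: keep only players with at least 900 minutes.
keep = minutes >= 900
X = np.column_stack([dribbles[keep], fouled[keep]])

coefs = fit_through_origin(X, turnovers[keep])
x_turnovers = X @ coefs  # expected turnovers for each kept player

print("coefficients:", np.round(coefs, 3))
```

With no intercept, a player with zeros across every input gets an expected turnover count of exactly 0, which is the behaviour I was after.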
The actual distance carried and progressive distance carried don't really have much effect on things.
After a little more thinking I refined the variables a little more and came up with this list:
I took out the carries and split the final third variables into the very dangerous areas and then the rest.
This gives me a simpler model without sacrificing much explanatory power.
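One way to check a claim like this is to fit both versions and compare how much variance each explains. The sketch below does that on randomly generated data, with a hypothetical `carry_distance` column that has no effect on the target by construction; it is not my data or my variable list. One caveat: for no-intercept models some statistics packages report an uncentered R², which can look inflated, so here I compute it against the mean of the target.

```python
import numpy as np

def r_squared_origin_fit(X, y):
    """Fit y on X with no intercept and return the (centered) R-squared."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coefs
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

# Synthetic data (NOT real), generated with a fixed seed.
rng = np.random.default_rng(1)
n = 300
dribbles = rng.uniform(0, 6, size=n)
fouled = rng.uniform(0, 4, size=n)
carry_distance = rng.uniform(50, 300, size=n)  # unrelated to the target here
turnovers = 0.6 * dribbles + 0.4 * fouled + rng.normal(0, 0.3, size=n)

full = np.column_stack([dribbles, fouled, carry_distance])
reduced = full[:, :2]  # same model with the carry column dropped

r2_full = r_squared_origin_fit(full, turnovers)
r2_reduced = r_squared_origin_fit(reduced, turnovers)
print(f"full: {r2_full:.3f}  reduced: {r2_reduced:.3f}")
```

If the two numbers are close, the dropped variables were not adding much.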
I am going to continue to play around with this and see if I can use this as a way to help with the ball retention rating going forward.