One project that has long been on my list is building a simple projection system to get an idea of what a player might do in the coming season.
What follows is my version 1.0. Plenty of additional tweaks are possible, but it meets my minimum viable product goal: producing an estimate for a player based on historical data that is regressed toward a league-average value.
The idea is based on the Marcel projections, which Tom Tango originally created for baseball as the minimum baseline that any forecasting or projection system should be able to beat. Getting this version 1.0 done will make it easier to judge whether fancier additions down the road actually improve things.
The name Marcel comes from Friends, where one of the characters has a monkey with that name. The idea is that the system is so simple a monkey could do it.
This is the initial write-up about Marcel from his website about how to do it for baseball:
Now I will go through how I did this for soccer, using Mohamed Salah's data as the example. Let's project Salah's non-penalty goal total for next season.
Marcel projections use the last three years of performance, with the most recent weighted more heavily. We first multiply Salah's 2021/22 non-penalty goals by 5, his 2020/21 goals by 4, and his 2019/20 goals by 3, and add them together. Then we apply the same weights to his 90's played and divide by the sum of the weights (12) to get the playing time we expect from him next season.
Non-pen goals = (18*5) + (16*4) + (19*3) = 211
90's = ((30.7*5) + (34.2*4) + (32*3)) / 12 = 386.3 / 12 = 32.2
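The weighting step can be sketched in a few lines of Python. This is only a rough illustration using the figures above. The function and variable names are made up for the example and are not taken from any actual code behind these projections.

```python
# Marcel-style season weights, most recent season first.
WEIGHTS = (5, 4, 3)


def weighted_sum(values, weights=WEIGHTS):
    """Sum of each season's value times its weight (most recent first)."""
    return sum(v * w for v, w in zip(values, weights))


def weighted_mean(values, weights=WEIGHTS):
    """Weighted sum divided by the total of the weights."""
    return weighted_sum(values, weights) / sum(weights)


# Salah's figures from the text: 2021/22, 2020/21, 2019/20.
np_goals = [18, 16, 19]
nineties = [30.7, 34.2, 32.0]

weighted_goals = weighted_sum(np_goals)        # 211
weighted_nineties = weighted_sum(nineties)     # about 386.3
projected_nineties = weighted_mean(nineties)   # about 32.2

print(weighted_goals, round(weighted_nineties, 1), round(projected_nineties, 1))
```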
Then we calculate the league rate of goals per 90 for each of the previous three years. I use positional filters here, so only players in the same general position are considered, and I take the value of the 40th-percentile player rather than the average. We multiply each year's rate by Salah's 90's played that year and by the annual weight to get the weighted goals expected from a baseline player with the same playing time over the last three years.
21/22 = 0.23 * 30.7 * 5 = 35.31
20/21 = 0.28 * 34.2 * 4 = 38.3
19/20 = 0.25 * 32 * 3 = 24
Adding these together gives about 97.6 weighted goals over 386.3 weighted 90's, for an overall league rate of 0.25 goals per 90. This goes into the next part of the calculation, regressing toward the league standard. I am setting the regression amount at 70 90's played, which is essentially two full seasons or 6,300 total minutes. I think that is a rough equivalent of the 1,200 plate appearances used for baseball.
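Here is a small Python sketch of how the league baseline rate and the regression amount could be computed from the numbers above. The variable names are illustrative assumptions, and the league rates are simply the three values quoted in the text (the 40th-percentile player at the position).

```python
WEIGHTS = (5, 4, 3)                  # most recent season first

# League np goals per 90 for 2021/22, 2020/21, 2019/20, as quoted in the text.
league_rates = [0.23, 0.28, 0.25]
# Salah's 90's played in the same seasons.
nineties = [30.7, 34.2, 32.0]

# Weighted goals a baseline player would score in Salah's playing time.
league_goals = sum(r * n * w for r, n, w in zip(league_rates, nineties, WEIGHTS))

# Weighted 90's, used to turn the weighted goals back into a per-90 rate.
weighted_nineties = sum(n * w for n, w in zip(nineties, WEIGHTS))
league_rate = league_goals / weighted_nineties       # about 0.25

# Regression amount: 70 90's of league-rate performance (6,300 minutes).
REGRESSION_NINETIES = 70
regression_goals = league_rate * REGRESSION_NINETIES

print(round(league_goals, 1), round(league_rate, 2), round(regression_goals, 1))
```

Rounding the league rate to 0.25 before multiplying by 70 gives the 17.5 regression goals used in the next step.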
We will take Salah's values and the League rates that we calculated earlier and combine them in a weighted average:
Non-pen goals per 90 = (211 + 17.5) / (388.8 + 70) = 0.49 non-penalty goals per 90 minutes, where 17.5 is the league rate of 0.25 multiplied by the 70 regression 90's.
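The regression step is a weighted average of the player's own record and a fixed block of league-rate playing time. A minimal Python sketch, assuming a hypothetical helper called `regress_rate` and plugging in the figures exactly as quoted in the line above:

```python
def regress_rate(weighted_stat, weighted_nineties, league_rate,
                 regression_nineties=70.0):
    """Blend a player's weighted totals with league-rate playing time.

    The player contributes weighted_stat over weighted_nineties; the league
    contributes league_rate * regression_nineties over regression_nineties.
    """
    league_stat = league_rate * regression_nineties
    return (weighted_stat + league_stat) / (weighted_nineties + regression_nineties)


# Figures as quoted in the text: 211 weighted goals, 388.8 weighted 90's,
# and a league rate of 0.25 goals per 90.
rate = regress_rate(211, 388.8, 0.25)
print(rate)  # roughly 0.498, reported as 0.49 in the text
```

One property worth noticing: a player with no minutes at all is projected at exactly the league rate, and the more weighted 90's a player has, the less the league block matters.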
Lastly, we multiply the goal-scoring rate by Salah's projected 90's played, and adjust the projection for age.
The baseball version of Marcel assumes that players improve on their past performance until roughly age 29, at which point they start to decline relative to their previous three years. I am keeping that assumption but moving the peak age from 29 to 28, because in the age curve studies I have looked at, that seems to be the peak for the majority of soccer players.
The age adjustments work the same way as for baseball. For players under 28, we subtract their age from 28, multiply that by 0.005, and increase the projection by that much. For players over 28, we subtract 28 from their age, multiply that by 0.007, and decrease the projection by that much.
Salah will be 30 next season, so we take 1 - ((30 - 28) * 0.007) = 0.986
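The age rule described above can be written as a small function. This is a sketch only. The name `age_factor` and its parameter names are assumptions for illustration, while the peak age of 28 and the 0.005 and 0.007 steps come from the text.

```python
def age_factor(age, peak_age=28, young_step=0.005, old_step=0.007):
    """Multiplier applied to a projection based on age next season."""
    if age < peak_age:
        # Younger than peak: bump the projection up per year below peak.
        return 1 + (peak_age - age) * young_step
    if age > peak_age:
        # Older than peak: trim the projection per year above peak.
        return 1 - (age - peak_age) * old_step
    return 1.0


print(age_factor(30))  # Salah's adjustment, about 0.986
print(age_factor(24))  # a 24-year-old gets a boost of about 2%
```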
So bringing it all together we get the following:
0.49 non-penalty goals per 90 * 32.2 90's played * 0.986 age adjustment = 15.56 non-penalty goals projected for next season.
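For anyone who wants to try the whole thing end to end, here is a compact Python sketch that chains the steps for any counting metric. The `Season` class and `project` function are hypothetical names invented for this example, and the sketch assumes exactly three seasons listed most recent first.

```python
from dataclasses import dataclass


@dataclass
class Season:
    stat: float         # player's total for the metric in that season
    nineties: float     # 90's played in that season
    league_rate: float  # league baseline rate per 90 for that season


def project(seasons, age_next_season, weights=(5, 4, 3),
            regression_nineties=70.0, peak_age=28,
            young_step=0.005, old_step=0.007):
    """Marcel-style projection of a season total (seasons most recent first)."""
    if len(seasons) != len(weights):
        raise ValueError("expected one season per weight")

    pairs = list(zip(seasons, weights))
    w_stat = sum(s.stat * w for s, w in pairs)
    w_nineties = sum(s.nineties * w for s, w in pairs)
    if w_nineties == 0:
        raise ValueError("player has no playing time to weight")

    # League baseline rate over the player's own weighted playing time.
    league_rate = sum(s.league_rate * s.nineties * w for s, w in pairs) / w_nineties

    # Regress the player's rate toward the league rate.
    rate = (w_stat + league_rate * regression_nineties) / (w_nineties + regression_nineties)

    # Expected playing time is the weighted mean of 90's played.
    playing_time = w_nineties / sum(weights)

    # Age adjustment around the assumed peak age.
    if age_next_season < peak_age:
        factor = 1 + (peak_age - age_next_season) * young_step
    elif age_next_season > peak_age:
        factor = 1 - (age_next_season - peak_age) * old_step
    else:
        factor = 1.0

    return rate * playing_time * factor


salah = [Season(18, 30.7, 0.23), Season(16, 34.2, 0.28), Season(19, 32.0, 0.25)]
projection = project(salah, age_next_season=30)
print(round(projection, 1))
```

Because this carries unrounded intermediate values and the 386.3 weighted 90's computed from the season figures, it lands near 15.9 rather than exactly on the 15.56 in the walkthrough.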
This can be done for any metric you are interested in. My next step was to use these projections to make a radar chart and table showing what that type of production might look like compared to previous seasons and peers.
As I think through things more, I would like to try adding league effects to account for differences in strength between leagues. That will take more work, though, to determine how performance translates when players move between leagues.
As I update things I will update this page.