Quantity vs Quality: Building a ‘Tidy’ DM (or, In Search of the Next Sangare)

Alex Stewart builds his ideal midfielder, even though Ibrahim Sangare already exists

Scouting players is hard. Metrics can show actions by volume, while more sophisticated platforms can also show the impact or quality of certain actions. Players need to pass the eye test, and clubs will also dig into a player’s psychological make-up, trying to assess intangible qualities like professionalism or commitment, often by speaking to coaches who have worked with the player.

The starting point should always be: what does my club need and why? What problem am I trying to solve, or what gap to plug, or what position or area of the pitch to improve? Does my team struggle to break the lines or get the ball into Zone 14, or can we not do that because we never win the ball enough in midfield in the first place? Once that problem has been diagnosed, a club can begin to look for a player solution to that issue.

One way of doing this is by profile. Let’s assume that the style of our club is settled, with no managerial changes in the offing (this, incidentally, is why management succession planning and player talent ID go hand-in-hand). We can then begin to create a profile for the kind of player we want in a certain position.

Let’s also say that we know the answer to our question is Ibrahim Sangare – because if it isn’t, you’re asking the wrong question. Sangare is a central midfielder with excellent defensive output and a good passing range, who would suit many top-tier sides. We could identify Sangare as the ideal fit for our team but, given that he is being linked with a host of Premier League sides, he may be out of our reach. And so, using TransferLab, we can create a profile that matches Sangare’s attributes and find players with similar qualities. The best thing about this is that, should our profile highlight Sangare himself, we know we’ve created one that works.

Sangare scores an absolute belter

Sangare’s strongest attributes are tackles, interceptions, and press-resistant passing. We want a player who can recycle the ball well, with intelligent, accurate passing, and put up monster defensive numbers.

Scouting by volume

But how do we find players like Sangare, or create a profile for players of his type? There are a lot of professional footballers in the world, especially in the men’s game. This is where data come in handy, as a means of whittling down the considered pool to a more manageable size, before the eye test and other forms of investigation begin. But even data can be a blunt tool. A few years ago, one might simply have looked at per 90 metrics – say, tackles and interceptions – to begin to find good defensive midfielders, but this tells us nothing contextual about the value of those actions, or about the player more generally.

Nonetheless, as an exercise, let’s do that first. Using TransferLab, we built a profile that measured quantitative output for key Sangare metrics like tackles, interceptions, and passing. These were weighted, too, so that defensive numbers had more value than most passing metrics. Using twelve key indicators, but only looking at per 90 volumes, we generated a list of players from Tier 1 (which comprises Europe’s top five leagues) and Tier 2 (the top leagues in Belgium, the Netherlands, Russia, Portugal, Turkey, Argentina, and Brazil, and the English Championship). The list was then filtered by age and minutes played; the top 15 players are shown below.

Sangare-like players by per 90 volume metrics only, aged 25 or under and with more than 700 mins last season

Sangare himself is on the list, which is promising. He’s not top, because this is not a player comparison tool, but a profile built around the kind of event-level data we want from such a player. However, while some of these players are excellent, not all are necessarily of the quality we are looking for. Volume alone is not enough: we need context, crucially the quality of actions, not just the quantity.
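For readers who want to see the mechanics, a volume-only profile of this kind can be sketched in a few lines. This is a minimal illustration under our own assumptions, not TransferLab’s actual method: the function names, the three metrics, the weights, and the players are all hypothetical, and we assume each metric is converted to a per 90 rate, ranked against the pool, and combined as a weighted average.

```python
from bisect import bisect_left


def per_90(count, minutes):
    """Convert a raw season count into a per-90-minute rate."""
    return count * 90.0 / minutes


def percentile_ranks(values):
    """Map each value to the share of the rest of the pool it strictly beats (0.0 to 1.0)."""
    ordered = sorted(values)
    n = len(values)
    return [bisect_left(ordered, v) / (n - 1) if n > 1 else 0.0 for v in values]


def volume_profile_scores(players, weights):
    """Weighted average of per-90 percentile ranks for each player."""
    total_weight = sum(weights.values())
    scores = {p["name"]: 0.0 for p in players}
    for metric, weight in weights.items():
        rates = [per_90(p[metric], p["minutes"]) for p in players]
        for p, rank in zip(players, percentile_ranks(rates)):
            scores[p["name"]] += weight * rank / total_weight
    return scores


# Hypothetical players and season totals, for illustration only.
players = [
    {"name": "Player X", "minutes": 1800, "tackles": 60, "interceptions": 40, "passes": 900},
    {"name": "Player Y", "minutes": 900, "tackles": 20, "interceptions": 25, "passes": 600},
    {"name": "Player Z", "minutes": 2700, "tackles": 45, "interceptions": 30, "passes": 2100},
]
# Assumed weights: defensive volume counts three times as much as passing volume.
weights = {"tackles": 3.0, "interceptions": 3.0, "passes": 1.0}

scores = volume_profile_scores(players, weights)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Note that nothing in this calculation knows whether a tackle was well timed or a pass progressive; it only counts, which is exactly the limitation described above.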

Fortunately, TransferLab’s position profiles also allow us to look for qualitative metrics that have a greater predictive value. We therefore select the most obviously relevant attributes, and a few others, and, again, weight for importance: defensive quality matters a lot, while certain of the more creative elements might register as less important (for the real nerds: the profile is based on 13 qualitative metrics, weighted, from 1v1 defence and tackles down to line-breaking passes and carries). The process is the same as our first profile build, but using predictive, qualitative metrics instead of volume alone.

Having done that, we can go back to check the list we generated above. Some of the players are definitely excellent: Sangare and Aurelien Tchouameni are undoubtedly great footballers. But here are two other examples. Using the quantitative profile, but removing the players’ names and scores, we get the following two rather promising profiles:

Player A – quantitative profile
Player B – quantitative profile

But when these players are run through the qualitative profile, which is based on the same kind of metrics with the same weighting, but not solely about volumes, the same players generate these profiles:

Player A – qualitative profile
Player B – qualitative profile

Player A looks superb compared to other midfielders in his tier by volume, but not great at all when the quality of those actions is considered. Player B, by contrast, looks very good by volume and pretty good by quality (which passes the eye test, too, as it happens). The point here is not to say that volume is always a bad indicator, just that it is nowhere near as reliable as more sophisticated approaches.
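One simple way to catch a Player A automatically is to compare the two profile scores and flag anyone whose volume score runs well ahead of their quality score. The sketch below assumes both profiles produce a score on a 0-100 scale; the scores, the 20-point threshold, and the function name are hypothetical and are not the values behind the profiles shown above.

```python
def flattered_by_volume(scores, min_gap=20.0):
    """Return names whose volume score exceeds their quality score by min_gap or more."""
    return [
        name
        for name, (volume, quality) in scores.items()
        if volume - quality >= min_gap
    ]


# Hypothetical 0-100 profile scores, stored as (volume, quality) pairs.
scores = {
    "Player A": (92.0, 55.0),  # busy, but the actions are low value
    "Player B": (85.0, 78.0),  # busy, and the actions hold up on quality
}

flagged = flattered_by_volume(scores)
print(flagged)
```

A flag like this is only a prompt for a closer look, not a verdict: the right threshold is a judgement call and would need tuning against players we already know well.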

Scouting by quality

So let’s use the more sophisticated one instead. Selecting only central and defensive midfielders in Tiers 1 and 2, the qualitative profile gives us the following set of players when ranked by tier score. Tier score means their ability relative to the tier they are in, not to all footballers in the data set. And look who comes out on top!

There are two things to note here. The first is that our model spits out Sangare as the highest-quality, closest match to our ideal Sangare template among the players considered. The second is that we can see a lot of really obviously good players here: it might be a pretty basic thing to say, but you want to start from a position where you are identifying dream options before starting to sieve based on cost or availability or whatever else.
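The idea of a tier score, ability relative to peers in the same tier rather than to the whole data set, can be illustrated with a within-group percentile. This is our own simplified reading of the concept, not TransferLab’s formula; the `quality` values, tier numbers, and player names are hypothetical.

```python
from collections import defaultdict


def tier_scores(players):
    """Score each player 0-100 by the share of same-tier peers they beat on quality."""
    by_tier = defaultdict(list)
    for p in players:
        by_tier[p["tier"]].append(p["quality"])

    result = {}
    for p in players:
        peers = by_tier[p["tier"]]
        if len(peers) < 2:
            # Assumption: a player with no peers in the tier cannot be ranked.
            result[p["name"]] = None
            continue
        beaten = sum(1 for q in peers if q < p["quality"])
        result[p["name"]] = 100.0 * beaten / (len(peers) - 1)
    return result


# Hypothetical raw quality scores on a common scale, for illustration only.
players = [
    {"name": "A", "tier": 1, "quality": 80.0},
    {"name": "B", "tier": 1, "quality": 70.0},
    {"name": "C", "tier": 1, "quality": 60.0},
    {"name": "D", "tier": 4, "quality": 55.0},
    {"name": "E", "tier": 4, "quality": 40.0},
    {"name": "F", "tier": 4, "quality": 30.0},
]

result = tier_scores(players)
print(result)
```

In this toy data, D has a lower raw quality than B but a higher tier score, because D dominates a weaker tier. That is why a tier score is useful for spotting players who have outgrown their league, and why it should not be read as a cross-league ranking.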

We can then start to apply this profile with filters. Let’s look for the closest matches to Sangare in Ligue 2 last season, but who are 25 or under and have played more than 700 minutes. Nancy’s Giovanni Haag is the best of an otherwise underwhelming bunch (although Nimes’ loan acquisition of Jean N’Guessan from OGC Nice looks very shrewd, and we’d expect him to show up in this list next season).

Ligue 2 tidy defensive midfielders under 26 and with more than 700 minutes
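The filtering step itself is straightforward, and a rough sketch looks like the following. The `Candidate` fields, the `shortlist` function, and every player in the pool are hypothetical; only the thresholds (25 or under, more than 700 minutes) come from the article.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    league: str
    age: int
    minutes: int
    tier_score: float


def shortlist(candidates, league=None, max_age=25, min_minutes=700):
    """Keep players who pass the filters, best tier score first."""
    kept = [
        c
        for c in candidates
        if (league is None or c.league == league)
        and c.age <= max_age
        and c.minutes > min_minutes
    ]
    return sorted(kept, key=lambda c: c.tier_score, reverse=True)


# Hypothetical candidates, for illustration only.
pool = [
    Candidate("Player P", "Ligue 2", 23, 1500, 71.0),
    Candidate("Player Q", "Ligue 2", 27, 2400, 88.0),  # too old
    Candidate("Player R", "Ligue 2", 21, 650, 80.0),   # too few minutes
    Candidate("Player S", "Serie A", 24, 2000, 90.0),  # different league
    Candidate("Player T", "Ligue 2", 25, 900, 75.0),
]

names = [c.name for c in shortlist(pool, league="Ligue 2")]
print(names)
```

Leaving `league` as `None` applies only the age and minutes filters, which mirrors the all-leagues search later in the piece.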

Here’s what we get for Serie A midfielders with the same filter. Bologna’s Nicolas Dominguez is an excellent, under-the-radar player while Grigoris Kastanos impressed in an under-powered Salernitana side. He’s back at parent club Juventus for now but would likely be available as a loan signing again and could do well in an unglamorous role for a good team.

Serie A tidy defensive midfielders under 26 and with more than 700 minutes

Lastly, let’s look at everyone, irrespective of tier, but still with the age and minutes filter on, showing the club the players were at last season. The best Sangare-like player relative to the quality of their league is Hrvoje Babec of HNK Gorica, in Croatia’s top league, the Prva HNL. He has now moved to FC Riga for £1.2m, a high transfer fee for that league. Victor Ekani, who is head and shoulders above his peers in this role in the Danish SuperLiga, is next, while Neven Djurasek, a new signing for Shakhtar Donetsk, is worth watching next season. The list is rounded out by Bautista Casicini of Academica Clinceni of Romania’s third tier, now at The Strongest in Bolivia, and Ibrahim Sangare himself.

All tidy defensive midfielders ranked by tier score under 26 and with more than 700 minutes

Conclusion

This article has shown how clubs might choose to identify and then plug gaps in their squad, or upgrade positions, by creating profiles of player types, matching against an ideal, and then finding more realistic targets. We have shown how purely volume-based profiles can show up good results, but not always; the quality of players highlighted is patchy at best. And we’ve shown how using a more sophisticated, quality-based approach not only gives us close matches to our ideal player, but also highlights consistently good, interesting players from a variety of leagues.

This approach also yields another benefit: it allows us to tweak the profile. What if we wanted to up the passing ability of our tidy midfielder to find another Rodri? Or slightly scale back the defensive metrics and up the line-breaking carries and pressing metrics to find the next Moussa Dembele? TransferLab’s tool allows us to do just that and shows the increasing sophistication of how clubs can plot their route through the complex world of player acquisition.
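Tweaking a profile in this way amounts to rescaling some of the weights and renormalising. The sketch below is a generic illustration, not TransferLab’s interface; the metric names, base weights, and multipliers are hypothetical.

```python
def tweak_profile(weights, multipliers):
    """Scale selected metric weights, then renormalise so all weights sum to 1."""
    scaled = {m: w * multipliers.get(m, 1.0) for m, w in weights.items()}
    total = sum(scaled.values())
    return {m: w / total for m, w in scaled.items()}


# Hypothetical base weights for a tidy defensive midfielder.
base = {
    "tackles": 3.0,
    "interceptions": 3.0,
    "passing": 2.0,
    "line_breaking_carries": 1.0,
}

# Double the emphasis on passing to search for a more Rodri-like player.
more_passing = tweak_profile(base, {"passing": 2.0})

# Trim the defensive weights and boost carries for a more progressive profile.
more_carrying = tweak_profile(
    base, {"tackles": 0.75, "interceptions": 0.75, "line_breaking_carries": 2.0}
)
print(more_passing)
print(more_carrying)
```

Renormalising keeps scores from different versions of the profile on the same scale, so shortlists produced before and after a tweak can be compared directly.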

Image credit: Shutterstock/A. Taoulit
