Breaking the First Line: Quantifying the Risk-Reward of Progressive Passes in Build Up

Elliott Stapley looks for a way to quantify the benefits of progressive passes in the build up phase.

“Pep hates these full-back to winger passes because they don’t offer progression, but I used to do them with Messi a lot and he’d be annoyed.”

– Dani Alves

Inspired by a conversation with a friend who coaches at the Arsenal academy and by the Dani Alves quote above, I decided to look into optimal ways of progressing the ball from your own third without taking unnecessary risk.

The idea behind this analysis is to identify the “best” options for passing in the build up phase to gain territory and increase the chances of scoring, relative to the risk of playing that pass.

Identifying methods of breaking the first line through passing

The first task is to define the range of passes we’d like to consider for this analysis. I went for all passes starting in the first third that gain at least 25% of the distance to goal. This is a pretty common “progressive” pass definition with one additional constraint: the pass must begin in the build up phase.
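A minimal sketch of that definition, assuming StatsBomb-style pitch units (120 x 80, attacking left to right) and straight-line distance to the centre of the goal. The function names and the exact distance measure are illustrative assumptions, not the code used for the analysis.

```python
import math

PITCH_LENGTH = 120.0  # assumption: StatsBomb-style pitch units
PITCH_WIDTH = 80.0
GOAL = (PITCH_LENGTH, PITCH_WIDTH / 2)  # centre of the goal being attacked


def distance_to_goal(x, y):
    """Straight-line distance from a pitch location to the goal centre."""
    return math.hypot(GOAL[0] - x, GOAL[1] - y)


def is_build_up_progressive(start, end, min_gain=0.25):
    """True if the pass starts in the first third and closes at least
    `min_gain` of the starting distance to goal."""
    in_first_third = start[0] < PITCH_LENGTH / 3
    d_start = distance_to_goal(*start)
    d_end = distance_to_goal(*end)
    return in_first_third and (d_start - d_end) >= min_gain * d_start
```

For example, a pass from (20, 40) to (50, 40) closes 30 of the 100 units to goal and qualifies, while the same pass stopping at (40, 40) closes only 20 and does not.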

Using Kloppy, I loaded up all La Liga games in the StatsBomb open dataset. I selected passes that met the progressive pass definition, were not high passes, were from open play, not headers, and had at least one prior pass in the sequence. This is to ensure we are only including passes where a side is definitely in a build-up pattern and playing through the thirds.
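The selection criteria above can be sketched as a simple filter. This assumes each pass has already been flattened into a plain dict; the field names (`is_progressive`, `height`, `play_pattern`, `body_part`, `prior_passes_in_sequence`) are hypothetical and are not Kloppy or StatsBomb identifiers.

```python
def keep_pass(p):
    """Apply the build-up selection criteria to one pass record (a dict).

    All keys used here are hypothetical names chosen for illustration.
    """
    return (
        p["is_progressive"]                     # meets the progressive definition
        and p["height"] != "high"               # ground or low passes only
        and p["play_pattern"] == "open_play"    # no set pieces
        and p["body_part"] != "head"            # no headers
        and p["prior_passes_in_sequence"] >= 1  # at least one earlier pass
    )


def select_passes(passes):
    """Return only the passes that satisfy every criterion."""
    return [p for p in passes if keep_pass(p)]
```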

The next step was to cluster this large dataset of progressive passes to produce the 15 most common pass types that fit the criteria. These are shown below:
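The clustering algorithm is not named above, so the following is only an illustration of the idea: a tiny k-means over (start x, start y, end x, end y) vectors, with a deterministic farthest-point seeding chosen here for reproducibility. Both the choice of k-means and the seeding are assumptions, and in practice a library implementation would be the sensible route.

```python
import math


def kmeans(points, k, iterations=50):
    """Tiny Lloyd's k-means; points are (x0, y0, x1, y1) tuples."""
    # Seed deterministically: first point, then repeatedly the point
    # farthest from its nearest already-chosen centroid.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every pass.
        labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Update step: mean of each cluster's members.
        new_centroids = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids
```

Calling `kmeans(pass_vectors, 15)` on the filtered passes would give 15 centroids, each of which can be read as a "typical" pass.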

To make this more easily interpretable, the 14 symmetric pass types either side of the horizontal centre line were paired and labelled, reducing us to 8 independent pass types which are hopefully all relatively easy to visualise:
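One way to sketch the pairing step is to reflect each cluster centroid across the horizontal centre line and match it with the nearest centroid. A centroid that is its own nearest mirror image is treated as the central, unpaired type. The pitch width of 80 and the function names are assumptions for illustration.

```python
import math

PITCH_WIDTH = 80.0  # assumption: StatsBomb-style pitch units


def mirror(centroid):
    """Reflect a (x0, y0, x1, y1) pass across the horizontal centre line."""
    x0, y0, x1, y1 = centroid
    return (x0, PITCH_WIDTH - y0, x1, PITCH_WIDTH - y1)


def pair_symmetric(centroids):
    """Group cluster indices whose centroids are mirror images of each other."""
    groups = []
    seen = set()
    for i, c in enumerate(centroids):
        if i in seen:
            continue
        m = mirror(c)
        # Index of the centroid closest to the mirror image of cluster i.
        j = min(range(len(centroids)), key=lambda n: math.dist(m, centroids[n]))
        # A central pass mirrors onto itself; otherwise pair i with j.
        group = (i,) if j == i or j in seen else (i, j)
        seen.update(group)
        groups.append(group)
    return groups
```

With 15 clusters where 14 are symmetric and one is central, this would return 7 pairs and 1 single, matching the 8 pass types described above.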

Quantify relative gain of each pass type

We’ve now got a handful of the most common ways that teams break the first line. The next job is to assign some quantitative measure of how good, or how progressive each pass type is. I made the decision to use Laurie Shaw’s expected possession value (EPV) grid which quantifies the probability that a possession will end in a goal from the ball position.

By averaging over the start and end location of the passes included in each pass cluster type, we can produce an EPV gain value, which we will use as our pass “reward” measure. An example progressive pass type with associated EPV gain is shown below:
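As a rough sketch, EPV gain for a pass can be computed as the grid value at the end location minus the value at the start location, then averaged over a cluster. This is one reading of the averaging step described above. The grid here is any list of rows (across the pitch width) by columns (along the pitch length), and the pitch dimensions are assumed to be 120 x 80.

```python
def epv_at(grid, x, y, pitch_length=120.0, pitch_width=80.0):
    """Look up the EPV grid cell containing pitch location (x, y)."""
    n_rows, n_cols = len(grid), len(grid[0])
    col = min(int(x / pitch_length * n_cols), n_cols - 1)
    row = min(int(y / pitch_width * n_rows), n_rows - 1)
    return grid[row][col]


def epv_gain(grid, start, end):
    """EPV at the pass end location minus EPV at the start location."""
    return epv_at(grid, *end) - epv_at(grid, *start)


def mean_epv_gain(grid, passes):
    """Average EPV gain over the (start, end) passes in one cluster."""
    return sum(epv_gain(grid, s, e) for s, e in passes) / len(passes)
```

The real grid is much finer than any toy example, but the lookup logic is the same.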

Quantify expected completion of each pass type

The next stage of quantifying risk-reward is to start on our risk measure. Inspired by Stats Perform’s expected pass completion model, I used a random forest model trained on pass termination (complete/incomplete) of all non-high, open play, non-headed passes in the StatsBomb open data, with features: start location, angle, and length of pass. Using the predict probability method and selecting the completed pass class, we get our expected completion rate for an individual pass. 
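The model inputs listed above can be derived from the start and end coordinates alone. The sketch below only builds the features; the feature names and the angle convention (radians, measured from the direction of attack) are assumptions, and the model fitting itself is left out.

```python
import math


def pass_features(start, end):
    """Build the per-pass feature dict: start location, angle, and length."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    return {
        "start_x": start[0],
        "start_y": start[1],
        "angle": math.atan2(dy, dx),   # 0 means straight towards the opposition goal
        "length": math.hypot(dx, dy),  # straight-line pass distance
    }
```

Each dict would form one row of the training data, with the completed/incomplete outcome as the label.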

Example completions of three made-up passes are shown below. The values are higher than intuition might suggest because we are only considering open play, ground, non-headed passes:

If I were to attempt a similar analysis in future, I would likely use a 2D grid method for start/end location rather than a machine learning model, as the model arguably adds needless complexity and requires an extra level of validation.
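A grid approach of that kind could look something like the sketch below: bin the start and end locations into coarse cells and use the empirical completion rate for each cell pair. The 6 x 4 grid, the pitch dimensions, and the function names are all illustrative assumptions.

```python
from collections import defaultdict


def cell(x, y, n_x=6, n_y=4, pitch_length=120.0, pitch_width=80.0):
    """Map a pitch location to a coarse (column, row) grid cell."""
    col = min(int(x / pitch_length * n_x), n_x - 1)
    row = min(int(y / pitch_width * n_y), n_y - 1)
    return col, row


def fit_completion_table(passes):
    """Count completions per (start cell, end cell) pair.

    `passes` is an iterable of (start, end, completed) tuples.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [completed, attempted]
    for start, end, completed in passes:
        key = (cell(*start), cell(*end))
        counts[key][0] += int(completed)
        counts[key][1] += 1
    return counts


def expected_completion(counts, start, end):
    """Empirical completion rate for the cell pair, or None if never seen."""
    completed, attempted = counts.get((cell(*start), cell(*end)), (0, 0))
    return completed / attempted if attempted else None
```

The trade-off is resolution against sample size: finer cells describe passes more precisely but leave more cell pairs with few or no observations.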

We can now do the same as we did with the EPV analysis: compute the completion rate over all passes in each cluster and average them. An example progressive pass type, its EPV, and its expected completion (xP) are shown below:

Compare value gain and expected completion rate to identify safe/risky and low/high value progressive pass types

We’ve got our reward and we’ve got our risk, so we can now make some evaluations of the individual progressive pass types. For practicality’s sake, here is the pass type diagram again:

The most intuitive part of this analysis is that there is a direct linear relationship between value gain and drop-off in expected completion of each pass type, shown below:

In simpler terms: the more “dangerous” your pass selection in build up is, the riskier it is. You can also infer that some progressive pass types are marginally inferior with regard to risk relative to the value gained by playing them: primarily the straight passes from a wide defender into the winger.

It’s starting to look like Pep was right. This also offers some reassurance about the individual models used: the longer passes into higher/central areas are worth more EPV, and the shorter, straighter passes have higher xP, which follows expectation.

Quantify risk/reward of each pass type to produce a complete picture

The last stage is to produce a complete picture of the risk-reward of our passes. It is important to recognise that a turnover doesn’t just result in you no longer having the ball: it also means that your opposition has gained possession in a potentially dangerous area.

We can go about quantifying this by combining the EPV and xP concepts:

  • xEPV = expected rate of completion * EPV gain

This is the expected value gain for your team when playing a pass: how likely it is to be completed, multiplied by the gain.

  • Opposition xEPV = (1 - expected rate of completion) * opposition EPV gain

This is the expected value gain for the opposition when you are playing a pass: how likely it is to not be completed, multiplied by the gain for your opponent if they are in possession at the end location of the pass.

These two values allow us to compute the net xEPV of each pass type to isolate the passes that relatively gain you the most value while also factoring in the risk of losing the ball at the end location.
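Putting the two formulas together, a minimal sketch of the net calculation is below. It assumes net xEPV is simply xEPV minus opposition xEPV, which is the natural reading of the text; the function name and the example numbers are made up for illustration.

```python
def net_xepv(xp, epv_gain, opp_epv_gain):
    """Net expected value of attempting a pass.

    xp           -- expected rate of completion, between 0 and 1
    epv_gain     -- EPV gain for your team if the pass is completed
    opp_epv_gain -- EPV gain for the opposition if they win the ball
                    at the end location of the pass
    """
    xepv = xp * epv_gain                # your expected gain
    opp_xepv = (1 - xp) * opp_epv_gain  # the opposition's expected gain
    return xepv - opp_xepv              # assumption: net = difference of the two
```

With hypothetical inputs of xP = 0.8, EPV gain = 0.05 and opposition EPV gain = 0.02, the pass is worth 0.8 * 0.05 - 0.2 * 0.02 = 0.036 net.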

This can be nicely summarised in the graphic below:

As an aside, I also considered a different risk relationship, where we use xP squared in the net xEPV calculations. This is a loose attempt to try and include the future value associated with retaining possession and the fact that most managers don’t like giving the ball away in build up.
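Exactly where the squared term enters is not spelled out here, so the sketch below shows only one possible reading: xP raised to a power replaces xP in both halves of the net calculation. The function name, the `power` parameter, and the example numbers are assumptions.

```python
def net_xepv_weighted(xp, epv_gain, opp_epv_gain, power=2):
    """Net xEPV with xP raised to `power` (one possible reading).

    power=1 gives the plain risk relationship; power=2 penalises
    low-completion passes more heavily.
    """
    p = xp ** power
    return p * epv_gain - (1 - p) * opp_epv_gain
```

Squaring hits risky passes hardest: a 0.9 completion rate drops to 0.81, while 0.6 drops to 0.36, so the ranking shifts towards passes that are likely to be completed.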

This is shown below, and it paints a marginally clearer picture in my opinion:


Pleasingly, it is immediately clear that, of the shorter passes, the diagonal passes breaking the defence-to-midfield line are the best option with regard to risk-reward. You gain relatively more territory, and I suspect these passes are easier to receive because of the body shape of the receiver.

Short, straight passes are relatively less optimal, likely for the same reasons that Pep dislikes them: you don’t gain much ground, the receiver often comes under heavy defensive pressure from behind, and the ball usually ends up coming back into the defensive line via a bounce pass.

I believe that both of these conclusions also match common tactical intuition about those passes, which I hope lends some credence to the methodology. They also align with the ideal of the 3–2 build up shape, which provides the maximal number of diagonal passes via triangles between the defensive and midfield lines.

It is important to note that the number 1 ranked pass type according to both risk functions is the longest pass, and also the only pass that breaks directly from the defensive third into the offensive line. 

This feels logical to me: kicking it further carries far less risk if there is a turnover, with a large relative gain if you retain the ball. It reminds me of a quote from Daryl Morey, former GM of the Houston Rockets NBA franchise: “football cares too much about possession and not enough about territory”.

Judging from this (admittedly niche) analysis, there’s probably some truth to this. Packing the maximum number of players, minimising risk and maximising gain in every pass might not be the nicest to watch but it may well be the most efficient way of playing football. Valérien Ismaël’s Barnsley “skyball” is probably a good example of this.

So in summary: if you want to maximise risk-reward when building out from your own third, play diagonal balls into midfield when it’s on, but play long into your winger when it isn’t. Don’t play the ball straight into deep wingers.

At Analytics FC, we provide software and data services to entities within football looking to realise the gains possible from analytical thinking.

