ProspectXI – SCOUTING WITH STATISTICS

By: Aaron Nielsen (@ENBSports)

With the OPTA Pro Conference and MIT Sloan Sports Analytics Conference soon upon us, the New Year is a great time to look at the analytical world of football and how the industry developed over the past calendar year. I feel 2015 was a great year for the industry. Many new and younger faces established a greater presence in 2015, but in my opinion the greatest addition is people who have a passion for the game of football and are asking the key question: how can analytics help and develop the game?

OPTA recently announced the participants for the 2016 OPTA Pro Conference in February; most of the presenters are people I’m aware of and follow regularly on Twitter. I was asked to present a poster for my own abstract and was flattered to be considered, but I couldn’t travel from Toronto, Canada to London, UK at that time. The proposal I submitted was called “The B Team Option”. It looked at how clubs run their reserve teams and at the difference between countries that have traditional reserve leagues and those, such as Spain, Germany, France and now MLS, where clubs play “B” teams in the lower divisions of their own domestic leagues. I was able to collect a great amount of data from Spain, Germany and France, have found some interesting trends in it, and hope to release some of this work soon.

“The B Team Option” is part of a larger body of work I have been developing over the last few years, which includes the launch of ProspectXI.com and a focus on lower, reserve, school, youth, and grassroots leagues. The reason for this work is that I know, from my experience in football and other major professional sports, that talent isn’t identified at 26.9 years of age, the average age of players in the English Premier League, but at a much younger age and in leagues that have very limited coverage, information or scouting possibilities.

This is a bit different in North American team sports, where information and coverage are kept for most college teams as well as high school and youth leagues. With some research, player data is attainable for sports such as ice hockey, baseball, American football and basketball. Players are drafted at 18 years old in hockey and baseball with a fairly extensive record behind them, and players in all four sports are recruited into college on a similar basis. To allow soccer in North America to compete with these sports, I started building my own database for the United States and Canada, covering players in established academies and clubs from the age of 14. Over the past four years I have also collected data for youth and reserve leagues in England, Germany, France, and Italy.

Data for these leagues is limited, and in most cases includes only Games Played, Games Started, Minutes Played, Goals, and disciplinary Cards. For some leagues I have also been able to get game commentary and collect goal details, Assists, and shot information. I feel this information is useful for initial analysis. I have recorded similar data for close to 60 professional leagues around the world over the past 20+ years and have identified talent through analysis of this data, including Jamie Vardy while he was in the English Conference.

Analytics in football has developed as companies such as OPTA, Prozone, STATS Inc and others bring new data and metrics to the conversation, while the people who work with this data have demonstrated new ways of using it. This development in the industry has opened up my own thinking about which information is useful or vital when scouting players through statistics, while also keeping it realistic to record for leagues that have less media coverage or fewer resources than the traditional leagues covered by the main media sources.

Still, the most important aspect of football analytics is identifying a player’s offensive contribution. Having covered the careers of 100,000+ players in close to 1,000,000 games, I’m confident that Goals, Assists, and shots are good keys for evaluating a player’s offensive contribution. I tend to concentrate on direct Assists, as I find the definitions of Key Passes not specific enough. For shots I concentrate on Shots on Target (SOT) rather than shots in general, which many use, or Expected Goals (xG). I do see the promise of Expected Goals, but find Shots on Target captures the performance of the player in more depth at the moment. That said, I’m excited to see how xG models develop, along with further work on “Danger Zone” analysis.

The next goal was to measure the influence of a player on the pitch beyond generating offence, and the most common-sense approach is the total number of Touches (Touch) a player has in the game. This stat is more difficult to collect than the other information, but I was able to do it while covering games and tracking the other information I record. There could be ways of simplifying touches, for example having an individual possession percentage for each player like we have for teams, or counting only key touches. Touches reflect the team’s performance as much as the individual player’s, but they allow us to look at a player’s role at a club and are a great tool for comparing similar players in the same league.
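As a rough illustration of the "individual possession percentage" idea, one possible reading is a player's share of the team's total touches. This is only a sketch under that assumption; the function name `touch_share` and the example numbers are hypothetical and not taken from the ProspectXI data.

```python
def touch_share(player_touches: int, team_touches: int) -> float:
    """Return the player's touches as a percentage of the team's touches.

    Assumption: "individual possession percentage" is approximated by
    the share of team touches; other definitions are possible.
    """
    if team_touches <= 0:
        raise ValueError("team_touches must be positive")
    return 100 * player_touches / team_touches


# Hypothetical example: 80 touches out of a team total of 640.
print(touch_share(80, 640))  # 12.5
```

A share like this removes some of the team effect, since a player on a high-possession side is measured against that side's own total.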

The other key stat for me is Turnovers (TO), as seen in other sports such as basketball. Turnovers in football can vary depending on how you want to evaluate lost possession, and can include every pass that doesn’t find a teammate, offsides, fouls and shots that miss the target, as these all lead to a potential loss of possession. I’ve limited turnovers to cases where the loss is directly the player’s fault: either the player loses an uncontested ball, or the player’s pass is directly intercepted, ending the possession and opportunity for the player’s team and giving possession to the opponents.
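The narrow turnover definition above could be applied to a hand-recorded event log roughly as follows. This is a minimal sketch: the event labels (`lost_uncontested_ball`, `pass_intercepted`, and so on) and the player names are hypothetical placeholders, not the labels used in the actual database.

```python
from collections import Counter

# Hypothetical labels for the two cases counted as a turnover here:
# losing an uncontested ball, or having a pass directly intercepted.
DIRECT_TURNOVERS = {"lost_uncontested_ball", "pass_intercepted"}


def count_turnovers(events):
    """Count direct turnovers per player from (player, event_type) pairs."""
    counts = Counter()
    for player, event_type in events:
        if event_type in DIRECT_TURNOVERS:
            counts[player] += 1
    return counts


# Hypothetical match log: the off-target shot and the offside are
# ignored under the narrow definition.
log = [
    ("Player A", "pass_intercepted"),
    ("Player A", "shot_off_target"),
    ("Player B", "lost_uncontested_ball"),
    ("Player A", "lost_uncontested_ball"),
    ("Player B", "offside"),
]
print(count_turnovers(log))
```

Widening the definition would only require adding labels to the set, which makes it easy to compare the narrow and broad counts for the same player.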

One of the biggest issues in football is how to evaluate defense and defensive-minded players. I have looked at ideas such as +/- in terms of goals, shots and possession differential while the player is on the pitch, although again this reflects the club as much as the player. The one stat that I have found interesting is Defensive Contribution (DC), which in my work is the sum of interceptions, clearances, blocked crosses and blocked shots. I think the total represents the defensive player’s work rate and also how the club itself approaches defense tactically.
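Defensive Contribution as defined above is a simple sum of four counts. The sketch below shows one way to hold those counts together; the class name `DefensiveActions` and the example figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DefensiveActions:
    """Raw defensive counts for one player (hypothetical container)."""
    interceptions: int
    clearances: int
    blocked_crosses: int
    blocked_shots: int

    def defensive_contribution(self) -> int:
        """DC = interceptions + clearances + blocked crosses + blocked shots."""
        return (
            self.interceptions
            + self.clearances
            + self.blocked_crosses
            + self.blocked_shots
        )


# Hypothetical single-game line for a centre-back.
game = DefensiveActions(interceptions=3, clearances=5,
                        blocked_crosses=1, blocked_shots=2)
print(game.defensive_contribution())  # 11
```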

The final key stats are Fouls and disciplinary Cards. When it comes to Fouls there are Fouls Conceded and Fouls Suffered, but in working with coaches and other scouts the metric with the most interesting results has been the total number of Fouls the player has been involved in, as this shows either how aggressive the player is or how actively the player looks for contact. The disciplinary record of Yellow and Red Cards shows whether the player is a disciplinary risk, but also how involved the player is in that aspect of the game.

These are general player statistics, and each player can be examined further by looking at all of the player’s individual statistics and analyzing Goals, Assists, Touches, Turnovers, Defensive Contribution, Fouls and Cards alongside the opponent, circumstances and game situation. Other accumulated metrics of interest include successful passes in the final third, duels, positioning on the pitch, as well as a player’s athletic and vital details.

We will be posting examples of this work from the English Premier League and German Bundesliga for the 2015-2016 season up to January 5th, 2016, and the final data for the 2015 MLS season. The data was gathered from a number of sources, not as a criticism of any one source, but to show that collecting this style of information is doable without large costs and without an established data provider, as the goal is for this work to be reproducible for all leagues of football.

Neither ProspectXI nor I are a data-providing company, and we do NOT provide data in real time. We are a scouting service that gathers each player’s complete statistical and informational record every season and analyzes players’ potential through this information, as well as through traditional scouting tools such as video and seeing players play live. Our current database includes over 100,000 players worldwide.

For more information regarding ProspectXI, please contact us at info@prospectxi.com or kamal@ProspectXI.com and follow us on Twitter at @ProspectXI.

The following are Bayern München’s club statistics for the first half of the 2015-2016 German Bundesliga season.

Bayern

Data is averaged out over 90 minutes of play.
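For readers who want to reproduce the per-90 scaling on their own totals, a minimal sketch follows. It assumes the usual convention of multiplying a season total by 90 and dividing by minutes played; the function name `per_90` and the example numbers are hypothetical.

```python
def per_90(total: float, minutes_played: int) -> float:
    """Scale a raw total to a rate per 90 minutes of play."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return total * 90 / minutes_played


# Hypothetical player: 12 Shots on Target in 1,350 minutes.
print(round(per_90(12, 1350), 2))  # 0.8
```

Rates for players with very few minutes are unstable, so a minimum-minutes cutoff is worth considering before comparing players.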