The news traveled the world, although it barely registered in the Spanish media: the first case of computer espionage in the world of sport. A theft of statistical data! A few days ago «The Times» broke the story: Liverpool had hacked into the Manchester City database between June 2012 and February 2013.
City's suspicions arose when the club noticed Liverpool's interest in a youth player from Zaragoza whom City itself was tracking (Paolo Fernandes, who would end up signing for City). This and other similar cases, in which the "Reds" joined the bidding for possible reinforcements proposed by City's scouts, set off all the alarms. In fact, City felt the need to speed up signings such as those of Fernandinho and Jesus Navas to avoid greater evils.
A team of computer experts was hired and came up with the answer: the database had been hacked. The culprits? Michael Edwards, Liverpool's sporting director, along with two scouts who had previously worked for the "Citizens". With their help, Liverpool accessed City's database hundreds of times over eight months, staying one step ahead of City's moves.
City usually works with Scout7, a system that draws on an immense cloud of data (OptaPro) updated daily with the statistics of more than half a million footballers of all categories and countries.
The end of the story: a confidentiality agreement between the clubs and compensation of one million pounds (more than 1.1 million euros). But how much can simple data be worth? How important must statistics be for a renowned club to feel the need to step outside the law?
New horizons in football
Although it may seem otherwise, today I am not writing about football: this is an article about mathematics.
New technologies have changed this world completely and, with it, the way we understand sport. In the past, football teams signed players based on the opinion of their scouts. However, evaluating a player by watching him play will always be subject to the particular interpretation of the scout in question; that is, it will always be subjective. With the arrival of statistics in sport, subjectivity has given way to objective data (there is a reason mathematics is called an "exact science"). The old scouts are still listened to, the way one listens to a grandfather for the wisdom he has accumulated, but it is a player's statistics that define his contribution on the pitch.
"What nonsense, I could name hundreds of great players who don't stand out for their numbers!", you might say. That is true, at least with the data we are used to hearing.
What good is a forward who scores 20 goals per season if he needs 2,000 shots on goal to do it, or if he misplaces every pass? What good is a defender who recovers many balls if he always leaves his attacker unmarked? Football has always dealt in totals: total goals, assists, steals… However, over the last decade statistics have gained prominence. Now we study the percentage of successful passes, the percentage of shots on target, … and even heat maps that show the areas of the pitch a player has occupied throughout the match!
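To put numbers on the forward in the example, here is a minimal Python sketch of two such percentages. The function names are illustrative and the passing figures are invented for the example; only the 20 goals and 2,000 shots come from the text.

```python
def conversion_rate(goals: int, shots: int) -> float:
    """Fraction of shots that end in a goal (0.0 if no shots were taken)."""
    if shots == 0:
        return 0.0
    return goals / shots


def pass_completion(completed: int, attempted: int) -> float:
    """Fraction of attempted passes that reach a teammate."""
    if attempted == 0:
        return 0.0
    return completed / attempted


# The forward from the text: 20 goals from 2,000 shots.
forward_conversion = conversion_rate(20, 2000)

# Hypothetical passing line, invented for illustration only.
forward_passing = pass_completion(completed=12, attempted=40)

print(f"Conversion rate: {forward_conversion:.1%}")
print(f"Pass completion: {forward_passing:.1%}")
```

Twenty goals look rather different next to a conversion rate of 1%.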
Moreover, coaches themselves are learning to read statistics. This newspaper said as much in an article devoted to our coach and the importance of new technologies in his decision making.
However, football was born in old Europe and, as such, is reluctant to change. The pichichi still goes to the player with the most goals in total rather than the player with the most goals per minute. Each player's passes are still counted in total, and they count the same whether they go sideways, forward or backward, so the goalkeeper can end up as the team's top passer (as happened with Victor Valdes in many matches of Guardiola's Barcelona). And, above all, Cristiano Ronaldo is still allowed to take free kicks even though his success rate is ridiculous.
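The difference between the two rankings is easy to see in a short Python sketch. The two forwards and their figures are hypothetical, and scaling to goals per 90 minutes is just one convenient way of expressing goals per minute.

```python
from dataclasses import dataclass


@dataclass
class Scorer:
    name: str
    goals: int
    minutes: int

    @property
    def goals_per_90(self) -> float:
        """Goals scored per 90 minutes on the pitch."""
        return 90 * self.goals / self.minutes


# Hypothetical season lines, invented for illustration only.
scorers = [
    Scorer("Forward A", goals=20, minutes=3000),
    Scorer("Forward B", goals=15, minutes=1500),
]

# The traditional award looks at the raw total...
top_by_total = max(scorers, key=lambda s: s.goals)
# ...while a rate-based award looks at goals per unit of playing time.
top_by_rate = max(scorers, key=lambda s: s.goals_per_90)

print("Most goals:", top_by_total.name)
print("Most goals per 90 minutes:", top_by_rate.name)
```

With these invented numbers the trophy goes to Forward A, although Forward B scores more often for every minute played.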
In this sense, American sport is leading the way in the statistical update: there, any star is criticized if the numbers are unfavorable. But is American sport so different? How is the game studied across the pond?
The statistical procedure in American basketball
Let's take basketball as an example. As the sport became professional, decision making was made objective by seeking logical reasoning. For this, the exact science, mathematics, was used.
How is a player's performance valued without having to watch him play? Next we will describe some of the different formulas that exist to assess a player's contribution on the court, from European simplicity to American accuracy.
Of course, in both (Europe and the US) there is one parameter that is recorded in the same way: the difference between points scored and points conceded by the team while the player is on the court. With this exception, the rest of the analyses use different mathematical expressions. In describing them, we will use the following abbreviations:
Pts = points
Reb = rebounds
Asi = assists
Rob = steals
Tap = blocks
F = fouls
Pér = turnovers
TC = field goals
TL = free throws
… R = received
… C = committed
… F = missed
… I = attempted
… O = offensive
… D = defensive
… A = made
We will go from the most elementary to the most sophisticated, from Europe to the US:
Valuation, Performance Index Rating or PIR (Europe): the positive actions are added and the negative ones are subtracted.
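As a minimal sketch in Python, assuming the commonly published form of the PIR (points, rebounds, assists, steals, blocks and fouls received on the positive side; missed shots, turnovers, blocks received and fouls committed on the negative side). The parameter names are illustrative and the box score is invented.

```python
def pir(*, points, rebounds, assists, steals, blocks, fouls_received,
        missed_field_goals, missed_free_throws, turnovers,
        blocks_received, fouls_committed):
    """Performance Index Rating: positive actions minus negative actions."""
    positive = (points + rebounds + assists + steals
                + blocks + fouls_received)
    negative = (missed_field_goals + missed_free_throws + turnovers
                + blocks_received + fouls_committed)
    return positive - negative


# Hypothetical single-game box score, invented for illustration only.
game_pir = pir(
    points=20, rebounds=8, assists=5, steals=2, blocks=1, fouls_received=4,
    missed_field_goals=6, missed_free_throws=2, turnovers=3,
    blocks_received=1, fouls_committed=2,
)
print("PIR:", game_pir)
```

Every action weighs exactly one unit, which is what makes this the most elementary of the formulas.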
Efficiency or EFF (USA): a formula similar to the valuation, but relative to the games played; that is, it reflects something of the player's history and evolution.
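A minimal Python sketch, assuming the commonly published form of EFF: season totals of positive actions minus missed shots and turnovers, divided by games played. The parameter names are illustrative and the season line is invented.

```python
def eff(*, points, rebounds, assists, steals, blocks,
        missed_field_goals, missed_free_throws, turnovers, games_played):
    """Efficiency: net contribution over the season, per game played."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    net = (points + rebounds + assists + steals + blocks
           - missed_field_goals - missed_free_throws - turnovers)
    return net / games_played


# Hypothetical totals over ten games, invented for illustration only.
season_eff = eff(
    points=200, rebounds=80, assists=50, steals=20, blocks=10,
    missed_field_goals=60, missed_free_throws=20, turnovers=30,
    games_played=10,
)
print("EFF:", season_eff)
```

Unlike the valuation, fouls do not appear in this version, and the division by games played turns a single-game snapshot into a season average.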
Game Score (USA):
On this occasion we observe that, except for points scored, steals and turnovers (totally objective data), the rest of the parameters are multiplied by a correction factor, in an attempt to come closer to reality, to objectivity.
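A minimal Python sketch of Game Score, assuming the commonly published coefficients attributed to John Hollinger. The parameter names are illustrative and the box score is invented.

```python
def game_score(*, points, field_goals_made, field_goals_attempted,
               free_throws_made, free_throws_attempted,
               offensive_rebounds, defensive_rebounds,
               steals, assists, blocks, personal_fouls, turnovers):
    """Game Score: a weighted sum of the box score for a single game."""
    missed_free_throws = free_throws_attempted - free_throws_made
    return (
        points                             # weight 1
        + 0.4 * field_goals_made
        - 0.7 * field_goals_attempted
        - 0.4 * missed_free_throws
        + 0.7 * offensive_rebounds
        + 0.3 * defensive_rebounds
        + steals                           # weight 1
        + 0.7 * assists
        + 0.7 * blocks
        - 0.4 * personal_fouls
        - turnovers                        # weight 1
    )


# Hypothetical single-game box score, invented for illustration only.
score = game_score(
    points=20, field_goals_made=8, field_goals_attempted=15,
    free_throws_made=3, free_throws_attempted=4,
    offensive_rebounds=2, defensive_rebounds=6,
    steals=2, assists=5, blocks=1, personal_fouls=2, turnovers=3,
)
print(f"Game Score: {score:.1f}")
```

Only points, steals and turnovers enter with a weight of one; everything else is scaled, and an offensive rebound counts for more than a defensive one.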
Player Efficiency Rating or PER (USA): measures a player's performance per minute. It is then adjusted to the team's pace and normalized (a mathematical word) to the league. The formula is so complicated that it would not fit in this section, but we will highlight its virtues:
– By normalizing to the league, the advantage of players on teams with faster attacks (more possessions per game and more chances to do good deeds) is evened out against players on slower teams.
– The league-average PER is always set at 15, which allows players from different eras to be compared on equal terms. In fact, there is a reference guide for evaluating a player's season:
* Wilt Chamberlain holds the record with 31.82.
Although it is a fairly widespread term, let us clarify that the acronym MVP (Most Valuable Player) designates the most valuable player, a distinction awarded to the player, or the players of a team, who stood out most over a whole championship or in a specific competition.
– PER disadvantage: defense does not carry much weight in this formula, so defensive players do not come out well.
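The full PER formula is not reproduced here, but its last two steps can be sketched in Python under stated assumptions: the per-minute rating is taken as a given input, the pace adjustment is a simple ratio of league pace to team pace, and the result is rescaled so that the league average equals 15. All names and numbers below are illustrative.

```python
def pace_adjust(per_minute_rating: float, team_pace: float,
                league_pace: float) -> float:
    """Scale a per-minute rating so fast and slow teams are comparable."""
    return per_minute_rating * league_pace / team_pace


def normalize_to_league(adjusted_rating: float, league_average: float,
                        target: float = 15.0) -> float:
    """Rescale so that a league-average player lands exactly on `target`."""
    return adjusted_rating * target / league_average


# Hypothetical inputs, invented for illustration only.
LEAGUE_PACE = 95.0     # possessions per game, league-wide
LEAGUE_AVERAGE = 0.45  # league average of the pace-adjusted rating

# Two players with the same raw rating, on a fast team and a slow team.
fast_team_player = pace_adjust(0.50, team_pace=100.0, league_pace=LEAGUE_PACE)
slow_team_player = pace_adjust(0.50, team_pace=90.0, league_pace=LEAGUE_PACE)

fast_per = normalize_to_league(fast_team_player, LEAGUE_AVERAGE)
slow_per = normalize_to_league(slow_team_player, LEAGUE_AVERAGE)

print(f"Fast-team player: {fast_per:.2f}")
print(f"Slow-team player: {slow_per:.2f}")
```

With identical raw ratings, the player on the slower team ends up with the higher figure, because he produced the same amount with fewer possessions.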
There is no perfect statistic: there are methods that do not take into account minutes played, well-executed passes, screens, defensive ability, intimidation, … Using a single measure would be a mistake, because each one offers a different point of view, and studying them all together gives us a global view of each player's performance. In the end, statistics are simply data. They are a guide, they help us get an idea… But the coach's subjective eye will always have the last word.
The problem is that in football players are defined by their moves, while in basketball they are defined by their numbers. Archaic subjectivity versus mathematical objectivity.
Mastering advanced statistics is mastering sport. But who will listen to me if I am a simple mathematician? Maybe we should give the ball fewer kicks and the Excel keyboard more keystrokes…
Subjectivity died when sport became numbers. The dark clouds of the opinions of archaic scouts gave way to the transparent clouds of data.
Diego Alonso Santamaría is a mathematician. To keep the conversation going, write to: email@example.com
The ABCdario of Mathematics is a section that arises from a collaboration with the RSME Outreach Commission.