Baseball or Soccer: Which is More Afraid of the Numbers?

“They hate what they don’t understand.”—Sean “Diddy” Combs

I’ve never been good at math. Or at least that’s the attitude I’ve carried with me since 1st grade. In elementary school, my parents sent me to a tutoring center two times a week. I’d do endless sheets of arithmetic problems for an hour, and then go home and do more. My mental math was on point, but it always took me longer than the rest of the class to “get it.” I needed individual attention, but was often too ashamed to ask for it. To this day, I still can’t do long division problems.

When you grow up with an aversion to numbers, you get nervous whenever they’re presented in a decision-making situation. Adding up the change in your pocket at the deli counter isn’t easy. Simple accounting problems are stressful when they shouldn’t be. Deciding whether Mike Trout deserves to be MVP based on something called WAR, or choosing between Luis Suarez and Robin van Persie by comparing Chance Conversion rates, feels like rocket science.

Although I’m never excited to do a math problem, I enjoy analyzing sports statistics. In 6th grade I started carrying a Baseball Prospectus in my backpack. I would pore over the annual editions of the mammoth book in my spare time—the book felt as close to the truth about baseball as any analysis could be. Full of advanced baseball statistics and player projections, it felt like the be-all and end-all of the upcoming baseball season. Why even bother with watching games? Baseball Prospectus already projected them. In my thirst for the truth about baseball, the “outsider’s” knowledge and perspective found in the Baseball Prospectus books felt indisputable, and it was all coming from guys who had never been on a scouting trip.

I believe there are plenty of sports fans and writers out there who take their “I’m bad at math” attitude and flip it into a dismissal of baseball’s sabermetrics and soccer’s Opta statistics. People are just afraid of the numbers.

Sabermetrics, a term derived from SABR (the Society for American Baseball Research), has endured a decade-long battle for acceptance in baseball’s mainstream consciousness, starting with its grand introduction in Michael Lewis’ 2003 bestseller Moneyball. (That’s not to forget the two decades of work Bill James did before Moneyball was even drafted.) Over the years, more telling statistics rooted in sabermetrics, such as On-Base Percentage (OBP) and On-Base Plus Slugging (OPS), made their way into box scores and programs. These statistics are easy for any fan to understand and calculate, but still give more insight into a player’s performance than just batting average and home runs. More complicated sabermetrics were left for Baseball Prospectus books and blogs.
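To show how little math is actually involved, here is a minimal Python sketch of the standard OBP, slugging, and OPS formulas. The stat line at the bottom is invented purely for illustration and doesn’t belong to any real player, and the function names are my own.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    times_on_base = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch + sac_flies
    return times_on_base / opportunities


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """SLG = total bases / at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


# Hypothetical season: 500 at-bats, 150 hits (100 singles, 30 doubles,
# 5 triples, 15 home runs), 60 walks, 5 hit-by-pitches, 5 sacrifice flies.
obp = on_base_percentage(hits=150, walks=60, hit_by_pitch=5,
                         at_bats=500, sac_flies=5)
slg = slugging_percentage(singles=100, doubles=30, triples=5,
                          home_runs=15, at_bats=500)
ops = obp + slg  # OPS is simply the two numbers added together

print(f"OBP {obp:.3f}  SLG {slg:.3f}  OPS {ops:.3f}")
# prints: OBP 0.377  SLG 0.470  OPS 0.847
```

Nothing here goes beyond addition and division, which is part of why these two stats crossed over into the mainstream before the heavier metrics did.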

The crossing over of sabermetrics into the sporting mainstream peaked with a movie adaptation of Moneyball, and finally hit SportsCenter through the debate over the 2012 American League MVP award.

The 2012 AL MVP came down to two candidates: Los Angeles Angels rookie outfielder Mike Trout, and Detroit Tigers third baseman Miguel Cabrera. Through no fault of their own, the two players came to symbolize two different schools of thought in baseball—two schools which were infamously pitted against each other in Lewis’ Moneyball.

Trout or Cabrera: Who should’ve won MVP?

In one corner were “old school” baseball traditionalists. These writers and fans believed that Cabrera was the natural choice for MVP, because in 2012, he was the first player since Carl Yastrzemski in 1967 to lead the league in batting average, home runs, and RBIs. That made him the first Triple Crown winner in over four decades, and in old school circles, a deserving MVP. He was the best player on a playoff team (Trout’s Angels failed to make the playoffs, despite winning one more game than Cabrera’s Tigers), and achieved a season of historic proportions.

Across the debate were the nerds. Baseball’s statistical revolution, popularized by Moneyball, had revealed a bevy of telling metrics to analyze players with. The statistic at the center of the argument for Trout was Wins Above Replacement (WAR), which is a calculation for how many more wins a player contributed to his team than a “replacement level” player would’ve. According to FanGraphs, Trout posted a 10 WAR (the highest WAR by a center fielder since Willie Mays in 1964), meaning he was worth 10 more wins to the Angels than, say, the Baltimore Orioles’ Mark Reynolds, who posted a WAR close to zero. Cabrera’s 6.9 WAR lagged behind both Trout and New York Yankees second baseman Robinson Cano, whose 7.8 WAR was good for second. Much was made of Trout’s more complete impact compared to Cabrera. Trout stole 45 bases while Cabrera stole 4, and Trout’s fielding was regarded by observers and statisticians to be far superior to Cabrera’s. Cabrera may have been a better pure hitter in 2012, but Trout’s base running and fielding put him over the top.

Through a rounded statistical argument, it’s clear that Trout was a better overall player than Cabrera in 2012. But for many writers and fans, the debate started and ended with Miguel Cabrera’s Triple Crown win. The Triple Crown is a distinction that’s become shrouded in mysticism and improbability. The likes of Barry Bonds, Albert Pujols, Alex Rodriguez, Ken Griffey Jr.—the preeminent hitters of my generation—have all failed to win the Triple Crown. There’s no physical trophy for the Triple Crown, and it’s an “award” based on three statistics that mean less in a world of sabermetrics, but it’s still guarded by tradition, nostalgia, and Cracker Jack boxes. Much of American baseball’s popularity and interest comes from its history, record books, and old-time lore. The Triple Crown is a part of that, and for it—for baseball’s past—to be defended as a significant part of today’s game, Cabrera had to win the MVP. He ended up garnering 22 of the 28 first-place votes from the Baseball Writers’ Association of America. Trout got the remaining six.

“Call me old-fashioned but, if you win the Triple Crown and lead your team to the playoffs, you’re probably going to get my MVP vote.” —USA Today writer Jorge L. Ortiz

Besides, who the hell knows how to calculate WAR anyway? FanGraphs gives a fairly simple explanation of the statistic and how it’s calculated, but still—anytime you wander into unfamiliar and potentially complicated territory populated by guys like Nate Silver (the statistician who perfectly predicted the 2012 Presidential Election state-by-state) you’re going to be intimidated and hesitant to accept something new and different.

When the traditional box score statistics of batting average, home runs, RBIs, runs, and stolen bases are being challenged, marginalized, and perhaps overtaken by something so superficially convoluted as WAR and a whole host of other advanced metrics, writers and fans who didn’t grow up with these new statistics are going to be resistant to adaptation. They’re scared of the numbers because they don’t fully understand them, and as seedy old writers, they can’t be bothered to change. These are writers who are covering a sport that didn’t implement instant replay for umpires until 2008, despite the NFL using it since 1986, and the technology for it existing since the 1960s. The league and its media are the furthest thing from progressive. Yet the future is covered in data, which can be difficult to sort through if you’re not inclined to embrace something you don’t fully understand.

The same reluctance to fully embrace advanced statistics in baseball is currently being played out in professional soccer. Up until a few years ago, there were exactly five ways to quantify a player’s performance on the pitch. Goalies were judged by their saves and clean sheets, defenders by their tackles and clean sheets, and midfielders and forwards by their goals and assists. With only so many goals, assists, and tackles happening per match, it was difficult to gauge a player’s value. How could a player like Real Madrid’s Xabi Alonso, who plays in a deep-lying midfield role and doesn’t rack up many tackles, goals, or assists, have his impact quantified? There was no statistic for controlling the tempo of the midfield.

In the past five years, that’s changed. Opta, a sports data company founded in 1996, has seen its visibility skyrocket as access to its information has become more public. The company tracks every movement in a match to sort out dozens of different statistics for players. It has an iPhone app so fans can observe Opta’s data, a website so fans can dig deeper into the data, and a Twitter account so fans can see the most notable data from match-day.

Now, stats like saves, tackles, clean sheets, goals, and assists no longer make up the entire profile of a player or club—they’re just parts of a bigger, more contextualized picture. EPLIndex.com gives subscribers access to 11 different types of statistics with further statistics within those types. For example, the statistics under the “Attacking” category are much more than goals and assists. Assists are cute, but they’re the RBI of soccer—a statistic that’s dependent upon another player. They’re not an independent reflection of an individual player’s performance. A good pass into the right area must be made for an assist to be possible, but that pass still depends on another player to finish the movement and score the goal before the passer gets the assist. Why bother with assists when Attacking statistics like Chances Created and Clear Cut Chances Created truly measure the creativity of a player? Those two metrics give credit to the attacker for creating the chance even if the player on the other end of the pass doesn’t convert for a goal.

And is it enough to simply say that the Golden Boot winner is the best striker? What if that striker takes an inefficient number of shots to score? According to EPLIndex’s database on April 12th, Liverpool’s Luis Suarez leads the English Premier League with 22 goals this season. It took him a league-leading 130 shots to get there, putting his Chance Conversion rate at 17% and Shot Accuracy at 48%. Meanwhile, Manchester United’s Robin van Persie has scored 20 goals, but only on 94 shots for a Chance Conversion rate of 20% and Shot Accuracy of 54%. While the goal count implies that Luis Suarez is the better striker, a more in-depth look at the Opta statistics shows that van Persie is a more efficient, clinical striker.
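For anyone who wants to check the comparison at home, here is a rough Python sketch. It assumes Chance Conversion is simply goals divided by total shots; that definition is my assumption, and EPLIndex may count shots differently, so the results need not match the quoted percentages to the point. The function name is made up for this example.

```python
def chance_conversion(goals, shots):
    """Percentage of shots that became goals (assumed definition)."""
    if shots == 0:
        return 0.0  # avoid dividing by zero for a player with no shots
    return 100.0 * goals / shots


# Goal and shot totals as quoted above: (goals, shots)
strikers = {
    "Luis Suarez": (22, 130),
    "Robin van Persie": (20, 94),
}

rates = {name: chance_conversion(goals, shots)
         for name, (goals, shots) in strikers.items()}

# The striker who needs the fewest shots per goal
most_clinical = max(rates, key=rates.get)

for name, rate in rates.items():
    print(f"{name}: {rate:.1f}% of shots converted")
print(f"More clinical: {most_clinical}")
```

Under this simple definition the ordering is the same as in the EPLIndex figures: Suarez has more goals, but van Persie gets more out of each shot.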

When I first read Moneyball, it wasn’t the characters, stories, business methods, or introduction of sabermetrics that grabbed my attention. It was the notion of objective vs. subjective thinking in situations where it was possible to think objectively. There’s no objective way to discuss how good a hip-hop album is—your ears either like it or they don’t. It’s personal taste. But when it comes to sports, observing a player isn’t enough when objective statistics exist.

Liverpool midfielder Joe Allen may not win plaudits for his aesthetic play. Critics say he’s too small to be useful defensively, and he doesn’t pass the ball forward enough to contribute to the attack. During any one of his matches this past season, it’d be easy to say that he had a bad game, because he doesn’t make the unlocking passes or the grinding tackles. He doesn’t catch the eye. But his statistics show that he’s 3rd in the EPL in Minutes Per Possessions Won, while boasting a 90% Pass Accuracy with 30% of his passes going forward. The eye doesn’t tell the whole story.

Much of the aversion to Opta statistics among soccer fans and writers is due to a misunderstanding of the term “Moneyball” and what it means. Real Moneyball is when players within a market find inefficiencies in that market and exploit them for gain. In the book Moneyball, the Oakland Athletics, led by General Manager Billy Beane, saw that college prospects and fringe players who had good OBPs were being undervalued, so they exploited those two areas (among many others) to field a winning ball club.

When Liverpool were taken over by Fenway Sports Group (FSG), they were labeled by the media as soccer’s new Moneyball club. After all, FSG also owns the Boston Red Sox, who had won two World Series titles during General Manager Theo Epstein’s tenure. Epstein, a Yale University graduate, was known in baseball circles as someone in tune with sabermetrics and the ideals of Moneyball. FSG were seen as the owners to bring a Moneyball philosophy to Liverpool.

Shortly after buying the club in fall 2010, FSG appointed Damien Comolli as their Director of Football Strategy. In 2011, he signed Andy Carroll, Luis Suarez, Charlie Adam, Jordan Henderson, and Stewart Downing for over £100 million in transfer fees. Except for Adam and Downing, all of Comolli’s signings were under 23 years old—such a tremendous outlay of cash for young players raised a few eyebrows, especially the respective £35 million and £20 million fees for Carroll and Henderson (Carroll’s was the largest figure ever paid for a British player).

Moneyball Man Damien Comolli

Comolli’s new class of signings intensified Liverpool’s Moneyball label. There were two strands of logic behind the signings that connected them to Moneyball, albeit incorrectly. The first was the signing of young players for high fees in the hope that over the long term, the lower wages and high performance of the players as they entered their prime would justify the price tag. Comolli signed Carroll, Suarez, and Henderson to be at the club (hopefully) for the next decade, and over a period of time thanks to debt amortization, those transfer fees would be seen as appropriate. Comolli was paying the price for future performance. He was treating his signings like stocks, buying early for a big payout down the road.

The second idea was the use of Opta statistics to determine ideal transfer targets. In previous years, Liverpool had struggled to create goal-scoring opportunities. The signings of Henderson, Adam, and Downing were supposed to rectify Liverpool’s offensive woes. All three midfielders were in the Top 12 of the EPL’s Chance Creators from the previous season, and were all viewed as excellent passers and crossers of the ball. Their crossing, combined with Carroll’s outstanding heading ability (46% of his goals were headers—the second highest proportion in the EPL that year), and Luis Suarez’s dual threat as a creator and goal scorer should’ve made for an attacking juggernaut.

Although Comolli was exploiting exactly zero inefficiencies in the transfer market (promising young players and creative midfielders are consistently in high demand; I’m not sure they’ve ever been out of favor), it was seen as a Moneyball approach, because he used financial techniques and Opta statistics to decide on his signings. Joe Hall, writing for the popular website Sabotage Times, published an article in April 2012 titled “Damien Comolli: Here’s Why The Moneyball Philosophy Was Never Going To Work At Liverpool.” It’s perhaps the finest example of the misunderstanding of Moneyball. He discusses the film more than the book, and writes, “Football, however, is a vastly different sport to baseball and the sport is still some distance away from fully embracing the ‘moneyball method’…to what extent can this model, of recruiting and deploying players based solely on statistical data, be applied to football?” (At that point in the article, I lit myself on fire.)

Through a misinterpretation of Moneyball’s ideals, Moneyball suddenly meant using statistics to build a team—a gross oversimplification to say the least. In the case of the Oakland A’s, an undervaluation of certain statistical categories was the market inefficiency they observed. Because of that, statistics and Moneyball were lumped together. In a Bizarro Baseball World, that inefficiency could be quality scouting in a market dominated by only statistical analysis, but the principles of Moneyball would be the same. They’d still be exploiting an undervalued area for their benefit. Comolli did no such thing.

The Moneyball headline was further perpetuated when none other than Billy Beane sang the praises of Comolli’s work at Liverpool. In an interview with The Daily Mirror, Beane spoke about his friendship with Comolli, and defended the signings of Carroll, Henderson, and Co. With Mr. Moneyball himself publicly siding with Comolli, Liverpool was forever stamped as the Moneyball Club—a team built on statistics and clever accounting.

Comolli was fired seven months after the Beane interview. The season following his £100 million spending spree, Liverpool continued to struggle in front of goal, and limped to an 8th-place finish. Comolli was lambasted for spending so drunkenly and failing to improve the squad, and soccer fans the world over instantly became skeptical of Opta statistics. As it turns out, Carroll, Henderson, Downing, Adam, and Suarez, despite all of their previous metrics pointing to a new team full of creativity and goal-scoring ability, didn’t fit together tactically. The Chances Created statistic is useful, but unless the players are put in proper position tactically, they won’t be able to create. The supposed Moneyball Club built on numbers was undressed by tactical naiveté. The eye actually told more of the story than Liverpool paid attention to.

“You want to make sure you are getting more value than you are paying.”—Billy Beane on Comolli’s signings for Liverpool

Those who were initially skeptical of Comolli’s methods were vindicated. Don’t leave a number cruncher to do a football man’s job. Those who had admired Comolli (myself included) were left without a good answer—only tactical excuses. As was the case in baseball, soccer is now struggling to bring credibility to its own statistical revolution, because of the one-off failure of Liverpool’s falsely identified Moneyball Experiment.

In actuality, every EPL club uses some form of Opta statistics and advanced data tracking to assess themselves and their transfer targets. They all employ statisticians and data analysts, but those departments are less visible than the one Comolli ran at Liverpool. Last summer, defending EPL Champions Manchester City open-sourced all of their Opta data from the previous season. Liverpool weren’t the only club to use Opta statistics to build their squad—the title winners did too, along with the other 18 clubs. On Opta’s website, they list the 122 soccer and rugby clubs they work with, including Barcelona, Liverpool, Chelsea, Borussia Dortmund, the MLS, the Italian and Dutch soccer federations, and all of the United Kingdom leagues.

In the media and in our own conversations, there’s been extreme resistance to statistics in both baseball and soccer. Both are old games tied to the cultures and histories of the United States and Europe. Both have been analyzed using only a handful of statistics and subjective observation. Both are two decades too late in their current implementation of instant replay. Both are controlled by an old guard—soccer by FIFA’s fossils in Zurich, and baseball by the Writers’ Association’s nostalgic hacks.

Many of the fans of each sport grew up looking at the same statistics: goals and assists, home runs and RBIs. Progressive thinking and new ways to evaluate players were always going to be held back in two old-timers’ games, but a breakthrough is inevitable. The 2012 AL MVP discussion brought attention to WAR and the logic behind it—soon enough, we’ll be seeing WAR on baseball cards and in programs. Although Comolli’s Liverpool failed, the negative perception it gave Opta statistics can only last for so long, especially as successful clubs like Manchester City develop their public databases, and websites like EPLIndex and WhoScored? rise in viewership.

Nobody ever won an argument in a bar by opening up EPLIndex’s database and running through Joe Allen’s possession stats. It’s easier to yell “he’s crap” and move on. Listing the WAR and UZR of baseball players never decided a water cooler debate at work. Triple Crown numbers are more familiar. I prefer to let the WAR and Opta discussion play out in the one place they actually matter: the field. It’s harder to fear the numbers when they mean wins and losses.

Follow Justin on Twitter @jblock49

6 thoughts on “Baseball or Soccer: Which is More Afraid of the Numbers?”

  1. I am going to preface what I am about to say by pointing out that my favorite team (Arsenal) are run by an economist and that my favorite soccer book is Soccernomics.
    However, statistics cannot be applied to soccer as they have in baseball. Opta is (mostly) great (they have problems with their possession stats). They give you all sorts of information, like which passes were completed, what runs were made, average position, etc. But what it does not provide is discrete statistics that can be accurately used to predict a player’s success. Baseball players attempt to hit the ball and get to base. Repeat. Soccer does not work that way. Because goals are so hard to get, one single mistake that a player makes in a game could have drastic implications, not just in the game, but in the season. If Sergio Aguero hadn’t scored that final goal against QPR, Manchester City wouldn’t have won the championship. And what set up that goal was not a single discrete event. It was a series of actions made by each of the players on both of the teams. What the statistics cannot do is put context to the actions, something that the eyes can do.

    1. I don’t think the article disagrees with anything you said. I didn’t attempt to say that Opta statistics are the direct equivalent of baseball’s sabermetrics—each just happens to be the most useful quantitative tool in its respective sport. Soccer has no sabermetrics. So you’re right: Baseball has more individual actions and nuances which can be measured and directly tied to outcomes. But I do think that soccer statistics actually provide indicators for success and context to the actions in the game. Obviously, a side that desires to play good pass-and-move football wants to have above average passing and possession percentages, especially in the final third.

      Your point about Aguero is well taken, but it still took a player to bring individual attacking skill to create the final ball to Aguero, and for Aguero to finish the chance, which Chances Created and Chance Conversion rate (two of my favorite stats) properly measure. Each one of the series of actions can be measured by a statistic to paint a larger picture of why the goal happened.

      Nobody can argue for statistics to replace standard scouting/tactical practices. The two must achieve a delicate confluence for success. Statistics and Opta data can fill a void for objective analysis where the human eye and tactical mind can fail though, just as the human eye can make up for the shortcomings of statistics.

  2. Although I like the odd stat every now and again, they do not show us the full picture. I prefer looking with the naked eye; it tells more. That stat about Joe Allen having the 3rd highest Possession Won Per Minute doesn’t do enough for me. For instance, what does it mean? For me it just means he picks the ball up in midfield. Lucas could make the tackle and dispossess the player but because Allen wins possession his stats go up to make him look better than others. It’s a shoddy stat in my opinion. I pull out stats from everywhere in my academic research to suit my argument even if it doesn’t show the full picture.

    Playing in goal is a position where stats make no sense (Brad Friedel, Craig Gordon and Paddy Kenny recently discussed their displeasure at the use of stats to illustrate the quality of a GK). A shot is saved…this shot could come from anywhere. It could be the weakest shot in the world, it could be a shot that looks threatening but is blocked and rolls into the ‘keeper’s arms, it could be from a tight angle and so on – still it’ll go down as a save; he could have 12 weak shots a game and have a 100% save percentage. Likewise, he could have 2 shots per game, one in the first minute and one after 75 minutes. He could have nothing to do for 74 minutes but the two saves could be out of this world and the concentration needed for 74 minutes of nothing is what differentiates the average from the best. Still though, his save percentage remains at 100%. Looking at stats you may think the first guy is better but it’s all about the quality of the shot and the quality of the save – something the stat can’t tell but the eye can.

    Assists are subjective these days despite stats. It’s all hypothetical, but let’s say Xavi or Pirlo dance around 3 men in midfield – the type of dancing we’re used to seeing – and ping a 60 yard pass onto the boot of the RB (who is in an advanced position), who then has to ‘only’ pass a 10 yard ball across the box for a ST to tap it in. Like a GK, where is Xavi’s or Pirlo’s pass included in the attacking stats? It more than likely isn’t. But I know how vital they were to the goal. They set it up on a plate to be scored but because their pass is 2-3 phases behind the goal it’s seemingly discounted. Then again their pass could have been a 2 yard pass to the RB, but it isn’t stated. It’s all a little confusing at times.

    I also wouldn’t like to say that football is afraid of stats or put all of the blame on Liverpool. Quantitative analysis is all over Italian football and always has been so it isn’t a new thing – after all, it’s about the result. Forget how you play, it’s a results business. Despite my love of Calcio, my philosophy is the complete opposite.

    A picture tells a thousand words, something a stat cannot do.

    1. Great comments. I think we can all agree that when properly used, statistics can aid the eye and vice versa. Neither approach is correct on its own—it’s a matter of balance.

  3. Justin, you and I have discussed this on-and-off for around two-and-a-half years now, so I’m just gonna give my 2 cents here:

    If you remember Soccernomics, the authors highlight Olympique Lyon as a side that practices Moneyball tendencies in the market. Lyon believes that consistency in the boardroom, and not at a managerial level, is what is important for a side, and they have 12 main rules that they use in the market when buying & selling players:

    1) A new manager wastes money; don’t let him
    2) Use the power of crowds
    3) Stars of World Cups and European Championships are overvalued
    4) Certain nationalities are overvalued
    5) Older players are overvalued … as well as younger players (ideal age of buys: 23)
    6) Center forwards are overvalued
    7) Goalkeepers are undervalued
    8) Avoid sight-based observations.
    9) Sell any player if another club offers more than he’s worth.
    10) Have your replacement ready before you sell your best players.
    11) Buy undervalued players who have personal problems
    12) Help your players relocate

    Now, how many of these rules have Liverpool broken since January 2011? We can’t truly evaluate a few of these because we don’t have the facts, so let’s ignore rules 8, 9 and 12. If you go down the list, I think it’s safe to say that LFC have not adhered to a number of these, such as rules 1, 4, 5 and 6. Just from reading that list, you get the feeling that Liverpool is far from being a Moneyball club. In fact, we’re probably one of the furthest things from being a club that spends wisely, and maybe one of the most wasteful around.
