Sports are all about winning and losing. Ask most people who the best team in baseball was in any given year and they will tell you who won the World Series. We assume winners are the best because we think of baseball as a game of pure skill, but what if luck played a much larger role in the final result of a game than most people realized? What if the difference between a win and a loss came down to a gust of wind, a bad call by an umpire, or a left fielder standing three feet too far to the right? What if a small amount of luck meant the difference between winning and losing the World Series? More people are realizing every day that, when it comes to baseball, luck is an unavoidable part of the game. To understand how much of a factor it is, hardcore baseball fans and statisticians have spent the last 20 years devising statistics that attempt to separate luck from skill.
When we talk about luck and baseball, it's important to recognize that skill is still the most important factor in the game. You can't throw a random obese gentleman from the stands up on the pitcher's mound and expect him to throw a winning game from luck alone. Luck is merely a condition of the game that causes variance. This is a phenomenon that is very familiar to poker players. Just because someone has an edge in poker due to skill doesn't mean that they are going to win every hand or session. Variance will inevitably lead to long losing streaks when a bunch of bad luck piles up in a short amount of time. Likewise, a baseball team with a statistical advantage from skill will still find itself losing in streaks, but over the long run, both the poker player and the baseball club will find themselves ahead. Poker players can estimate their true skill with something called ROI, or return on investment. Baseball fans and club owners try to track this through a system of statistics called sabermetrics.
Almost all fans of baseball are familiar with a group of core statistics that have been part of the game since the days of Walter Johnson and Ty Cobb. For batters, the most popular stat is AVG, or batting average. For pitchers, it's ERA, or earned run average. These stats were relied on for decades as a concrete indicator of player performance. Unfortunately, both of these statistics are misleading when it comes to determining the true value of a player. That's because they miss some of the key components of the game, namely luck. Once a pitcher throws a ball, or a batter hits a ball, they no longer have control of the action. Identical pitches and hits can have drastically different outcomes depending on the fielding, ballpark, weather and a whole host of other factors.
Sabermetrics attempts to correct for this luck variance by developing a new set of statistics that accounts for secondary variables that traditional stats have long ignored. Nearly every aspect of a baseball game can cause variance in the outcome. Are there a lot of people in the stands? Does the umpire have a large or small strike zone? What's the distance to the foul poles in right and left field? Is the outfield real grass or artificial turf? All of these factors play a role in the outcome of a game. Some are more important than others. The goal of a sabermetric stat is to eliminate all of these outside influences and come up with a number that accounts for player skill alone. This can be a difficult task, and there is no current standard that everyone endorses.
Three of the organizations responsible for producing the most popular sabermetric numbers in the game today are FanGraphs, Baseball Prospectus and Baseball Reference. FanGraphs is a great online source for sabermetric numbers that you won't be able to get from baseball's official sources, and it's the one that will be used as a reference for the rest of this article. If you want to see extensive charts with a large range of sabermetric numbers, you can't do much better at the moment.
Batting average has been a standard for hitting performance for several reasons: it's easy to calculate, easy to remember and draws a distinct line between the best and worst hitters. Batters with an average over .300 are considered good to great hitters while those under that mark are considered mediocre to weak. Batting average leaves a lot of important information out of the picture, however. How many home runs, triples and doubles did the player hit? How many times did they walk? How many times did they strike out? In order to satisfy these questions, a new statistic needed to be created that showed things like power and ability to get on base.
If you've seen the movie "Moneyball," you already know how Billy Beane shifted his focus to on base percentage in lieu of traditional scouting measurements. The conventional wisdom at the time was that getting a hit was superior to getting a walk, hence the focus on batting average. In reality, however, a base runner is a base runner, and since the goal of the game is to score runs, it shouldn't matter how a batter gets on base. The more a hitter gets on, the more likely they will be to score a run. Batting average is, of course, an important component of on base percentage, but isn't there a huge difference between a single, a triple and a home run?
OPS stands for on-base plus slugging, and slugging is meant to measure a hitter's ability to produce runs through power. It's calculated in nearly the same way that regular batting average is, but doubles count as two, triples as three and home runs as four. Let's say you had a hitter with a batting average of .250, but every time he got a hit it was a home run. His slugging would be 1.000, showing you how valuable he truly was as a hitter. OPS itself doesn't directly account for many luck factors, but combined with other sabermetrics, it's an essential stat for analyzing hitter performance.
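To make the arithmetic concrete, here is a minimal Python sketch of the standard formulas for batting average, on-base percentage, slugging and OPS. The function names and the 400 at-bat stat line are made up for illustration; the hitter is the hypothetical one described above, whose every hit is a home run.

```python
def batting_average(hits, at_bats):
    """Hits divided by at-bats."""
    return hits / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies=0):
    """Times on base divided by plate appearances that count toward OBP."""
    return (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)


def slugging(singles, doubles, triples, home_runs, at_bats):
    """Total bases divided by at-bats: a double counts as 2, a triple as 3, a home run as 4."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def ops(obp, slg):
    """On-base plus slugging."""
    return obp + slg


# Hypothetical hitter: 400 at-bats, 100 hits, every hit a home run, no walks.
avg = batting_average(hits=100, at_bats=400)
obp = on_base_percentage(hits=100, walks=0, hit_by_pitch=0, at_bats=400)
slg = slugging(singles=0, doubles=0, triples=0, home_runs=100, at_bats=400)

print(f"AVG {avg:.3f}  OBP {obp:.3f}  SLG {slg:.3f}  OPS {ops(obp, slg):.3f}")
```

A .250 hitter looks ordinary by batting average alone, but the same stat line produces a slugging percentage of 1.000.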
BABIP stands for batting average on balls in play. It calculates the percentage of balls hit in play that manage to be singles, doubles or triples. Home runs, walks, hit-by-pitches and strikeouts are all ignored for this calculation. In terms of hitting, this metric provides two valuable pieces of information. Hitters with a high lifetime BABIP are usually fast, are good line-drive hitters or hit for a lot of power. Hitters who experience an abnormally high or low BABIP when compared to their lifetime average are usually going through a period of good luck in the case of it being high, or bad luck in the case of it being low. Why is this the case? Once the ball leaves the hitter's bat, he has no control over what happens to it. A variety of factors, including the luck of the wind, fielding ability and ballpark variances, will help determine if he gets a hit or makes an out. Some players have everything go their way while others get nothing but bad breaks. BABIP helps us see who those players are.
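As a rough illustration, the commonly published BABIP formula can be written in a few lines of Python. Sacrifice flies are added back to the denominator in the usual formula even though the description above doesn't mention them, and the stat line below is invented for the example, not taken from a real player.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play.

    Home runs and strikeouts are removed because the defense never gets a
    chance to field them. Walks and hit-by-pitches are already excluded
    from at-bats.
    """
    hits_in_play = hits - home_runs
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    return hits_in_play / balls_in_play


# Hypothetical season: 600 at-bats, 180 hits, 25 home runs, 120 strikeouts, 5 sacrifice flies.
season_babip = babip(hits=180, home_runs=25, at_bats=600, strikeouts=120, sac_flies=5)
print(f"BABIP: {season_babip:.3f}")
```

Comparing a single-season number like this with the same player's career BABIP gives the luck signal discussed here.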
The league average for BABIP is about .300, and a vast majority of players fall between .290 and .310. There are exceptions for players with exceptional abilities like Ichiro, who has a career BABIP of .359, but those kinds of hitters are rare. BABIP in any given season or group of games is a good indication of how lucky or unlucky a player has been. Let's use Andrew McCutchen from the Pittsburgh Pirates as an example. In 2012, McCutchen finished the season with a .327 batting average, but his BABIP was one of the highest in the league at .375; this suggests that he had an especially lucky season. This reading is supported by the fact that his career average up until that point was only .290, nearly forty points lower than his 2012 number. His home run total, while admirable at 29, wasn't that much better than in many previous years. All this suggests that McCutchen was a very lucky hitter in 2012.
How about the unlucky hitters of 2012? Let's use Adam Dunn from the Chicago White Sox. Dunn has never been a player who hits for average. His value as a hitter has always come from his power, specifically his ability to hit 30-plus home runs in a season. In 2012, he hit 41, a great number, but his batting average was only .204, 36 points lower than his career average. Did Adam Dunn become a worse hitter, or did he just get unlucky a lot? Dunn's BABIP was .246, the second lowest in the league. His career BABIP is .286; the data suggests that he had a very unlucky season. The defense was in the right place at the right time far more often than usual in 2012.
Along with the win-loss record, ERA is the most cited statistic for pitchers. It measures the number of earned runs per 27 outs, or nine innings. The reason that this statistic is mostly meaningless comes from the flawed concept of base runners being earned or unearned. An earned run is counted when a player scores a run after getting on base by a hit, a walk or hit-by-pitch. A run is unearned when the player got on base by an error. The problem with this comes into play when you look at how errors are determined. An error is scored when a fielder touches the ball but fails to make the routine out. What happens if the fielder is slow? Is that now the pitcher's responsibility?
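For reference, the ERA calculation itself is simple; the trouble lies in deciding which runs count as earned. A small sketch, using a made-up stat line and a function signature that takes outs recorded rather than innings:

```python
def era(earned_runs, outs_recorded):
    """Earned runs allowed per 27 outs (nine innings)."""
    return 27 * earned_runs / outs_recorded


# Hypothetical pitcher: 70 earned runs over 200 innings (600 outs).
season_era = era(earned_runs=70, outs_recorded=600)
print(f"ERA: {season_era:.2f}")
```

Every input to this formula depends on the official scorer's ruling about errors, which is where the fielding-dependent noise creeps in.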
Baseball statisticians tried to rectify this situation by producing a statistic that only accounted for things that were in the pitcher's control. This has resulted in one of the more popular pitching stats, K/BB. This is the ratio of strikeouts to walks, two things that are considered to be completely determined by pitcher skill. People realized quickly, however, that something was still missing. Many great pitchers in history haven't been strikeout pitchers, but pitchers who got batters to hit weak ground balls that were easy for infielders to scoop up for the out. K/BB doesn't account for this kind of skill.
Sabermetrics came up with a group of statistics to account for pure pitching performance known as DIPS, or defense independent pitching statistics. They remove luck factors like fielding performance, positioning and ballpark idiosyncrasies from the numbers. At first, this method of evaluating pitching was controversial, but once the luck factor was demonstrated statistically through BABIP, which is discussed below, traditional baseball fans started coming on board. The most common sabermetric pitching stats are DICE (Defense-Independent Component ERA) and FIP (Fielding Independent Pitching). Both statistics are calculated with three important numbers: home runs, walks and strikeouts. These are the three outcomes that pitchers have the most control over, so they represent the clearest picture of a pitcher's skill.
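Here is a hedged sketch of K/BB and the commonly published FIP formula in Python. The weights of 13, 3 and 2 are the standard ones, and the published formula lumps hit batters in with walks. The constant is an assumption: it is reset every season so that league-average FIP matches league-average ERA, and 3.10 is only a typical value. The stat line is invented.

```python
def strikeout_to_walk_ratio(strikeouts, walks):
    """K/BB: strikeouts divided by walks."""
    return strikeouts / walks


def fip(home_runs, walks, hit_by_pitch, strikeouts, innings_pitched, constant=3.10):
    """Fielding Independent Pitching.

    `constant` is an assumed typical value; the real one varies by season.
    """
    weighted_events = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return weighted_events / innings_pitched + constant


# Hypothetical pitcher: 200 innings, 20 home runs, 50 walks, 5 hit batters, 180 strikeouts.
k_per_bb = strikeout_to_walk_ratio(strikeouts=180, walks=50)
season_fip = fip(home_runs=20, walks=50, hit_by_pitch=5, strikeouts=180, innings_pitched=200)
print(f"K/BB {k_per_bb:.2f}  FIP {season_fip:.2f}")
```

Because FIP is scaled to look like ERA, a large gap between a pitcher's ERA and his FIP is one common hint that defense or luck has been shaping his results.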
BABIP, or batting average on balls in play, can also be measured for pitchers: it is the batting average of opponents who manage to put the ball in play, not counting home runs. Traditionally, pitchers have been judged on a statistic known as opponent average, but once again, that stat doesn't account for factors that are outside the pitcher's control. We already looked at how BABIP is a good metric for the luck and skill type of a hitter. Now let's look at what those numbers tell us about a pitcher's performance over a given period of time.
The league average for pitcher BABIP is .300, matching the number for hitters. The difference is that sabermetric proponents believe that variance produces a much more profound effect on pitching performance. A five or ten point swing in the wrong direction can produce a losing season, and a slight swing in the right direction can produce a career-best ERA. Abnormally high or low BABIPs are considered unsustainable. Pitchers experiencing a streak of good luck are expected to regress toward the average as the season progresses. Some pitchers, however, have entire seasons of better than average luck.
Let's use Jered Weaver of the Los Angeles Angels as an example. Weaver has been a high-performing starting pitcher throughout his career in the major leagues, but 2012 was an especially great year for him. He won 20 games, lost only five, and posted an ERA of only 2.81, forty-five points below his career average. His BABIP for the season was .241, his lowest outside A+ ball. While his career BABIP is an impressively low .271, 2012 proved to be especially lucky. His defense independent stats, like strikeouts, home runs and FIP, were average to below average compared to his career numbers. If the sabermetric view of luck is correct, his numbers will likely regress in 2013 and subsequent years.
Fielding statistics are perhaps the most underutilized in baseball. The only statistic that has become a part of major stat lines is fielding percentage, which represents the percentage of fielding chances handled without an error. This stat misses major factors that separate a good fielder from a bad one. If you're only measuring whether or not the player bobbles the ball after touching it, you're missing skills like arm strength, speed, range and positioning. One group of sabermetrics-minded individuals saw flaws in existing evaluations, including the Gold Glove Award, so they came up with the Fielding Bible Awards for the best player at each position every season.
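A short sketch of the standard fielding percentage formula shows why it says so little about range. The two fielders below are hypothetical and exist only to make the point.

```python
def fielding_percentage(putouts, assists, errors):
    """Share of fielding chances (putouts + assists + errors) handled cleanly."""
    chances = putouts + assists + errors
    return (putouts + assists) / chances


# Hypothetical shortstops: the rangy one reaches far more balls but makes a few more errors.
rangy = fielding_percentage(putouts=250, assists=450, errors=18)
slow = fielding_percentage(putouts=220, assists=380, errors=10)
print(f"Rangy fielder {rangy:.3f}, slow fielder {slow:.3f}")
```

The slow fielder ends up with the better percentage even though he converted 100 fewer balls into outs, because a ball he never reaches is not counted as a chance.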
Alternative fielding stats are difficult to come up with because they are not supported by a large amount of traditional data. Unlike pitchers and hitters, fielders only stand out in the traditional box score when they make an error. If they make amazing plays, cover a lot of ground and produce more outs than any replacement player would, that information is lost in traditional stat lines. Because of this, fielders need to be evaluated through more subjective criteria. There are several different rating systems out there that account for a large range of variables and skills, including TotalZone, UZR and the Fans Scouting Report.
UZR, or ultimate zone rating, is one of the more popular sabermetric measurements of fielding performance. It compares the result of any given play with the history of balls hit in the same way to the same location. UZR divides the ball field into groups of zones for each player and determines if they performed above or below average for their given position. Plays made in some zones are given a higher value than others because the out is harder to make. The player with the highest UZR in 2012 was Jason Heyward, the right fielder for the Atlanta Braves. On the low end was Rickie Weeks, second baseman for the Milwaukee Brewers, who led the National League in errors for several years.
Looking at a hitter's AVG and a pitcher's ERA, we've learned, isn't a great way to evaluate whether someone is helping his team win games. In fact, no single traditional statistic will do a very good job of determining whether a player is having a positive or negative impact on his club. A player may be a decent hitter, but his fielding may cost the team more than his bat is worth. Sabermetrics attempts to account for every factor on the field that contributes to winning through a statistic known as WAR. This stat is formulated to evaluate pitchers and position players on the same scale. WAR stands for Wins Above Replacement, meaning the number of wins the team can expect to gain from the player in comparison to a replacement-level player, like someone from the minor leagues or the bench.
WAR is a non-standardized statistic, so depending on who is calculating it, you can get drastically different values. In all cases, however, the calculation is supposed to account for lucky or unlucky seasons, such as a pitcher posting a higher than average BABIP. Almost every single player in MLB during the 2012 season got a WAR score of 0 or better from FanGraphs. The player with the highest WAR was outfielder Mike Trout from the Angels. Trout was also named Rookie of the Year. He finished the season with an OPS of .963, 30 home runs and a batting average of .326. His BABIP, however, was .383, suggesting that he had an unusually lucky season. That's not to say he won't be an All-Star player for many years to come, but 2012 may prove to be one of his highest performing years.
Unlike football, basketball, hockey and soccer venues, baseball stadiums have no standardized dimensions outside the diamond. That's why you have the Green Monster at Fenway and the historic 500-foot center field wall at the old Polo Grounds. Some ballparks are advantageous to pitchers while others favor the long ball. Coors Field in Denver, for example, is known as a home run hitter's park because of the altitude. Petco Park, home of the San Diego Padres, was known as a pitcher's park until 2013, when the outfield fences were brought in closer to home plate. These factors greatly influence one of the primary measurements of a pitcher's skill, how many home runs he allows. They can also make the difference between doubles and triples, or outs and extra-base hits.
Batting Park Factor is a statistic that attempts to show how luck of the ballpark influences the number of runs scored in any given venue. A park factor of 100 would mean that the ballpark isn't statistically different when it comes to run production than the average of all ballparks seen on the road. A score higher than 100 means that the team's home park is more advantageous to hitters. A park with a score lower than 100, conversely, would be known more as a pitcher's park. Since certain parks may be more advantageous to some than others, this score is calculated individually for each player.
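One simple way to picture the idea is a basic team-level park factor that compares runs per game at home with runs per game on the road, counting both teams' runs. This is a simplified sketch, not the exact formula FanGraphs or Baseball Reference publishes (those add multi-year averaging and other corrections), and the run totals below are invented.

```python
def simple_park_factor(home_runs_scored, home_runs_allowed, home_games,
                       road_runs_scored, road_runs_allowed, road_games):
    """Runs per game at home relative to runs per game on the road, scaled so 100 is neutral."""
    home_rate = (home_runs_scored + home_runs_allowed) / home_games
    road_rate = (road_runs_scored + road_runs_allowed) / road_games
    return 100 * home_rate / road_rate


# Hypothetical team: 810 total runs in 81 home games, 729 total runs in 81 road games.
park_factor = simple_park_factor(
    home_runs_scored=400, home_runs_allowed=410, home_games=81,
    road_runs_scored=350, road_runs_allowed=379, road_games=81,
)
print(f"Park factor: {park_factor:.0f}")
```

Counting runs by both teams is what keeps a strong or weak home lineup from skewing the number, since the same club plays in both sets of games.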
Ballparks are also a factor because they give the home team a slight advantage. With all other things being equal, the home team has a 53 percent chance of winning on its home field. With the split between home and away games being equal during the regular season, this advantage may not matter much before October, although it may be advantageous to face more difficult interleague teams at home rather than away. This advantage means more in the playoffs because the league that wins the All-Star Game gets home-field advantage, and with it a possible extra home game, in the World Series.
Have you ever wondered why dominant teams like the Yankees sometimes reach the playoffs and get swept in the division series by a team that barely made the wild card spot? The reason this happens is luck. Over the course of a 162-game season, luck factors balance out for most teams, and you get a pretty good idea about which clubs are good and bad based on their win-loss record. In the playoffs, however, everything is reset. It doesn't matter if your ace won the Cy Young Award if he has an unlucky game at the start of the series. It doesn't matter if the Yankees win over a hundred games during the season if they lose just three in the division series. In statistical terms, the playoffs suffer from sampling error because the sample size is too small.
You can observe this phenomenon by looking at a small sample of games during the regular season. In 2012, the Washington Nationals had the best record in baseball during the regular season with 98 wins, 64 losses and a .605 winning percentage. This means, on average, they won three games for every two losses. In May, however, the Nationals got swept by the Marlins, the worst team in the NL East, in a three-game series. In June, they lost seven out of ten games toward the end of the month. You can easily see how a great team can lose to a mediocre team in a five- or seven-game series when you factor in bad luck. A sport like basketball doesn't have this same problem because the sample size is drastically increased by the tremendous number of scoring opportunities available in each game. This is why the Heat's chance for a championship is much higher than the Nationals'.
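A back-of-the-envelope calculation shows how little a short series reveals. The Python sketch below assumes every game is an independent coin flip weighted at the Nationals' .605 regular-season winning percentage. That is a generous simplification: playoff opponents are better than the average regular-season opponent, and pitching matchups and home field change the odds from game to game.

```python
from math import comb


def series_win_probability(p, wins_needed):
    """Chance of reaching `wins_needed` wins before the opponent does.

    Assumes each game is independent with the same win probability `p`.
    """
    q = 1 - p
    return sum(
        comb(wins_needed - 1 + losses, losses) * p**wins_needed * q**losses
        for losses in range(wins_needed)
    )


p = 98 / 162  # the Nationals' 2012 regular-season winning percentage

best_of_five = series_win_probability(p, wins_needed=3)
best_of_seven = series_win_probability(p, wins_needed=4)
swept_in_three = (1 - p) ** 3  # chance of losing three straight games

print(f"Win a best-of-five:    {best_of_five:.1%}")
print(f"Win a best-of-seven:   {best_of_seven:.1%}")
print(f"Lose three in a row:   {swept_in_three:.1%}")
```

Under these assumptions, even the best team in baseball wins a five-game series only about 69 percent of the time and a seven-game series about 72 percent of the time, and it loses any given three-game stretch outright about 6 percent of the time.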
The luck factor in baseball shouldn't discourage anyone from being a fan or rooting for their favorite player. Just because luck has a huge influence on the sport does not mean that talent and skill don't come into play. Like the house edge in a casino, a great player gives his team greater odds of winning a game than other players do. The casino isn't going to win every spin of roulette, but over the long term it will come out a winner. The same kind of statistical concept can be applied to hitting and pitching. Luck will make sure that Jered Weaver won't win every game and Adam Dunn won't hit a home run every at-bat, but having those players on the field will increase the chances of their club winning ball games.