# By The Numbers: Leave Behind The Left On Base (LOB) Statistic

By Father Gabe Costa

The last time I taught a sabermetrics course, I was fortunate to have United States Army officer Second Lieutenant Christopher Anderson auditing my class. Holding a degree in Nuclear Engineering from West Point, Lieutenant Anderson is also an avid baseball fan. He is our guest blogger for this installment of By The Numbers.

Lieutenant Christopher Anderson: At 12:05 pm the Cubs took up their defensive positions against the Phillies at Wrigley Field on a sunny, 81° F Saturday afternoon. After two hours and forty-nine minutes, with little pomp and circumstance, July 3, 2010, was written into the annals of baseball history for the single-game record the Chicago Cubs set. That Saturday, the Cubbies left an unfathomable seventeen men on base, the most in a 9-inning game since 1919, the earliest year for which data were available.

Major League Baseball requires all official scorekeepers to record the number of men left on base by each team at the end of each half-inning. The total number of men LOB “shall include all runners who get on base by any means and who do not score and are not put out [and]…batter-runner[s] whose batted ball results in another runner being retired for the third out,” according to rule 10.02(g) of the Official Baseball Rules. Now that we have defined the statistic of interest, I would like to point out that all statistics and records cited in this blog entry and in the accompanying files come from www.baseball-reference.com.

The contemporary belief about the team LOB statistic is that it reflects how well (or, more accurately, how badly) a team does at getting runners home—manufacturing runs—and thus how likely they are to win a game; the more runs you score, the better your chances of winning. This argument results from the logical fallacy post hoc ergo propter hoc: “We left men on base that did not score runs, and afterwards we lost the game; therefore, leaving men on base that did not score runs caused us to lose the game.” Many people then take the next step of utilizing the logical fallacy of denying the antecedent, which brings us to the idea that “if we do not leave men on base, we will not lose.”

Yankee manager Joe Girardi, for example, engaged in this doubly fallacious (not to be confused with “phallic”) thought process after a 4-3 loss to the Royals on May 11th, 2011. He cited the 15 Yankees left on base over 11 batted innings as the main reason for the loss. However, I aver that the real area of concern came after the starter's 7-inning, 1-run performance: the reliever walked two of the first three batters he faced and then gave up a 2-out, game-tying RBI single to center field that led to the extra innings.

So far, I have supplied you with some anecdotal and philosophical arguments; now it is time to delve into the numbers. Before analyzing any numbers, I hypothesized that the team LOB statistic does not negatively correlate with winning or with runs scored (abbreviated RF for “runs for”). Going one step further, I hypothesized that team LOB positively correlates with both winning and runs scored. I reasoned that since you have to get men on base to score (with the exception of the long ball), the number of men a team leaves on base is indicative of its ability to continually get on base, which eventually will result in cycling runners home, thus scoring runs.

If my hypothesis were to be true, it needed to be true for all teams: good, bad, powerful, weak. Therefore, for my data, I found the AL team with the best record, the team with the worst record, a team with a .500 Winning Percentage (WPCT), and the New York team from 2008. This query gave me the Angels, Mariners, Indians, and Yankees, respectively. For each game the team played that year, I compiled win or loss, home or away, runs for, runs against, run differential (RD), team LOB, and innings batted. “Win or loss” and “home or away” were treated as binary, assigning 1 to a win and 0 to a loss, and 1 to home and 0 to away. Run differential was determined by subtracting “runs against” from “runs for”. Team LOB, runs for, runs against, and innings batted were manually entered. All four teams combined yielded 648 sets of data (four teams at 162 games each).
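For readers who would rather script this than type it into a spreadsheet, the per-game record described above might be sketched as follows. The class and field names are my own invention for illustration, not the column headers of the Excel file, and the sample row is made up.

```python
from dataclasses import dataclass


@dataclass
class GameRow:
    """One game for one team. Field names are hypothetical, chosen for this sketch."""
    won: bool
    home: bool
    runs_for: int
    runs_against: int
    team_lob: int
    innings_batted: float

    @property
    def win_flag(self) -> int:
        # Binary coding used in the article: 1 for a win, 0 for a loss.
        return 1 if self.won else 0

    @property
    def home_flag(self) -> int:
        # Binary coding used in the article: 1 for home, 0 for away.
        return 1 if self.home else 0

    @property
    def run_differential(self) -> int:
        # RD = runs for minus runs against.
        return self.runs_for - self.runs_against


# A made-up row for illustration only; it is not taken from the data set.
row = GameRow(won=False, home=True, runs_for=3, runs_against=4,
              team_lob=10, innings_batted=9)
```

Deriving the flags and RD from the raw entries, rather than typing them in, removes one source of transcription error.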

To determine how the statistics I compiled correlated with each other, I had to find their correlation coefficients. Correlation coefficients range in value from -1 to 1. A value of -1 indicates a very strong negative correlation (as stat A decreases, stat B increases), 1 indicates a very strong positive correlation (as stat A increases, stat B increases), and 0 indicates no linear correlation. The reader can find the equations here, or can install the (free) data analysis add-on to Microsoft Excel and have the program do it, as I chose to do. The results are as follows:

As one can see, the correlation coefficients are very close to zero for all of the compared statistics. The data appear to say that the number of men a team leaves on base in a game does not correlate with whether a team is home or away, whether it wins or loses, the number of runs it scores, or how many runs it wins or loses by. To further validate my analysis of the data, I found, for each of the four teams, how many men the team left on base per nine innings batted, and what its winning percentage was for 2008. I eyeballed a graph of the data (albeit only four points) to see if I could see a correlation: I saw none! A copy of that graph and all of the data I compiled is located in the attached Excel file.
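For anyone without Excel, here is a minimal sketch of the two calculations described above. It assumes the coefficient in question is the ordinary Pearson correlation coefficient, which is what Excel's correlation tool reports. The function names and the toy columns are hypothetical; the numbers are not from the data set.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a column is constant")
    return cov / math.sqrt(var_x * var_y)


def lob_per_nine(total_lob, innings_batted):
    """Men left on base per nine innings batted."""
    return 9 * total_lob / innings_batted


# Made-up toy columns: team LOB and the 1/0 win flag for six games.
lob = [5, 9, 7, 12, 6, 8]
win = [1, 0, 1, 1, 0, 0]
r = pearson_r(lob, win)
```

Because the win column is coded 1 or 0, this is the same arithmetic applied to a binary variable; nothing special needs to be done for it.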

Supplemental Data

At the end of the day, my hypothesis turned out to be half right. Team LOB does not negatively correlate with winning or with runs scored, but it positively correlates only very weakly with the number of runs a team scores, and not at all with winning or losing. Based on this data, I feel safe in saying that team LOB has no meaningful correlation with whether a team wins or loses or how many runs it scores.

Managers: please stop using Team LOB as a catchall excuse for losing.

Fans: please stop using Team LOB as a scapegoat for why your team lost.

And to everyone who made it to the end of this blog, I say “Congratulations!” much like then-Cubs manager Lou Piniella must have said to his team that July afternoon. They defeated the Phillies 3-1.

#### One Comment

1. Tmac says:

Good blog, sir. I took a look at the Excel file, and noted how in depth it was. Since you undoubtedly have too much time on your hands, did you look into runners left in scoring position? It would be interesting to see what effect that has. Also, although it is incredibly frustrating to see your team leave runners on, it is a step in the “positive” direction that those runners did in fact get on base. Maybe teams with higher LOB also have a higher OBP…

Gonna miss headbutting with you in class…go SOX

1. Christopher Anderson says:

When I first got the idea to analyze the LOB statistic, I considered analyzing RISP as well. I chose not to because RISP brings into the conversation a much more complicated factor: what does it mean to be “in scoring position”? Is a player automatically in scoring position if he is on second or third base? Is Jorge Posada “in scoring position” when he is on second? In a lot of cases, I would say no. Is Brett Gardner or Jose Reyes “in scoring position” when he is on first? In many cases, I would say yes (especially in Citi Field). As to your point about the correlation between OBP and LOB, you may feel free to use my Excel sheet and just create another column for each team for game OBP and redo the analysis. However, if I had to guess, I would say that OBP would have a close-to-zero correlation coefficient with LOB. This is because, for any number of men left on base in a given inning, there are infinitely many OBP values (approaching 1) the team could have had. For example: LOB 0, OBP .000 (3 AB, 3 K); LOB 0, OBP .727 (11 AB, 8 H (6 1B, 2 GS), 3 K).
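The two example innings above can be checked with a small sketch. It rests on simplifying assumptions of my own: only singles, home runs, and strikeouts occur, and a single moves every runner up exactly one base. With no walks, hit batters, or sacrifice flies, OBP reduces to hits divided by at-bats. The function name is hypothetical.

```python
def play_half_inning(events):
    """Return (runs, lob, obp) for a list of events: "1B", "HR", or "K".

    Simplifying assumption: a single advances every runner exactly one base.
    """
    bases = [False, False, False]  # first, second, third
    runs = outs = hits = at_bats = 0
    for ev in events:
        if outs == 3:
            break  # the half-inning is over; ignore anything after the third out
        at_bats += 1
        if ev == "K":
            outs += 1
        elif ev == "1B":
            hits += 1
            if bases[2]:
                runs += 1  # the runner on third scores
            bases = [True, bases[0], bases[1]]
        elif ev == "HR":
            hits += 1
            runs += 1 + sum(bases)  # the batter plus everyone on base
            bases = [False, False, False]
        else:
            raise ValueError(f"unknown event: {ev}")
    lob = sum(bases)
    obp = hits / at_bats if at_bats else 0.0
    return runs, lob, obp


quiet_inning = play_half_inning(["K", "K", "K"])
big_inning = play_half_inning(["1B", "1B", "1B", "HR"] * 2 + ["K"] * 3)
```

Both innings end with nobody left on base, yet one has an OBP of zero and the other eight hits in eleven at-bats, which is the point of the example.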

2. Sean says:

I completely agree that only when all other things are equal is LOB of much value in determining who wins and who loses.

That doesn’t make it meaningless. As a kid, I was taught to score games for my father’s little league teams, and one thing that he made me do was make sure both teams were keeping the correct order by checking the total plate appearances against the counts of runs scored, hits, errors and LOB. The totals should always be equal. That statistic is meant to be another resource in score-keeping and is plenty useful in recreating the games from any time if you know how to read it. The author of the article is completely right, however, that it hardly matters on its own in determining who will likely win or lose.

1. Christopher Anderson says:

Sean,

You are completely right about using LOB to “prove” a box score. Rule 10.03 (c) of the Official Baseball Rules actually mandates that official scorekeepers do this. I don’t believe that at the inception of this statistic it was intended to be used as a performance metric; rather, as you stated, that it was intended to be used as a check on the scorekeeper’s math.
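For the curious, the balance check can be sketched as below. As I understand the rule, a box score proves when a team's at-bats, walks, hit batters, sacrifice bunts, sacrifice flies, and batters awarded first base on interference or obstruction add up to its runs, men left on base, and the opposing team's putouts. The function and parameter names are hypothetical, and the sample line is invented.

```python
def box_score_proves(at_bats, walks, hit_by_pitch, sac_bunts, sac_flies,
                     interference_awards, runs, left_on_base, opponent_putouts):
    """Return True when one team's batting line balances."""
    # Every batter who completed a plate appearance...
    batters = (at_bats + walks + hit_by_pitch + sac_bunts
               + sac_flies + interference_awards)
    # ...must have scored, been left on base, or been put out.
    accounted_for = runs + left_on_base + opponent_putouts
    return batters == accounted_for


# Invented line for illustration: 39 batters = 3 runs + 12 LOB + 24 putouts.
balanced = box_score_proves(at_bats=33, walks=4, hit_by_pitch=1, sac_bunts=0,
                            sac_flies=1, interference_awards=0,
                            runs=3, left_on_base=12, opponent_putouts=24)
```

If the function returns False, one of the tallies (very often LOB) was miscounted somewhere along the way.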

3. Brendan says:

Nice write-up on the LOB issue. I agree that it is a nearly meaningless stat. Please see my May 2008 research paper available on retrosheet.org on this subject (http://www.retrosheet.org/Research/BinghamB/Whats%20so%20bad.pdf). Examining a set of games only somewhat larger than the set considered here, I find a weak but positive correlation between runs scored and LOB. Moreover, I find that leaving more runners on than the opponents is more often than not associated with winning (albeit by a narrow margin).