By The Numbers: A Pitcher's MVS (Most Valuable Stat)

By Father Gabe Costa

Mr. Nathaniel Einfeldt is a sabermetrics student and the guest blogger for this installment of By The Numbers. He gets pretty technical in this piece, as he literally writes "by the numbers".

Nathaniel Einfeldt: Baseball can be compared to boxing; it is a constant one-on-one slugfest between the pitcher and batter. They trade jabs, crosses, and then a batter might hook one out to left field. The constant struggle between these players makes me wonder what the true value of a pitcher is. He is the one who makes the initial move in this fight. However, what makes a pitcher great? How are pitchers valued in MLB? While consistency is important, what single statistic can explain how teams value a pitcher?

To determine the value of a pitcher, I analyze how pitchers are paid. My sample consists of 53 starting pitchers from the 2007-2010 seasons. I standardize my data by looking only at starting pitchers, and I consider a "starting pitcher" to be one who pitched at least 50 innings and started at least 10 games in the season in question. I analyze each player's season right before he becomes a free agent, then use the salary he receives to measure how much teams value certain statistics. My sample has flaws because other factors are excluded. I observe WHIP, ERA, IP, HR, SO, AGE, WLP (Win/Loss percentage), LHP vs. RHP, NL vs. AL, team, and of course, salary. I do not factor in the overall "productivity" of the pitcher; nevertheless, the season closest to free agency shows how good that player is currently. My analysis applies the basic concepts of econometrics to find the statistic that returns the highest salary.
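The sample cutoff described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the analysis: the `PitcherSeason` record, the pitcher labels, and the numbers are hypothetical, and treating the 50-inning and 10-start thresholds as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PitcherSeason:
    name: str               # hypothetical label, not a real player
    innings_pitched: float  # season total
    games_started: int      # season total


def is_starting_pitcher(season, min_innings=50, min_starts=10):
    """Return True if the season meets both cutoffs (assumed inclusive)."""
    return (season.innings_pitched >= min_innings
            and season.games_started >= min_starts)


# Made-up seasons, for illustration only.
seasons = [
    PitcherSeason("Pitcher A", 210.0, 33),  # clears both cutoffs
    PitcherSeason("Pitcher B", 64.0, 4),    # enough innings, too few starts
    PitcherSeason("Pitcher C", 48.0, 10),   # enough starts, too few innings
    PitcherSeason("Pitcher D", 50.0, 10),   # exactly at both cutoffs
]

starters = [s.name for s in seasons if is_starting_pitcher(s)]
print(starters)
```

Only the seasons that pass both tests would enter the salary regression.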

Hakes & Sauer's Moneyball article discusses how OBP is the most important factor in a hitter's contribution to his team's win total. Because WHIP measures how often a pitcher allows runners to get on base, I expect it to be the chief determinant of how much a GM wants him. The table below shows how regressing salary on a variety of factors changes the estimated effect of each factor on a pitcher's salary. Each cell holds three numbers. The first is the coefficient, the change in ln(Salary) for a one-unit increase in the variable; a positive number raises the player's salary as the variable increases. The second, in (…), is the standard error of the coefficient. The third, in […], is the t-value, which measures the significance of the variable. For instance, in the Naïve Model, if the pitcher's WHIP increases by .1, ln(Salary) falls by .377, a salary decrease of roughly 31% (37.7% if the log change is read directly as a percentage). For the Full Regression, the t-value on WHIP is -3.56; anything above 1.96 in absolute value is considered significant at the 5% level, which corresponds to the two tails of a normally distributed curve. Running the regression on the natural log (ln) of Salary shows how a change in a variable causes a percentage change in the salary of the player. The R-squared value for my full regression is .63, which means that over half the variance in salary can be explained by my explanatory variables. The Constants are the starting points of the salary, but they matter more for making sure the regression results are stable and consistent.
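The arithmetic behind those readings can be checked with a short Python sketch. The WHIP coefficients and standard errors are copied from the Naïve Model and Full Regression columns of the table that follows; the function names are my own. The t-value is the coefficient divided by its standard error, and the exact percentage change implied by a log-salary coefficient is exp(coefficient × change) - 1.

```python
import math


def t_value(coefficient, standard_error):
    """t-value as reported in [...]: coefficient over its standard error."""
    return coefficient / standard_error


def percent_change(coefficient, delta):
    """Exact % change in salary when the variable moves by `delta`,
    given a coefficient from a regression on ln(Salary)."""
    return (math.exp(coefficient * delta) - 1.0) * 100.0


# (coefficient, standard error) for WHIP, taken from the table.
whip_estimates = {
    "Naive Model": (-3.773, 0.588),
    "Full Regression": (-2.702, 0.759),
}

for label, (coef, se) in whip_estimates.items():
    t = t_value(coef, se)
    pct = percent_change(coef, 0.1)  # effect of a .1 increase in WHIP
    significant = abs(t) > 1.96      # 5% two-tailed cutoff used in the text
    print(f"{label}: t = {t:.2f}, salary change = {pct:.1f}%, "
          f"significant = {significant}")
```

The t-values reproduce the bracketed figures in the table. The exact percentage change is smaller in magnitude than the raw log change because the log approximation loosens as the change gets larger.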

Regression Results on ln (Salary)

Variables	Naïve Model	Model 2	Model 3	Model 4	Model 5	Full Regression
WHIP	-3.773 (.588) [-6.42]	-3.164 (.556) [-5.69]	-2.637 (.641) [-4.11]	-2.663 (.674) [-3.95]	-2.755 (.757) [-3.64]	-2.702 (.759) [-3.56]
IP(Season Total)	--	.007 (.002) [3.57]	.004 (.003) [1.22]	.004 (.003) [1.21]	.003 (.003) [.82]	.004 (.003) [1.33]
SO(Season Total)	--	--	.006 (.003) [1.67]	.005 (.004) [1.52]	.007 (.004) [1.91]	.005 (.004) [1.36]
Home Runs(Season Total)	--	--	-.004 (.011) [-.34]	-.004 (.012) [-.33]	-.003 (.012) [-.25]	-.010 (.013) [-.76]
Age	--	--	--	-.003 (.022) [-.14]	.002 (.012) [.11]	-.010 (.013) [-.03]
Year 2008	--	--	--	--	-.295 (.289) [-1.02]	-.209 (.319) [-.66]
Year 2009	--	--	--	--	.222 (.271) [.82]	.232 (.270) [.86]
Year 2010	--	--	--	--	-.095 (.255) [-.37]	-.021 (.265) [-.08]
If Player is in National League	--	--	--	--	--	-.232 (.217) [-1.07]
If Pitcher is a Right Handed Pitcher	--	--	--	--	--	-.221 (.225) [-.98]
Number of observations	53	53	53	53	53	53
R-Squared	.45	.56	.59	.59	.61	.63
Constant	20.5043 (.7964) [25.74]	18.57851 (.89734) [20.70]	17.814 (.6411) [17.88]	17.951 (1.412) [12.71]	17.897 (1.569) [11.41]	18.250 (1.600) [11.41]
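For readers who want to see the mechanics, a one-variable regression shaped like the Naïve Model (ln(Salary) on WHIP alone) can be sketched in plain Python. This is not the software or the data behind the table: the WHIP values are invented, and the salaries are generated from an arbitrary line (intercept 20.0, slope -3.0) so that the fit has a known answer to recover.

```python
import math


def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x for one regressor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic data, for illustration only (not from the 53-pitcher sample).
whip = [1.05, 1.15, 1.25, 1.35, 1.45]
salary = [math.exp(20.0 - 3.0 * w) for w in whip]  # dollars, made up

# Regress ln(Salary) on WHIP, as in the table's dependent variable.
log_salary = [math.log(s) for s in salary]
intercept, slope = fit_simple_ols(whip, log_salary)
print(round(intercept, 3), round(slope, 3))
```

The later columns of the table add more regressors, which calls for a multiple-regression routine from a statistics package rather than this two-parameter formula.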

These results show that WHIP is the most important factor in determining a pitcher's salary. ERA is excluded from these models because it is so highly correlated with WHIP that including both would amount to putting two nearly identical terms in the regression (multicollinearity). I also ran the regression with ERA instead of WHIP, and ERA was never as significant as WHIP is in the Full Regression. One can use these numbers to estimate a pitcher's salary depending on how well he performs. All the other statistics are, relatively speaking, insignificant in the Full Regression, so no statistic matters as much as WHIP. This makes sense, because some of these terms influence WHIP (a strikeout, for example, helps contribute to a lower WHIP).
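The overlap between ERA and WHIP can be checked before deciding which one to keep. The sketch below computes the Pearson correlation between the two and the variance inflation factor, which for a pair of regressors is 1 / (1 - r²). The six WHIP and ERA values are invented for illustration and are not from the sample; the function names are my own.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def variance_inflation(r):
    """VIF for one of two regressors whose correlation is r."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical pitcher seasons (made-up numbers).
whip = [1.05, 1.18, 1.25, 1.32, 1.41, 1.50]
era = [2.90, 3.45, 3.80, 4.10, 4.60, 5.05]

r = pearson_r(whip, era)
vif = variance_inflation(r)
print(f"r = {r:.3f}, VIF = {vif:.1f}")
```

A correlation close to 1 drives the inflation factor up, which means the standard errors on both coefficients would balloon if the two statistics entered the same regression. That is the practical reason for keeping only one of them.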

In conclusion, teams care about the pitcher's ability to limit base runners. It is not necessarily the number of runs allowed, which may seem to go against common logic. I was limited in my search by the lack of salary information for certain players. There are definitely intangibles that a player can offer his team, but a General Manager would do best to find the pitcher with the lowest WHIP to maximize the efficiency (win percentage) of his ball club. That is what the data suggest.

Works Cited

Hakes, Jahn K. & Sauer, Raymond D. (2006). "An Economic Evaluation of the Moneyball Hypothesis", Journal of Economic Perspectives, 20(3), 173-185.

Baseball Encyclopedia of MLB Players. (2011). In Baseball-Reference. Retrieved February 13, 2012, from http://www.baseball-reference.com/players/
