Josh St. Marie

10 Negative Regression Candidates at Pitcher

Act 1 – The Set-Up

Tom Brady causes me pain.

Has Tom Brady caused you pain? I hope so because that was supposed to be relatable. It was meant to help us bond. I fear otherwise that what I am about to say might not go over so well. It is not as relatable as Tom Brady.

The other night I was playing with a baseball statistic I created. Yes, I do think "playing" is the best word. I told you this wouldn’t be super relatable. I don’t have a proper name for the statistic, just K-BB-2*Barrel%, but that seems relatively descriptive. That’s cool and all, but I wanted to know if my friend (I mean stat) is at all predictive of future performance. I set out to prove that it might be. However, what was supposed to be an hour of research turned into a day of hard labor. Blood, sweat, tears. All three. I think. Truthfully, I blacked out. When I awoke from my trance, I was not faced with a predictive stat, but instead with a list of pitchers whose ERAs are certain to negatively regress during the 2021 season.


The intermission. The time to get up, stretch your legs, relieve yourself, and maybe grab a snack. However, for the more passionate, this is a time for discussion. What say you, are you passionate enough to stick around and discuss my methodology?

It is tough to nail down my exact methodology, as it all just kind of came together, but I’ll do my best.

  1. I gathered seven statistics for starting pitchers (min 100 IP) from each season from 2015 to 2019: ERA, K-BB%, K-BB-Barrel%, K-BB-2*Barrel%, FIP, xFIP, and SIERA.

  2. I ranked each pitcher by ERA, K-BB%, K-BB-Barrel%, and K-BB-2*Barrel%.

  3. To mirror FIP, xFIP, and SIERA, I assigned an ERA value for each K-BB% stat by matching the pitcher's rank in each stat category to the ERA with the same rank. For example, if a pitcher was 1st in K-BB%, 5th in K-BB-Barrel%, and 10th in K-BB-2*Barrel%, then each rank would be replaced with the best, 5th best, and 10th best ERA from that season, respectively.

  4. I compared the ERA change from one year to the next for each eligible pitcher.

  5. To find pitchers whose ERA might rise by at least 0.5 runs from one year to the next, I filtered the dataset based on the three pseudo-ERA stats from step 3 as well as FIP, xFIP, and SIERA.

  6. The final step was to simply apply the filter to 2020. Of course, no pitcher threw for 100 innings in 2020, so the IP limit was dropped to 50.
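For anyone who wants to tinker with steps 2 and 3, here is a minimal Python sketch of the rank matching. It is not the actual spreadsheet or script behind this article. The function names, the tie handling (ties simply keep their input order), and the three made-up pitchers are all assumptions for illustration.

```python
def k_bb_2barrel(k_pct, bb_pct, barrel_pct):
    """K% minus BB% minus twice Barrel%, all in percentage points."""
    return k_pct - bb_pct - 2 * barrel_pct


def pseudo_era(stat_by_pitcher, era_by_pitcher):
    """Give each pitcher the ERA that holds the same rank as his stat rank.

    stat_by_pitcher: pitcher -> stat value (higher is better)
    era_by_pitcher:  pitcher -> ERA from the same season (lower is better)
    """
    eras_best_first = sorted(era_by_pitcher.values())
    pitchers_best_first = sorted(
        stat_by_pitcher, key=stat_by_pitcher.get, reverse=True
    )
    return {
        pitcher: eras_best_first[rank]
        for rank, pitcher in enumerate(pitchers_best_first)
    }


# Made-up season (hypothetical data): pitcher -> (K%, BB%, Barrel%, ERA)
season = {
    "Pitcher A": (30.0, 5.0, 6.0, 3.50),
    "Pitcher B": (25.0, 8.0, 4.0, 2.90),
    "Pitcher C": (20.0, 7.0, 9.0, 4.10),
}

stat = {p: k_bb_2barrel(k, bb, brl) for p, (k, bb, brl, _) in season.items()}
era = {p: row[3] for p, row in season.items()}
pseudo = pseudo_era(stat, era)

# Positive numbers mean the stat "suggests" a higher ERA than the actual one.
suggested_change = {p: round(pseudo[p] - era[p], 2) for p in season}
```

In this toy season, Pitcher B owns the best ERA but only the second-best K-BB-2*Barrel%, so his pseudo ERA is the second-best ERA in the sample, which sits 0.6 runs above his actual mark.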

I took the following precautions to try to ensure validity and avoid overfitting:

  1. I only looked for pitchers who experienced negative ERA regression. This decision was made because these statistics fare much better at identifying negative regression candidates than positive ones. Also, I knew that if I was going to look at 2020 with an eye towards 2021, I could only look for players who overperformed. It is too hard to do the opposite when you have to work out whether a player was only bad because of the extenuating circumstances surrounding COVID.

  2. I limited the filtering to use at most 3 of the 6 statistics. The best method ended up using K-BB%, K-BB-2*Barrel%, and xFIP.

  3. I set a minimum of 40 (10 per year) for the sample of players remaining after the filter was used on the 2015-2019 dataset. If the filtering was 100% accurate, but only on 39 players, it was not considered. The best method filtered down to exactly 40 players.

  4. Not only did I seek to maximize the overall accuracy, but I also wanted to maximize the year-over-year accuracy. This was especially important considering the ever-changing baseball, which heavily skews the data from year to year. The best method's worst single year had an accuracy of 71%.
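Precautions 3 and 4 amount to a scoring rule for any candidate filter, which can be sketched roughly as below. The function name, the input shape, and the sample data are assumptions made for this example. Only the 40-pitcher minimum and the 0.5-run target come from the article.

```python
from collections import defaultdict

MIN_SAMPLE = 40    # minimum number of flagged pitcher-seasons (from the article)
TARGET_RISE = 0.5  # an ERA rise of at least this much counts as a hit


def evaluate_filter(flagged):
    """Score one candidate filter.

    flagged: list of (season, next_season_era_change) tuples, one per
             pitcher-season that passed the filter.
    Returns None when the sample is too small, otherwise a tuple of
    (overall_accuracy, worst_single_season_accuracy).
    """
    if len(flagged) < MIN_SAMPLE:
        return None

    hits_by_season = defaultdict(list)
    for season, change in flagged:
        hits_by_season[season].append(change >= TARGET_RISE)

    total_hits = sum(sum(hits) for hits in hits_by_season.values())
    overall = total_hits / len(flagged)
    worst = min(sum(hits) / len(hits) for hits in hits_by_season.values())
    return overall, worst


# Hypothetical sample: 10 flagged pitchers in each of four seasons.
flagged = (
    [(2015, 0.9)] * 7
    + [(2015, 0.1)] * 3
    + [(year, 1.2) for year in (2016, 2017, 2018) for _ in range(10)]
)
result = evaluate_filter(flagged)
too_small = evaluate_filter(flagged[:39])
```

Returning both numbers lets a filter be rejected for one weak season even when its overall accuracy looks strong.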

Act 2 – The Payoff

Following the above steps resulted in a method that looked for:

  1. Pitchers whose K-BB% suggests an ERA increase of 0.6 or more,

  2. Pitchers whose K-BB-2*Barrel% suggests an increase of 0.4 or more,

  3. And pitchers whose xFIP suggests an increase of 0.7 or more
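As a rough sketch, the three criteria can be written as a single check. This assumes a pitcher must clear all three cutoffs at once, and that "suggests an increase" means the estimator minus the pitcher's actual ERA. The dictionary keys and the example pitcher are hypothetical.

```python
# Cutoffs from the list above, in runs of ERA.
THRESHOLDS = {"k_bb": 0.6, "k_bb_2barrel": 0.4, "xfip": 0.7}


def is_regression_candidate(era, estimators):
    """True when every estimator exceeds the actual ERA by its cutoff.

    era:        the pitcher's actual ERA
    estimators: dict with the K-BB% pseudo ERA, the K-BB-2*Barrel% pseudo ERA,
                and xFIP, keyed like THRESHOLDS
    """
    return all(
        estimators[name] - era >= cutoff for name, cutoff in THRESHOLDS.items()
    )


# Hypothetical pitcher: gaps of 0.9, 0.6, and 1.1 runs clear all three cutoffs.
flagged = is_regression_candidate(
    2.50, {"k_bb": 3.40, "k_bb_2barrel": 3.10, "xfip": 3.60}
)

# Same pitcher with a smaller K-BB-2*Barrel% gap (0.2 runs) is not flagged.
not_flagged = is_regression_candidate(
    2.50, {"k_bb": 3.40, "k_bb_2barrel": 2.70, "xfip": 3.60}
)
```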

The results were astounding! The filter correctly identified 35/40 pitchers whose ERA would increase by at least 0.5 runs in the upcoming season. The average ERA change of those 40 pitchers was a shocking increase of nearly 1.5 runs. Only one pitcher out of the 40 saw an improvement in ERA, and it was a microscopic decrease of 0.01 runs. Overall, the method proved very successful!

We can apply these three criteria to pitchers' 2020 stats to identify which pitchers can be expected to see an increase in ERA in 2021. Identifying such pitchers is only half of the battle. As a fantasy baseball community, we aren’t half bad at doing that ourselves without the help of math. In fact, of the 10 pitchers I am about to identify, most will not shock you (a good sign that our list is a good one). Some of the players are already getting drafted with negative regression in mind, but others are still being overvalued.

To help us identify which pitchers we are properly assessing, I used FanGraphs' free auction calculator and ATC projections. As an additional data point, I also shifted each pitcher's ERA by the median (50th percentile outcome) ERA increase of the 40 pitchers identified by our filter. The median increase was 1.33 runs. I then replaced the ATC projected ERA with the median-shifted ERA and observed how that affected a player's value.
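The median shift itself is simple arithmetic. A minimal sketch follows, assuming the shift is added to the pitcher's 2020 ERA. The list of historical changes is hypothetical and was chosen only so that its median matches the 1.33 figure above.

```python
from statistics import median


def median_shifted_era(era, historical_changes):
    """Add the median historical ERA change to a pitcher's ERA."""
    return round(era + median(historical_changes), 2)


# Hypothetical year-over-year ERA changes for previously flagged pitchers.
hypothetical_changes = [0.62, 0.95, 1.33, 1.71, 2.40]

# A pitcher with a 2.47 ERA, shifted by the median change of 1.33 runs.
shifted = median_shifted_era(2.47, hypothetical_changes)
```

The shifted number would then stand in for the projected ERA in the auction calculator, with every other projected stat left alone.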

With all that in mind, here are the 10 pitchers I am pegging for negative ERA regression in 2021.

On the Bubble

Probably the two most surprising pitchers on the list also happen to be the two closest to the bubble. I determined the bubble by seeing who joins and who leaves the list of negative regression candidates if I move each cutoff point in our criteria up and down by 0.1 runs. These are the only two pitchers who are close enough to a cutoff point to be affected.
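The bubble check can be sketched as a small sensitivity test. This version assumes the cutoffs are moved one at a time, and every pitcher and gap in it is made up. A "gap" here is the estimator minus the actual ERA.

```python
BASE_CUTOFFS = {"k_bb": 0.6, "k_bb_2barrel": 0.4, "xfip": 0.7}


def passes(gaps, cutoffs):
    """True when every gap meets its cutoff."""
    return all(gaps[name] >= cutoffs[name] for name in cutoffs)


def on_the_bubble(gaps_by_pitcher, step=0.1):
    """Pitchers whose list membership flips when one cutoff moves by step."""
    bubble = set()
    for pitcher, gaps in gaps_by_pitcher.items():
        baseline = passes(gaps, BASE_CUTOFFS)
        for name in BASE_CUTOFFS:
            for delta in (-step, step):
                shifted = dict(BASE_CUTOFFS)
                shifted[name] += delta
                if passes(gaps, shifted) != baseline:
                    bubble.add(pitcher)
    return bubble


# Hypothetical gaps between each estimator and the actual ERA.
gaps_by_pitcher = {
    "Safely On": {"k_bb": 1.50, "k_bb_2barrel": 1.50, "xfip": 1.50},
    "Barely On": {"k_bb": 0.65, "k_bb_2barrel": 1.50, "xfip": 1.50},
    "Barely Off": {"k_bb": 1.50, "k_bb_2barrel": 0.35, "xfip": 1.50},
    "Safely Off": {"k_bb": 0.10, "k_bb_2barrel": 0.10, "xfip": 0.10},
}
bubble = on_the_bubble(gaps_by_pitcher)
```

Here "Barely On" falls off the list when the K-BB% cutoff rises to 0.7, and "Barely Off" joins it when the K-BB-2*Barrel% cutoff drops to 0.3, so both land on the bubble.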

Wheeler is someone I like, but it does appear that we are not quite factoring in his downside ERA risk. It should be noted that how a player is valued shouldn’t completely dictate where we draft him. Even though Wheeler is valued as roughly the 40th-best pitcher, it could still make sense to draft him around the 30th pitcher. However, seeing Wheeler on this list gives me enough pause that I will probably seek to lower the number of shares I have of him. Since he is on the bubble, though, I am probably not avoiding him completely either.

I was disappointed when Yarbrough showed up on this list, as he is a player I am heavily targeting. Thankfully, even factoring in Yarbrough’s ERA risk, he still appears to be worth drafting at his current ADP. Even so, it is probably wise (this is 100% advice for myself) to wait for a little bit longer to draft him, instead of jumping the gun multiple rounds ahead of his ADP.

The Dodgers

I made the Dodgers their own category solely because their projections have not been updated since the Trevor Bauer signing. I am avoiding both of these pitchers. May stands to be negatively affected by the Bauer signing, and Urias, even if somehow unaffected by Bauer, is still being drafted too high considering his ERA is likely to rise.

Properly Valued

There is only one pitcher left who we seem to be evaluating properly. I’m not particularly a Chris Bassitt guy, but Bassitt proved to be an undervalued commodity last year (even if some of that appears to be an artificially low ERA). That could prove to be the case again, but I like where Bassitt is currently being drafted.

Probably Overvalued

ATC finds both of these players way overvalued, but the median outcome disagrees. In the case of Keuchel, I side with ATC. It doesn’t take a rocket scientist to see Keuchel’s sub-2.00 ERA and know it is going to increase in 2021. Because no pitcher can keep an ERA that low, Keuchel's ERA was already due for some negative regression to begin with. I tend to think the median is undershooting the sizeable ERA shift Keuchel is bound to experience. I will be avoiding Keuchel until his ADP falls.

As for Keller, I am less sure. He is not going to have a 2.47 ERA, but valuing him at 144 seems a bit extreme considering his consistent track record. I don’t mind taking a shot on Keller as the 80th pitcher off the board, but, perhaps like Wheeler, it would be prudent to decrease the number of shares we have of him.

Definitely Overvalued

All three of these players are bound to experience negative ERA regression, and it would seem we are largely unprepared for it to happen. Simply put, there is no way you will find me drafting either Davies or Wainwright. I won’t be quite as bold with Max Fried. There is probably a way I would draft him. The way, however, must be hidden in some secret passageway, because I don’t see it. There are too many pitchers I would rather take in that range instead: Lynn, Burnes, Gray, Ryu, Carrasco, Strasburg, Hendricks, and Berrios. Hey! I’d take Fried over Lamet though (always got to end it the SP Streamer way).

P.S. Tom Brady won my favorite team 6 Super Bowls… I was just trying to be relatable!