• Josh St. Marie

Using Mario Party to Win Your Draft

In Nintendo’s Mario Party, each character comes with their own unique six-sided die. Daisy, for example, has a die with four 3s and two 4s, while Dry Bones has a die with three 1s and three 6s. The two dice have a similar expected value (average) on any given roll: Daisy 3.3, Dry Bones 3.5. However, Daisy provides a much more consistent and safe roll, while Dry Bones is a high-risk, high-reward choice. On average they may perform the same, but the two dice are fundamentally different. Understanding this simple concept is a key piece of successfully constructing a fantasy baseball roster in any format.
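A quick sanity check of the two dice makes the point concrete: nearly the same average, wildly different spread. A minimal sketch in Python:

```python
# Compare Daisy's and Dry Bones's dice: similar means, very different spread.
from statistics import mean, pstdev

daisy = [3, 3, 3, 3, 4, 4]      # four 3s and two 4s
dry_bones = [1, 1, 1, 6, 6, 6]  # three 1s and three 6s

for name, die in [("Daisy", daisy), ("Dry Bones", dry_bones)]:
    print(f"{name}: mean={mean(die):.2f}, std dev={pstdev(die):.2f}")
```

Daisy’s standard deviation is under half a pip; Dry Bones’s is a full 2.5. Same average, completely different risk profile.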

The analogy here mostly lies in a baseball player’s risk profile. If every potential outcome of a player’s season were put on a die, some players would roll the same number over and over again. These are your players with a safe floor but a limited ceiling (like Daisy). Then there are other players who oscillate between extremes, rolling high and low numbers but nothing in between. These are your boom-or-bust players (like Dry Bones).

I would like to extend this analogy one step further. When it comes to player evaluation, one helpful tool is projections. Projections give us an idea of what to expect from a player. In our Mario Party analogy, 3.3 and 3.5 would be our projections for Daisy and Dry Bones, respectively. Those projections rank Daisy and Dry Bones closely together, yet we know the two will perform nothing alike. We know this because we can see every potential outcome for Daisy and Dry Bones. Unfortunately, we do not possess the same omniscience when it comes to baseball players.

One way to attempt to overcome our limited-knowledge problem is to analyze a player’s historical performance and how it varies across seasons. To properly determine a player’s true potential outcomes, though, we have to make adjustments for their situation, the ball (i.e., how juiced it is), their luck, etc. Just gathering the data we need to make these adjustments can be a lot of work. On top of that, we would need a great deal of time to learn how to properly use that data, with the best results requiring years of refinement. We don’t have years. So what approach are we left with? Do we ignore all the underlying data that informs a player’s true performance and settle for a crude estimate of a player’s variability? I say no. Never settle.

In fact, I have already mentioned the solution to our problem – projections. There are many well-respected projection systems, each of which takes a slightly different approach to determine the expectation for a player. Therein lies the beautiful simplicity of our solution. The different approaches naturally yield different results (potential outcomes), and, best of all, combining the projections together accounts for all those underlying factors we were going to ignore earlier. We begin to fill out each player's die.

I compiled projections from five different systems: ATC, THE BAT (THE BATX for batters), Razzball, Steamer, and ZiPS.

Then, I compared the projections for nine of the ten most common roto categories (BA, HR, R, RBI, SB, W, K, ERA, and WHIP). I did not use saves, as, for good reason, neither THE BAT nor ZiPS projects them. Saves are probably the hardest statistic to predict, as they are almost completely determined by the situation. Seemingly every team has its own approach to handling saves. Worst of all, the strategy is often fluid – a new strategy every year (or even every month). This is why a new closer seems to pop up every week during the season.

Next, I calculated a Z-score for each statistic. A straight comparison of the raw statistics doesn’t make sense. For starters, take this rather extreme example: one projection system thinks the league-wide batting average will be .200 and another thinks it will be .300. If both systems project a certain player for a .250 batting average, they are not really saying the same thing – one sees him as well above average, the other as well below. Further, when it comes to counting statistics, it is important to keep in mind the extremity of a prediction. Let me explain. Say Projection System A predicts Player 1 will hit five home runs and Player 2 will hit forty. Projection System B, on the other hand, says Player 1 will hit six home runs and Player 2 will hit thirty-nine. If we compare the percentage differences, we treat the disagreement over Player 1 as far larger than the one over Player 2 (20% vs. 2.5%), even though both pairs of predictions are one home run apart and equally extreme. Using a Z-score solves both problems for us.
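The z-score idea is simple enough to show in a few lines. The numbers below are made up for illustration, not real projections – the point is only that each system’s projections get measured against that system’s own baseline:

```python
from statistics import mean, pstdev

# Hypothetical HR projections from one system for a small player pool.
hr = {"Player 1": 5, "Player 2": 40, "Player 3": 20, "Player 4": 25, "Player 5": 30}

mu = mean(hr.values())
sigma = pstdev(hr.values())

# Z-score: how many standard deviations each projection sits from the
# system's own average, so systems with different baselines compare fairly.
z = {name: (v - mu) / sigma for name, v in hr.items()}
```

Because every system is normalized against itself, a .250 hitter in a .200-average world and a .250 hitter in a .300-average world come out as very different z-scores, which is exactly what we want.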

After calculating the Z-scores, we simply measure the variability across the projection systems, convert the variabilities into scores between one and nine, and take an average of those scores to get an overall risk score – one being the safest of players, nine the riskiest. The results for the top 240 players by ADP are below. (By the time you get to this point in a draft, you should have the base of your team, and risk is less important. You can just draft your guys.)
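One plausible way to implement that conversion – and this is my sketch, not necessarily the exact method used for the table – is to take the spread of a player’s z-scores across systems and bucket it against the whole player pool using stanine-style percentile cutoffs, which naturally produces the bell-curve shape described below:

```python
import bisect
from statistics import pstdev

# Hypothetical z-scores for one player's HR projection from five systems.
z_by_system = [0.4, 1.1, -0.2, 0.7, 0.9]
spread = pstdev(z_by_system)  # disagreement between systems = risk signal

# Stanine-style cutoffs: 5 becomes the most common score, 1 and 9 the rarest.
CUTS = [0.04, 0.11, 0.23, 0.40, 0.60, 0.77, 0.89, 0.96]

def to_score(value, sorted_pool):
    """Rank `value` within the full player pool, then bucket it one to nine."""
    pct = bisect.bisect_left(sorted_pool, value) / len(sorted_pool)
    return 1 + sum(pct >= c for c in CUTS)

# Illustrative pool of spreads for a 100-player pool.
pool = sorted(i / 10 for i in range(100))
print(to_score(spread, pool))
```

Repeat per category, then average the per-category scores to get the overall risk score.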

Before you see the results, it is important to understand how to use them. Because of the way I determined the risk score for each player, the distribution of scores follows a bell curve. One and nine are the rarest and most extreme, while five is the most common and can be thought of as the standard player. Thus, I would interpret the results as follows:

1 – Most Safe

2 – Very Safe

3 – Safe

4 – Slightly Safe

5 – Average

6 – Slightly Risky

7 – Risky

8 – Very Risky

9 – Most Risky

Remember, risk is not bad. After all, the higher the risk, the higher the reward. You want players with risk on your team because they have the highest upside. However, you cannot make a team of only risky players. You need to diversify your risk, mixing in players who are safe. This is the key to a great draft: not only finding value but building a team with plenty of upside potential while limiting the downside. Making sure you have both upside and safe players will set your team up perfectly to compete during the season. As championships are not won in the draft, only lost.

Also, I created an additional metric that scores each player based on their aggregate projections. I created this score by averaging the Z-scores (instead of measuring their variability) and converting that average to a score between one and nine. This is an important piece of the puzzle when determining a player’s value, keeping in mind that the riskier the player, the more they can deviate from their predicted value.
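With made-up z-scores (hypothetical numbers, not real projections), the distinction between the two metrics comes down to mean versus spread of the same inputs:

```python
from statistics import mean, pstdev

# Hypothetical z-scores for one player across five projection systems.
z_by_system = [0.4, 1.1, -0.2, 0.7, 0.9]

value_signal = mean(z_by_system)   # how good the systems think he is -> value score
risk_signal = pstdev(z_by_system)  # how much they disagree -> risk score
print(round(value_signal, 2), round(risk_signal, 2))
```

Two players can share the same value score while carrying very different risk scores – the Daisy/Dry Bones situation all over again.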

Disclaimers: Projections were last updated 1/20. ADP last updated 1/25. Unfortunately, Ha-seong Kim does not have a projection.

The SP Streamer Newsletter