Robot Baseball — tuning the strike zone for maximum drama
Tags: puzzle, math, python
How I solved Jane Street’s October 2025 puzzle with one symmetry, a backward induction, and a 1D root-find.
Author: Victor S HUANG
Published: 04 Nov 2025
Another writeup I owed myself. The October 2025 Jane Street puzzle asked:
Games are composed of a series of independent at-bats in which the batter is trying to maximize expected score and the pitcher is trying to minimize expected score. An at-bat is a series of pitches with a running count of balls and strikes, both starting at zero. For each pitch, the pitcher decides whether to throw a ball or strike, and the batter decides whether to wait or swing; these decisions are made secretly and simultaneously. (…) An at-bat ends when either the count of balls reaches \(4\) (batter receives \(1\) point), the count of strikes reaches \(3\) (batter receives \(0\) points), or the batter hits a home run (batter receives \(4\) points). (…) Let \(q\) be the probability of at-bats reaching full count; \(q\) is dependent on \(p\). Assume the batter and pitcher are both using optimal mixed strategies and Quad-A has chosen the \(p\) that maximizes \(q\). Find this \(q\), the maximal probability at-bats reach full count, to 10 decimal places.
I solved it and want to write it up properly. The puzzle has two layers: a 2-by-2 game played at every count, and a single dial \(p\) that the league turns to maximise drama. Each layer is short on its own — what makes the puzzle nice is that they fit together cleanly.
Reading the rules
At each pitch the pitcher picks a probability \(t \in [0,1]\) of throwing a strike, and the batter picks a probability \(\sigma \in [0,1]\) of swinging. The four cells of the resulting 2-by-2 give
| pitcher \ batter | wait | swing |
|---|---|---|
| ball | balls \(+1\) | strikes \(+1\) |
| strike | strikes \(+1\) | home run with prob \(p\), else strikes \(+1\) |
An at-bat is a Markov chain on counts \((b, s)\) with \(b \in \{0,1,2,3\}\) and \(s \in \{0,1,2\}\). It terminates when \(b = 4\) (walk, \(1\) point), \(s = 3\) (strikeout, \(0\) points), or a home run (\(4\) points).
Let \(V(b, s)\) be the at-bat value to the batter under Nash equilibrium play. Boundary conditions are \(V(b, 3) = 0\) and \(V(4, s) = 1\).
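The state space and boundary conditions are small enough to write down directly. The snippet below is only a sketch: the names `LIVE_COUNTS`, `terminal_value` and `backward_order` are made up for illustration. It checks that sorting the live counts by decreasing \(b + s\) means both successors of a count are already known when that count is processed, which is the property backward induction relies on.

```python
# The 12 live counts: b balls in 0..3, s strikes in 0..2.
LIVE_COUNTS = [(b, s) for b in range(4) for s in range(3)]


def terminal_value(b, s):
    """Batter's payoff at a terminal count, or None if the at-bat is still live."""
    if s == 3:
        return 0.0  # strikeout
    if b == 4:
        return 1.0  # walk
    return None


# Process counts with the largest b + s first.
backward_order = sorted(LIVE_COUNTS, key=lambda c: c[0] + c[1], reverse=True)

solved = set()
for b, s in backward_order:
    for nb, ns in [(b + 1, s), (b, s + 1)]:
        # Each successor is either terminal or was solved earlier in the sweep.
        assert terminal_value(nb, ns) is not None or (nb, ns) in solved
    solved.add((b, s))

print(len(LIVE_COUNTS), backward_order[0], backward_order[-1])  # 12 (3, 2) (0, 0)
```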
A symmetry that does most of the work
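Write \(V_b = V(b+1, s)\) and \(V_s = V(b, s+1)\). Combining the four cells, the expected payoff at \((b, s)\) is
\[
E(t, \sigma) = (1-t)(1-\sigma)\,V_b + \bigl[(1-t)\sigma + t(1-\sigma)\bigr]\,V_s + t\sigma\,\bigl[(1-p)V_s + 4p\bigr].
\]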
Stare at that for a moment. It’s symmetric in \(t\) and \(\sigma\): swap them and nothing changes. The pitcher minimises this and the batter maximises it, but they’re playing against the same expression. So at equilibrium
\[
t^\star = \sigma^\star.
\]
That single observation collapses two unknowns per state into one. The Jane Street solution calls it out as the trick: “the outcome of a pitch is symmetric with respect to the pitcher’s choice and the batter’s choice.”
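The symmetry is easy to confirm numerically. In this sketch the function name `pitch_payoff` and the values chosen for \(V_b\), \(V_s\) and \(p\) are arbitrary, picked only for illustration. It writes out the four-cell expectation and checks that swapping the two mixing probabilities leaves it unchanged.

```python
def pitch_payoff(t, sigma, Vb, Vs, p):
    """Expected batter value of one pitch.

    t     : probability the pitcher throws a strike
    sigma : probability the batter swings
    Vb    : continuation value after a ball
    Vs    : continuation value after a strike
    p     : home-run probability on a swung-at strike
    """
    swung_strike = (1 - p) * Vs + 4 * p
    return (
        (1 - t) * (1 - sigma) * Vb                    # ball, wait
        + ((1 - t) * sigma + t * (1 - sigma)) * Vs    # ball+swing or strike+wait
        + t * sigma * swung_strike                    # strike, swing
    )


Vb, Vs, p = 0.9, 0.4, 0.25  # arbitrary illustrative values
for t, sigma in [(0.2, 0.7), (0.05, 0.95), (0.5, 0.5)]:
    forward = pitch_payoff(t, sigma, Vb, Vs, p)
    swapped = pitch_payoff(sigma, t, Vb, Vs, p)
    assert abs(forward - swapped) < 1e-12
```

Because the payoff matrix equals its own transpose, the pair of indifference equations that pin down an interior mixed equilibrium are the same equation for both players, which is why they share a solution.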
Solving the per-state game
In a mixed-strategy equilibrium of a \(2\times 2\) game, each player picks the probability that makes the opponent indifferent between their two pure actions. From the batter’s side, the batter is indifferent between always waiting (value \(V_b\) if the pitcher’s pitch lands as a ball, \(V_s\) if a strike — overall \((1-t)V_b + t V_s\)) and always swinging (value \((1-t)V_s + t[(1-p)V_s + 4p] = V_s + tp(4 - V_s)\)).
Setting those equal and solving for \(t = \sigma\) gives
\[
t^\star = \sigma^\star = \frac{V_b - V_s}{4p - (1+p)V_s + V_b}.
\]
And the equilibrium value (just plug \(t = t^\star\) into the always-swing expression; both pure actions give the same answer by indifference) is the very tidy
\[
V(b, s) = V_s + t^\star p\,(4 - V_s).
\]
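As a worked instance, take the full count \((3, 2)\), where \(V_b = 1\) and \(V_s = 0\). The sketch below uses a hypothetical helper `solve_state` and an arbitrary \(p = 0.25\). It solves the indifference condition in closed form and confirms that waiting and swinging are worth the same at the solution.

```python
def solve_state(Vb, Vs, p):
    """Solve one count's 2x2 game; return (t_star, value) with t_star = sigma_star."""
    # Indifference: (1 - t) Vb + t Vs = Vs + t p (4 - Vs), solved for t.
    t_star = (Vb - Vs) / (4 * p - (1 + p) * Vs + Vb)
    wait_value = (1 - t_star) * Vb + t_star * Vs
    swing_value = Vs + t_star * p * (4 - Vs)
    # At the equilibrium mix the batter is indifferent between the two actions.
    assert abs(wait_value - swing_value) < 1e-12
    return t_star, swing_value


# Full count: a ball is a walk (Vb = 1), a strike is a strikeout (Vs = 0).
t32, V32 = solve_state(1.0, 0.0, 0.25)
print(t32, V32)  # 0.5 0.5
```

At this count the formulas reduce to \(t^\star = 1/(1+4p)\) and \(V(3,2) = 4p/(1+4p)\), so \(p = 0.25\) gives one half for both.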
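With both players mixing at \(\sigma = \sigma^\star(b, s)\), a single pitch from \((b, s)\) has three possible outcomes:
\[
P(\text{ball}) = (1-\sigma)^2, \qquad P(\text{strike}) = 2\sigma - (1+p)\,\sigma^2, \qquad P(\text{home run}) = p\,\sigma^2.
\]
(The three add to \(1\), as a sanity check.) Starting from \((0,0)\) with mass \(1\) and pushing forward through this Markov chain, \(q\) is just the mass that lands on \((3, 2)\).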
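The per-pitch transition probabilities can be derived cell by cell and checked against the compact form. This is a small sketch with a made-up helper name `pitch_outcomes`, and it assumes both players mix at the same rate `sigma`.

```python
def pitch_outcomes(sigma, p):
    """Return (P(ball), P(strike), P(home run)) when pitcher and batter both mix at sigma."""
    p_ball = (1 - sigma) ** 2                  # ball thrown, batter waits
    p_home_run = p * sigma ** 2                # strike thrown, batter swings, ball leaves the park
    p_strike = (
        2 * sigma * (1 - sigma)                # ball+swing or strike+wait
        + (1 - p) * sigma ** 2                 # strike+swing without a home run
    )
    return p_ball, p_strike, p_home_run


for sigma in (0.1, 0.5, 0.9):
    for p in (0.1, 0.25, 0.9):
        ball, strike, home_run = pitch_outcomes(sigma, p)
        # The three outcomes are exhaustive.
        assert abs(ball + strike + home_run - 1.0) < 1e-12
        # Same strike probability as the compact form 2*sigma - (1 + p)*sigma**2.
        assert abs(strike - (2 * sigma - (1 + p) * sigma ** 2)) < 1e-12
```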
Backward, then forward, in code
import numpy as np

def solve(p):
    """Return q(p), the equilibrium swing probabilities, and the value table."""
    V = np.zeros((5, 4))
    V[4, :] = 1.0  # walk
    V[:, 3] = 0.0  # strikeout
    sigma = np.zeros((4, 3))
    # backward: sweep counts from the largest b + s down to (0, 0)
    for tot in range(5, -1, -1):
        for b in range(4):
            s = tot - b
            if not (0 <= s <= 2):
                continue
            Vb, Vs = V[b+1, s], V[b, s+1]
            sg = (Vb - Vs) / (4*p - (1+p)*Vs + Vb)
            sigma[b, s] = sg
            V[b, s] = Vs + sg * p * (4 - Vs)
    # forward: probability of reaching each state
    q = np.zeros((5, 4))
    q[0, 0] = 1.0
    for tot in range(6):
        for b in range(4):
            s = tot - b
            if not (0 <= s <= 2) or (b, s) == (3, 2):
                continue
            sg = sigma[b, s]
            p_strike = 2*sg - sg*sg*(1+p)
            p_ball = (1 - sg)**2
            q[b, s+1] += q[b, s] * p_strike
            q[b+1, s] += q[b, s] * p_ball
    return q[3, 2], sigma, V

q0, _, V = solve(0.5)
print(f"At p = 0.5: q = {q0:.6f}, V(0,0) = {V[0,0]:.6f}")
At p = 0.5: q = 0.184180, V(0,0) = 0.919179
A quick sanity check at \(p = 0.5\): half of swung-at strikes leave the park, which is far too generous to the batter. The batter’s expected score should be high, and reaching full count should be unlikely. Both come out as expected.
Optimising \(p\)
We want \(\arg\max_p q(p)\). It’s a one-dimensional problem and \(q\) is smooth on \((0, 1)\), so Brent’s method on the interior is more than enough.
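The search can be sketched without any dependencies. The block below is an illustration, not the code that produced the numbers in this post. It re-implements the backward and forward passes in plain Python and swaps in golden-section search for Brent's method, on the assumption that \(q\) has a single interior maximum on \([0.01, 0.99]\). The names `full_count_prob` and `golden_section_max` are made up for this sketch.

```python
def full_count_prob(p):
    """q(p): probability that an at-bat reaches (3, 2) under equilibrium play."""
    # Backward induction for values and mixing probabilities.
    V = [[0.0] * 4 for _ in range(5)]      # V[b][3] = 0 (strikeout)
    for s in range(3):
        V[4][s] = 1.0                      # walk
    sigma = [[0.0] * 3 for _ in range(4)]
    for b in range(3, -1, -1):
        for s in range(2, -1, -1):
            Vb, Vs = V[b + 1][s], V[b][s + 1]
            sg = (Vb - Vs) / (4 * p - (1 + p) * Vs + Vb)
            sigma[b][s] = sg
            V[b][s] = Vs + sg * p * (4 - Vs)
    # Forward pass: push probability mass from (0, 0) toward (3, 2).
    reach = [[0.0] * 4 for _ in range(5)]
    reach[0][0] = 1.0
    for b in range(4):
        for s in range(3):
            if (b, s) == (3, 2):
                continue
            sg = sigma[b][s]
            reach[b][s + 1] += reach[b][s] * (2 * sg - sg * sg * (1 + p))
            reach[b + 1][s] += reach[b][s] * (1 - sg) ** 2
    return reach[3][2]


def golden_section_max(f, lo, hi, tol=1e-10):
    """Maximise f on [lo, hi], assuming f is unimodal there."""
    inv_phi = (5 ** 0.5 - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:                        # maximum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:                              # maximum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


p_star = golden_section_max(full_count_prob, 0.01, 0.99)
q_star = full_count_prob(p_star)
print(f"p* = {p_star:.7f}, q* = {q_star:.10f}")
```

Because \(q\) is flat at its maximum, the maximiser is only pinned down to roughly the square root of floating-point precision, while the maximum value itself is far less sensitive to that slack.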
To ten decimal places, \(q^\star = 0.2959679934\), attained near \(p^\star \approx 0.2269732\).
Monte Carlo gut-check
Before I trust either the recursion or the optimiser, I want to play a few hundred thousand at-bats at the optimal \(p\) and the equilibrium \(\sigma\), and see the empirical full-count rate land on \(q^\star\).
rng = np.random.default_rng(20251004)

def simulate(p, sigma, n=400_000):
    full = 0
    for _ in range(n):
        b = s = 0
        while b < 4 and s < 3:
            if (b, s) == (3, 2):
                full += 1
                break
            sg = sigma[b, s]
            t = rng.random() < sg   # pitcher throws strike with prob sigma
            sw = rng.random() < sg  # batter swings with prob sigma
            if not t and not sw:
                b += 1
            elif t and not sw:
                s += 1
            elif not t and sw:
                s += 1
            else:
                if rng.random() < p:
                    break  # home run
                s += 1
    return full / n

_, sigma_star, _ = solve(p_star)
print(f"Empirical q at p* : {simulate(p_star, sigma_star):.4f}")
print(f"Analytic q at p* : {q_star:.4f}")
Empirical q at p* : 0.2962
Analytic q at p* : 0.2960
The two agree to within Monte Carlo noise. Good — both halves of the solver are honest.
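To put a number on "Monte Carlo noise": the full-count indicator is a Bernoulli variable, so the standard error of the empirical rate is \(\sqrt{q(1-q)/n}\). The sketch below plugs in the figures quoted above. The variable names are mine, and the empirical value is the rounded one printed by the simulation.

```python
import math

n = 400_000                 # simulated at-bats
q_analytic = 0.2959679934   # value quoted in the text
q_empirical = 0.2962        # rounded simulation output quoted above

# Standard error of a Bernoulli proportion estimated from n independent draws.
standard_error = math.sqrt(q_analytic * (1 - q_analytic) / n)
z = (q_empirical - q_analytic) / standard_error

print(f"standard error = {standard_error:.5f}, gap = {z:.2f} standard errors")
```

The standard error is on the order of \(7 \times 10^{-4}\), so a gap of about \(2 \times 10^{-4}\) is well inside one standard error.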
A picture of the optimum
import matplotlib.pyplot as plt

ps = np.linspace(0.01, 0.99, 200)
qs = np.array([solve(pp)[0] for pp in ps])

fig, ax = plt.subplots(figsize=(7, 5))
ax.plot(ps, qs, color="#1f77b4", lw=2)
ax.axvline(p_star, color="black", lw=0.8, ls="--", alpha=0.7)
ax.axhline(q_star, color="black", lw=0.8, ls="--", alpha=0.7)
ax.plot([p_star], [q_star], "ko", ms=6)
ax.annotate(
    f"$p^\\star \\approx {p_star:.4f}$\n$q^\\star \\approx {q_star:.10f}$",
    xy=(p_star, q_star),
    xytext=(p_star + 0.15, q_star - 0.04),
    arrowprops=dict(arrowstyle="->", color="black", lw=0.8),
)
ax.set_xlabel("home-run probability $p$ on a swung strike")
ax.set_ylabel("probability of reaching full count")
ax.set_xlim(0, 1)
ax.set_ylim(0, max(qs) * 1.1)
ax.grid(True, alpha=0.3)
plt.savefig("thumbnail.png", dpi=110, bbox_inches="tight")
plt.show()
Figure 1: Probability \(q(p)\) of an at-bat reaching a full count, under optimal mixed-strategy play. The maximum sits at \(p^\star \approx 0.227\) with \(q^\star \approx 0.2960\).
The shape makes sense. As \(p \to 0\), batters never bother swinging and pitchers always throw strikes, so at-bats end in three pitches (strikeouts) — full counts are rare. As \(p \to 1\), every swung strike is a home run; batters swing freely and at-bats end fast for the other reason. The sweet spot for length sits in between, just below \(p = \tfrac{1}{4}\).
What made it nice
The trick was the \(t \leftrightarrow \sigma\) symmetry. Once you spot that, every state collapses to a one-variable indifference equation, and the value recursion fits in a few lines of code. Wrapping it in a 1D optimiser over \(p\) does the rest.
The official Jane Street solution follows the same backward-induction-plus-1D-search and lands on the same answer, \(0.2959679934\).