Robot Javelin — counter-exploiting a one-bit information leak

puzzle

math

python

How I solved Jane Street’s December 2025 puzzle with one indifference equation, one threshold shift, and a single derivative.

Author

Victor S HUANG

Published

04 Jan 2026

Another writeup I owed myself. The December 2025 Jane Street puzzle asked:

Two robots throw a javelin head-to-head. Each makes a first throw drawn uniformly from \([0, 1]\), then privately decides whether to keep it or take a fresh second throw (also uniform on \([0, 1]\), must be kept). Higher final distance wins. Historically robots play the Nash equilibrium of this game. But Spears Robot has a one-bit leak on you (Java-lin): they receive a bit telling whether your first throw was above or below a threshold \(d\) of their choosing, and they pick \(d\) to maximise their winning probability assuming you still play Nash. They don’t know that you know about the leak. Adjust your strategy to maximise your winning probability. Give the answer to 10 decimal places.

I solved it and want to write it up properly. The puzzle is a clean exploit/counter-exploit story. There’s a fair-game Nash baseline, then one player gains a bit of information, then the other player learns of the leak and shifts a single threshold to claw the edge back.

The Nash baseline

Each robot’s strategy is a single number: a rethrow threshold \(t\). Keep the first throw if \(X \ge t\), otherwise take a fresh throw. (Why a threshold? Holding the opponent’s strategy fixed, the value of keeping \(x\) is increasing in \(x\) while the value of rethrowing is constant — so the keep/rethrow decision flips at one cutoff.)

If a robot uses threshold \(t\), its final score \(Y\) has CDF

\[ F(y;\,t) \;=\; \begin{cases} t\,y & y \le t,\\ y(1+t) - t & y > t. \end{cases} \]

(Below \(t\), the only way to land at \(y\) is a rethrow, with weight \(t\); above \(t\), kept first throws contribute too.)

In a symmetric Nash equilibrium both robots use the same \(t\), and the player at the cutoff is indifferent between keeping and rethrowing. The win probability from keeping a score of exactly \(t\) is \(F(t; t) = t^2\). The win probability from rethrowing is \(\int_0^1 F(r; t)\, dr = (1 - t + t^2)/2\). Setting them equal,

\[ t^2 \;=\; \tfrac{1 - t + t^2}{2}\quad\Longrightarrow\quad t^2 + t - 1 = 0\quad\Longrightarrow\quad t = \varphi = \tfrac{\sqrt{5} - 1}{2} \approx 0.618. \]

The golden-ratio conjugate. At Nash both robots win half the time.
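As a quick numerical cross-check of the indifference condition, here is a minimal sketch in plain Python. The helper names (`final_cdf`, `rethrow_win`, `nash_threshold`) are my own labels, not anything from the puzzle, and the bisection assumes that keep-minus-rethrow, \(t^2 - (1 - t + t^2)/2\), is increasing on \([0, 1]\).

```python
import math

def final_cdf(y, t):
    """CDF F(y; t) of the final score for a robot that rethrows below t."""
    return t * y if y <= t else y * (1 + t) - t

def rethrow_win(t, n=20000):
    """Midpoint-rule estimate of the integral of F(r; t) over r in [0, 1]."""
    return sum(final_cdf((i + 0.5) / n, t) for i in range(n)) / n

def nash_threshold(iters=50):
    """Bisect on the cutoff where keeping t and rethrowing win equally often."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        # Keeping exactly `mid` wins with F(mid; mid); rethrowing uses the closed form.
        if final_cdf(mid, mid) < (1 - mid + mid * mid) / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t = nash_threshold()
print(f"threshold from bisection: {t:.10f}")
print(f"(sqrt(5) - 1) / 2:        {(math.sqrt(5) - 1) / 2:.10f}")
print(f"keep vs rethrow at t:     {final_cdf(t, t):.8f} vs {rethrow_win(t):.8f}")
```

The last line compares the closed-form keep value with a numerically integrated rethrow value, so it also checks the \((1 - t + t^2)/2\) formula independently.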

Spears’ leak

Spears picks \(d\), then gets the bit \(b = \mathbb{1}[X_J > d]\) before deciding whether to rethrow. Under the assumption that Java-lin plays Nash with threshold \(\varphi\), the most informative choice is \(d = \varphi\): the bit then exactly distinguishes “Java-lin rethrew” from “Java-lin kept”. With \(d < \varphi\), a bit of \(1\) lumps some rethrown first throws in with the kept ones; with \(d > \varphi\), a bit of \(0\) lumps some kept throws in with the rethrown ones. Either way Spears faces a blend of the two cases on one side of the bit. (You can verify this by writing out Spears’ win probability as a function of \(d\) and checking that \(d = \varphi\) is the maximum; the derivative changes sign there.)
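One way to run that check numerically is sketched below, using only the standard library. It assumes Java-lin plays the Nash threshold \(\varphi\) and that Spears best-responds to each bit value separately. The function name `spears_win_prob`, the grid sizes and the scan range for \(d\) are all my own choices.

```python
import math

PHI = (math.sqrt(5) - 1) / 2  # Java-lin's assumed Nash threshold

def spears_win_prob(d, n=2000):
    """Spears' best-response win probability when the bit is 1[X_J > d]."""
    dz = 1.0 / n
    zs = [i * dz for i in range(n + 1)]
    # h0(z) = P(Y_J <= z and X_J <= d); h1(z) = P(Y_J <= z and X_J > d).
    h0 = [min(d, PHI) * z + max(0.0, min(z, d) - PHI) for z in zs]
    full = [PHI * z if z <= PHI else z * (1 + PHI) - PHI for z in zs]
    h1 = [f - a for f, a in zip(full, h0)]
    total = 0.0
    for h in (h0, h1):
        # cum[i] approximates the integral of h over [0, zs[i]] (trapezoid rule).
        cum = [0.0]
        for i in range(n):
            cum.append(cum[-1] + 0.5 * (h[i] + h[i + 1]) * dz)
        area = cum[-1]
        # Threshold r: rethrow with probability r (winning `area` on average),
        # otherwise keep a first throw x >= r and win with h(x).
        total += max(r * area + (area - c) for r, c in zip(zs, cum))
    return total

ds = [0.40 + 0.005 * k for k in range(81)]  # scan d over [0.40, 0.80]
best_d = max(ds, key=spears_win_prob)
print(f"best d on the grid: {best_d:.3f}   (phi = {PHI:.3f})")
print(f"Spears' win probability at d = phi: {spears_win_prob(PHI):.6f}")
```

With a grid step of 0.005 the maximiser should land on a grid point next to \(\varphi\). The value at \(d = \varphi\) is Spears’ edge over an opponent who keeps playing Nash.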

So under Spears’ (mistaken) belief, after observing \(b\) Java-lin’s final score is

\(b = 0\): \(Y_J \sim U(0, 1)\) (Java-lin rethrew).
\(b = 1\): \(Y_J \sim U(\varphi, 1)\) (Java-lin kept).

Spears now picks a rethrow threshold \(r_b\) for each value of \(b\), again by indifference.

\(b = 0\). Java-lin’s score is uniform on \([0,1]\). Keeping \(x_S\) wins with probability \(x_S\); rethrowing wins with probability \(1/2\). So \(r_0 = 1/2\).

\(b = 1\). Java-lin’s score is uniform on \([\varphi, 1]\). Keeping \(x_S \ge \varphi\) wins with probability \((x_S - \varphi)/(1 - \varphi)\); rethrowing wins with probability \((1 - \varphi)/2\). Setting them equal,

\[ r_1 \;=\; \varphi + \tfrac{(1-\varphi)^2}{2} \;=\; 1 - \tfrac{\varphi}{2} \;=\; \tfrac{5 - \sqrt{5}}{4} \approx 0.691, \]

using \(\varphi^2 = 1 - \varphi\).
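The two indifference thresholds can be reproduced with a few lines of plain Python. This is a sketch under Spears’ belief that Java-lin’s score is uniform on \([\ell, 1]\) with \(\ell = 0\) or \(\ell = \varphi\). The names `keep_win`, `rethrow_win` and `indifference_threshold` are hypothetical helpers, not part of the original solution.

```python
import math

PHI = (math.sqrt(5) - 1) / 2

def keep_win(x, lo):
    """P(opponent's score < x) when that score is uniform on [lo, 1]."""
    return min(1.0, max(0.0, (x - lo) / (1 - lo)))

def rethrow_win(lo):
    """Win probability of a fresh uniform throw against a U(lo, 1) score."""
    return (1 - lo) / 2

def indifference_threshold(lo, iters=80):
    """Bisect for the first throw at which keeping and rethrowing tie."""
    a, b = 0.0, 1.0
    for _ in range(iters):
        mid = (a + b) / 2
        if keep_win(mid, lo) < rethrow_win(lo):
            a = mid
        else:
            b = mid
    return (a + b) / 2

r0 = indifference_threshold(0.0)   # bit b = 0: Java-lin believed to have rethrown
r1 = indifference_threshold(PHI)   # bit b = 1: Java-lin believed to have kept
print(f"r0 = {r0:.6f}")
print(f"r1 = {r1:.6f}, 1 - phi/2 = {1 - PHI / 2:.6f}, (5 - sqrt(5))/4 = {(5 - math.sqrt(5)) / 4:.6f}")
```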

Java-lin’s counter-shift

Now Java-lin secretly knows about the leak. Spears is committed to thresholds \(r_0 = 1/2\) and \(r_1 = 1 - \varphi/2\), regardless of what Java-lin does. Java-lin picks a new threshold \(T\) to maximise its win probability. Crucially, the bit Spears receives is still \(b = \mathbb{1}[X_J > \varphi]\) (because Spears chose \(d = \varphi\) before this game began).

Suppose \(T \le \varphi\). Java-lin’s behaviour splits into three intervals:

\(X_J\) in	Java-lin does	Bit \(b\)	Spears’ threshold
\([0, T)\)	rethrow	\(0\)	\(r_0 = 1/2\)
\([T, \varphi]\)	keep	\(0\)	\(r_0 = 1/2\)
\((\varphi, 1]\)	keep	\(1\)	\(r_1 = 1 - \varphi/2\)

The middle row is the trick. On \([T, \varphi]\) Java-lin keeps a score that the Nash version would have thrown away. Spears, still believing Java-lin played Nash, expects a uniform rethrow from these \(X_J\) values and sets \(r_0 = 1/2\) — leaving Java-lin’s actual score in \([T, \varphi]\) pleasantly above Spears’ rethrow target. The leak says “Java-lin rethrew”, but Java-lin didn’t.
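The three rows can be spelled out as a small lookup, sketched here in plain Python. The function name `round_profile` and the trial value \(T = 0.59\) are purely illustrative; any \(T \le \varphi\) produces the same three cases.

```python
import math

PHI = (math.sqrt(5) - 1) / 2   # Spears' leak threshold d
R0, R1 = 0.5, 1 - PHI / 2      # Spears' rethrow thresholds for b = 0 and b = 1

def round_profile(x_j, T):
    """Return (Java-lin's action, leaked bit, Spears' threshold) for first throw x_j."""
    action = "keep" if x_j >= T else "rethrow"
    bit = 1 if x_j > PHI else 0   # the bit only sees the first throw, not the action
    return action, bit, (R1 if bit else R0)

# One first throw from each row of the table, with an illustrative T = 0.59.
for x_j in (0.30, 0.60, 0.80):
    print(x_j, round_profile(x_j, T=0.59))
```

The middle case, \(x_J = 0.60\), is the one where Java-lin keeps its throw, the bit still reads \(0\), and Spears responds as if facing a fresh uniform throw.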

Let \(F_S(y; r) = ry\) for \(y \le r\) and \(y(1+r) - r\) for \(y > r\) be Spears’ CDF when using threshold \(r\). Java-lin’s win probability is

\[ W(T) \;=\; T \cdot \underbrace{\int_0^1 F_S(r; r_0)\, dr}_{\text{Java-lin rethrows}} \;+\; \int_T^{\varphi} F_S(x; r_0)\, dx \;+\; \int_\varphi^1 F_S(x; r_1)\, dx. \]

The first integral evaluates to \((1 - r_0 + r_0^2)/2 = 3/8\). The third term doesn’t depend on \(T\), and the second depends on \(T\) only through its lower limit. Differentiating,

\[ W'(T) \;=\; \tfrac{3}{8} \;-\; F_S(T; \tfrac{1}{2}) \;=\; \tfrac{3}{8} \;-\; \bigl(\tfrac{3}{2}T - \tfrac{1}{2}\bigr) \quad (T \ge \tfrac{1}{2}). \]

Setting \(W'(T) = 0\) gives

\[ \boxed{\,T^\star \;=\; \tfrac{7}{12}.\,} \]

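And \(7/12 \approx 0.583 < \varphi\), so the assumption \(T \le \varphi\) holds, and \(W''(T) = -3/2 < 0\) on \([1/2, \varphi]\) makes this stationary point a maximum there. Below \(T = 1/2\) nudging \(T\) up still helps: there \(F_S(T; \tfrac{1}{2}) = r_0 T = T/2 \le 1/4 < 3/8\), so \(W'(T) > 0\). Above \(\varphi\), the extra rethrown first throws send \(b = 1\), so Spears answers with \(r_1\); a rethrow then wins with probability \((1 - r_1 + r_1^2)/2 \approx 0.393\), less than the \(F_S(T; r_1) \ge r_1 \varphi \approx 0.427\) from keeping, so \(W\) falls as \(T\) rises past \(\varphi\). Hence \(7/12\) is the unique maximiser on \([0, 1]\).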
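To check the optimum over the whole range of \(T\), including \(T > \varphi\), here is a stdlib-only sketch that evaluates \(W(T)\) from closed-form antiderivatives and scans a grid. The helpers `prim` and `area` and the grid of step \(1/1200\) (chosen so that \(7/12\) is a grid point) are my own. The branch for \(T > \varphi\) assumes that first throws in \((\varphi, T)\) are rethrown while still sending \(b = 1\).

```python
import math

PHI = (math.sqrt(5) - 1) / 2
R0, R1 = 0.5, 1 - PHI / 2

def prim(x, r):
    """Integral of Spears' CDF F_S(u; r) for u from 0 to x."""
    if x <= r:
        return r * x * x / 2
    return r**3 / 2 + (1 + r) * (x * x - r * r) / 2 - r * (x - r)

def area(a, b, r):
    """Integral of F_S(u; r) for u from a to b."""
    return prim(b, r) - prim(a, r)

def W(T):
    """Java-lin's win probability with rethrow threshold T in [0, 1]."""
    if T <= PHI:
        # Rethrow on [0, T), keep on [T, phi] (bit 0), keep on (phi, 1] (bit 1).
        return T * prim(1, R0) + area(T, PHI, R0) + area(PHI, 1, R1)
    # For T > phi, first throws in (phi, T) are rethrown but still send b = 1.
    return PHI * prim(1, R0) + (T - PHI) * prim(1, R1) + area(T, 1, R1)

grid = [i / 1200 for i in range(1201)]
best_T = max(grid, key=W)
print(f"grid maximiser: {best_T:.6f}, W = {W(best_T):.10f}")
print(f"W at the Nash threshold phi: {W(PHI):.10f}")
```

Comparing the two printed values shows how much the threshold shift gains over simply playing Nash into the leak.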

Solving in code

from sympy import Rational, sqrt, simplify, integrate, Symbol, Piecewise, nsimplify

x, T = Symbol("x", positive=True), Symbol("T", positive=True)
phi = (sqrt(5) - 1) / 2
r0  = Rational(1, 2)
r1  = 1 - phi/2

def F_S(y, r):
    return Piecewise((r*y, y <= r), (y*(1 + r) - r, True))

# Win prob with Java-lin threshold T (assumes 1/2 <= T <= phi):
W_expr = (
    T * Rational(3, 8)
    + integrate(F_S(x, r0), (x, T, phi))
    + integrate(F_S(x, r1), (x, phi, 1))
)
W_star = simplify(W_expr.subs(T, Rational(7, 12)))
print("Closed form:", nsimplify(W_star))
print(f"Decimal:    {float(W_star):.12f}")
print(f"Submission: {float(W_star):.10f}")

Closed form: 229/192 - 5*sqrt(5)/16
Decimal:    0.493937090365
Submission: 0.4939370904

Tidied up, the closed form is

\[ W^\star \;=\; \frac{229 - 60\sqrt{5}}{192} \;=\; 0.4939370904\ldots \]

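Just below \(1/2\). Spears’ bit is worth more than Java-lin’s counter-shift can recover: playing Nash into the leak would win \(W(\varphi) = \varphi - 1/8 \approx 0.4930\), and the shift to \(7/12\) lifts that to about \(0.4939\), still roughly \(0.006\) short of the fair-game baseline of \(0.5\).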
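Since the puzzle asks for 10 decimal places, it may be worth evaluating the closed form in higher precision than a float. A minimal sketch with the standard `decimal` module follows; the precision of 40 digits is an arbitrary, generous choice.

```python
from decimal import Decimal, getcontext

getcontext().prec = 40  # far more digits than the 10 we need

# Closed form (229 - 60*sqrt(5)) / 192 evaluated in arbitrary precision.
w_star = (Decimal(229) - 60 * Decimal(5).sqrt()) / Decimal(192)
submission = w_star.quantize(Decimal("1e-10"))  # round to 10 decimal places

print("High precision:", w_star)
print("Submission:    ", submission)
```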

Monte Carlo gut-check

Before I trust the derivation, I want to play two million games at \(T = 7/12\) against Spears’ fixed thresholds and watch the empirical win rate land on \(W^\star\).

import numpy as np

rng = np.random.default_rng(20251204)
phi_f = float(phi)
r0_f, r1_f = 0.5, float(r1)
T_f = 7/12

N = 2_000_000
XJ = rng.uniform(0, 1, N)
XS = rng.uniform(0, 1, N)
RJ = rng.uniform(0, 1, N)
RS = rng.uniform(0, 1, N)

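# Java-lin keeps first throws at or above T, otherwise takes the second throw.
YJ = np.where(XJ >= T_f, XJ, RJ)
# The leaked bit compares the FIRST throw with d = phi, whatever Java-lin then does.
bit = XJ > phi_f
# Spears uses the rethrow threshold that matches the bit.
r_used = np.where(bit, r1_f, r0_f)
YS = np.where(XS >= r_used, XS, RS)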

wins = float((YJ > YS).mean())
print(f"Empirical win prob: {wins:.6f}")
print(f"Analytic  win prob: {float(W_star):.6f}")

Empirical win prob: 0.493868
Analytic  win prob: 0.493937

The two agree to within Monte Carlo noise, which is a relief — small mistakes in the bit-conditioning logic are easy to make.
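To put a number on “Monte Carlo noise”, the binomial standard error for this sample size can be computed directly. The sketch below copies the two win probabilities from the run shown above and treats the games as independent Bernoulli trials.

```python
import math

N = 2_000_000
empirical, analytic = 0.493868, 0.493937  # the two values printed by the simulation

# Standard error of a sample proportion over N independent games.
std_err = math.sqrt(analytic * (1 - analytic) / N)
z = (empirical - analytic) / std_err

print(f"standard error: {std_err:.6f}")
print(f"z-score:        {z:+.2f}")
```

A z-score well inside \(\pm 2\) is what you would expect if the analytic value is right.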

A picture of the optimum

import matplotlib.pyplot as plt

def F_S_num(y, r):
    return np.where(y <= r, r*y, y*(1+r) - r)

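def W(T):
    # Java-lin's win probability for 0 <= T <= phi, by trapezoid quadrature.
    # np.trapezoid needs NumPy >= 2.0; older versions call it np.trapz.
    rethrow_win = (1 - r0_f + r0_f**2) / 2   # fresh throw against Spears at r0
    xs = np.linspace(T, phi_f, 400)
    mid = np.trapezoid(F_S_num(xs, r0_f), xs)   # kept throws in [T, phi], bit 0
    xs = np.linspace(phi_f, 1, 400)
    top = np.trapezoid(F_S_num(xs, r1_f), xs)   # kept throws in (phi, 1], bit 1
    return T * rethrow_win + mid + top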

Ts = np.linspace(0.30, phi_f, 200)
Ws = np.array([W(T) for T in Ts])
T_star = 7/12
W_num = float(W_star)

fig, ax = plt.subplots(figsize=(7, 5))
ax.plot(Ts, Ws, color="#1f77b4", lw=2)
ax.axvline(phi_f, color="grey", lw=0.8, ls=":", alpha=0.8, label=r"Nash $\varphi$")
ax.axvline(T_star, color="black", lw=0.8, ls="--", alpha=0.7)
ax.axhline(W_num,  color="black", lw=0.8, ls="--", alpha=0.7)
ax.plot([T_star], [W_num], "o", color="#d62728", ms=7)
ax.annotate(
    f"$T^\\star = 7/12$\n$W^\\star \\approx {W_num:.10f}$",
    xy=(T_star, W_num), xytext=(T_star - 0.18, W_num - 0.012),
    arrowprops=dict(arrowstyle="->", color="black", lw=0.8),
)
ax.set_xlabel("Java-lin's rethrow threshold $T$")
ax.set_ylabel("win probability")
ax.legend(loc="lower right")
ax.grid(True, alpha=0.3)
plt.savefig("thumbnail.png", dpi=110, bbox_inches="tight")
plt.show()

Figure 1: Java-lin’s win probability as a function of its rethrow threshold \(T\), against Spears’ fixed exploit. The red dot marks the counter-exploit optimum \(T^\star = 7/12\); the grey line shows the Nash threshold \(\varphi\) for reference.

The shape makes sense. Pushing \(T\) down from \(\varphi\) widens the kept band \([T, \varphi]\), exactly the band Spears mis-reads as a uniform rethrow, and that pays until \(T\) reaches \(7/12\). Below that, a kept score of \(T\) beats Spears with probability \(F_S(T; \tfrac{1}{2}) < \tfrac{3}{8}\), less than a fresh throw would win, so widening the band further costs more than it gains.

What made it nice

The trick was the asymmetry of belief. Spears’ threshold \(d = \varphi\) is locked in by Spears’ assumption that Java-lin plays Nash; Java-lin can break that assumption without Spears noticing, and the bit Spears receives turns into mild misinformation on the interval \([T, \varphi]\). One derivative picks the right \(T\), and the rest is integrals.

The official Jane Street solution follows the same path and lands on the same answer, \(0.4939370904\).