During World War II, Allied forces readily admitted that German tanks were superior to their own. The big question for Allied forces, then, was how many tanks Germany was producing. Here's how they reverse-engineered serial numbers to find out.

To solve the problem of determining production numbers, the Allied forces initially tried conventional intelligence gathering: spying, intercepting and decoding transmissions and interrogating captured enemies.

Via this method, the Allies deduced that from June 1940 to September 1942, the German military industrial complex churned out around 1,400 tanks each month. That number just didn't seem right. To put that number in context, Axis forces used "only" 1,200 tanks during the Battle of Stalingrad, an eight month battle with a total of almost two million casualties. So that number of 1,400 was likely way too high.

Obviously skeptical of the above result, the Allies looked for other methods of estimation. And then they found a critical clue: serial numbers.

Allied intelligence noticed each captured German tank contained a serial number unique to the tank. With careful observation, the Allies were able to determine that the serial numbers had a pattern denoting the order of tank production.


Using this data, the Allies were able to create a mathematical model to determine the rate of German tank production, and estimated that, during the same summer 1940 to fall 1942 time period, the Germans produced 255 tanks per month — a fraction of the 1,400 estimate.

And it turns out, the serial number methodology was spot on: after the War, internal German data put der Fuhrer's production numbers at 256 tanks per month — one more than the estimate.

Here's the math:

Suppose one is an Allied intelligence analyst during World War II, and one has some serial numbers of captured German tanks. Further, assume that the tanks are numbered sequentially from 1 to N. How does one estimate the total number of tanks?

For point estimation (estimating a single value for the total), the minimum-variance unbiased estimator (MVUE, or UMVU estimator) is given by:

where m is the largest serial number observed (sample maximum) and k is the number of tanks observed (sample size). Note that once a serial number has been observed, it is no longer in the pool and will not be observed again.

This has a variance of:

so a standard deviation of approximately N/k, the (population) average size of a gap between samples; compare m/k above.


And that's how mathletes helped the Allied forces beat the Germans.

[Guardian, GongShow, Now I Know]

Photo Credit: damopabe @ Flickr