An example of MLE giving poor estimate


Just came across this probability puzzle and try it on GPT, it uses MLE to solve it and gives a very poor estimate. The question goes like this

A bus company with bus route number starting from 1,2,\cdots,N. Say we came across one bus with the number 60. What is a good estimate of the number of bus routes N.

If we use maximum likelihood estimation, we can consider maximizing

p(O=60|N)=\frac{1}{N} assuming that the distribution is uniform

Since the observation O is 60, N has to be at least 60. Maximizing p(O=60|N) gives us 60.

This of course is a very bad estimate, intuitively, the chance of coming across the bus with the largest bus number is rather unlikely. And considering the uniform distribution, it is reasonable to assume that we come across the average N/2 instead. Consequently, since there are 59 numbers below 60, we expect N=60+59=119.

A similar trick can be used in a more complex case. Assume that instead of observing one bus, we observe 5 buses and the largest number among all is 60. What will be a good estimate of N now?

Since 60/5=12, we can consider the five numbers are uniformly spread to 12, 24, 36, 48, and 60. And thus N should be 60+11=71. This is a quick estimate but is actually quite good. When N=71, the average of the largest among 5 buses is actually 59.66, which can be estimated by the Lea code below

import lea

bus=lea.vals(*range(1,72))

bus1=bus.new()
bus2=bus.new()
bus3=bus.new()
bus4=bus.new()
bus5=bus.new()

lea.max_of(bus1,bus2,bus3,bus4,bus5).mean()

For arbitrary number of buses, we can compute the average with the function below

def average_max(N,d):
    T=0;
    for n in range(1,N+1):
        T+=n*(n**d-(n-1)**d)
    return T/(N**d)

where the code is quite easy to verify noting that the prob of getting max equal to n for d buses is \frac{n^d - (n-1)^d}{N^d}.

Copyright OU-Tulsa Lab of Image and Information Processing 2024
Tech Nerd theme designed by Siteturner