The Dating Problem, Revisited.

Peijin Chen
4 min readFeb 25, 2021

How many dates should you go on before you settle down?

The classic “secretary” problem has many names and guises but it’s essentially a mathematical decision theory problem that goes like this: imagine you are interviewing a pool of candidates, and you want to find the best one, but you are under time constraints. The interviewees arrive in random order, and you evaluate each one with a “score” — but you have to decide if you want to hire them on the spot, or else pass them up and move to the next person. No backsies! So…what’s your algorithm for nabbing the best person? You can use this to model dating, of course, since you’re just uh, hiring someone to be your life partner.

The mathematicians’ answer to this is, rather magically, N/e, where N is the number of candidates (assuming you know this) and e is Euler’s number. This works out to 37% or so, meaning this is how many of the candidates you should go through to establish your “benchmark” —I call it the “burn-in” period — and once you do, then just pick the next person after that who is better than everyone you’ve seen in the burn-in period. This is called an optimal stopping problem, and the solution says that you can expect to come out with the best candidate 37% of the time. Of course, that’s assuming a uniform distribution for the candidates. I tamper with this assumption in the experiments I detail below.

The problem is that if time is money, then it’s not only the 37% of your time that you spend, but also the time it takes to actually find someone that meets the benchmark. I mean, maybe you got lucky and met the best person in the burn in period, an outlier type, and it will take awhile before you find someone of that caliber again. So let’s work in a cost. Let’s see how long it actually takes to find that person.

The code that I came up with for this problem can be found at the end of this post. Here’s the general setup that I came up with.

  • There are 100 candidates, with scores drawn i.i.d from a Poisson(lambda = 100) distribution. Generally, this means a fairly normal-shaped curve, with max scores ~ 126.
  • We can decide if we want our final candidate to be better than the best of the burn-in period — or if we would settle for someone that is, say, in the 90th percentile of the burn-in candidates.
  • We will “grid-search” through various burn-in periods and compute how long it actually takes before we select the final candidate.
Peijin Chen

Machine Learning Engineer