python - hackinghat.com

We’re all Santas now

Secret Santa, if you didn’t know, is a gift-giving game that seeks to reduce the overall burden of buying gifts to a single gift for one person instead of a gift for every person. It also, usually, place limits on the spend which prevents people over or under buying.

For most people the solution to running a ‘Secret Santa’ is pretty simple but it’s worth repeating the process for clarity:

Write the names of your friends/family/colleagues on scraps of paper;
Fold the names of the Santas up and put them in your (hacking!) hat;
Each Santa takes it in turn to pull out a name from the hat;
If any Santa gets themself as their match return to step 2;
Buy gifts and exchange them sometime later.

The ‘secret’ part refers to the fact that no-one should know, who is their Santa or any other pairing. It usually also means that you never find out who bought you your gift, but that part is optional and there is often plenty of fun to be had discussing it afterwards.

Trouble at the North Pole

Problems start to arise when you can’t get all of the people who want to participate in the draw in the same physical room at the same time.

Your first thought might be (like mine was) “well surely there’s an app for that where you tap in all the names and the app privately sends out the matches”.

And you’d be right, there are plenty. But what if the person who is remote to the draw doesn’t have a usable mobile device or use email? I know this is hard to believe but these people do exist. And believe me, I have tried to fix this problem by fixing the bigger problem but progress has been slow. So, for now at least, apps and email are out.

Your second thought might be (like mine also was) “I could just use snail mail to send out the draw to the remote persons”. But the problem with this is that there is a chance, if one person or more is remote then one or more of those remote persons get themselves in that draw. This would reset the draw and mean we’d need to do it over.

So obviously I can fix that by sending out multiple draws and selecting the first one that works. Right? So the question then becomes how many trial draws do I need to be certain of a valid gift giving extravaganza?

99% Confident Santas

So how many draws do we need, for us to be 99% certain that those pre-draws will be successful and give us valid pairings? This problem description is a special case of the Binomal distribution which has the following general formula to find the probability of getting exactly r successes from n trials with probability p:

nCr . p^{r}.(1-p)^{n-r}

But what we actually want is the probability of 1 or more successes, which is the logical opposite statement of the probability of no successes at all. When you substitute that into the Binomial formula it simplifies to:

1-(1-p)^n

But what is p?

One way to estimate it is if you write out by hand the set of valid outcomes for different sizes of groups we can get an idea of what sort of values p takes.

n	Valid	Permutations (n!)	p (approx)
1	0	1	0
2	1	2	0.5
3	2	6	0.333
4	9	24	0.375

Table of probabilities of success in a trial of n

It turns out that this number oscillates as the group size increases but it looks like it’s converging to somewhere just above 1/3. In my case I really want to know p for a group size of 6 so a computer is needed to brute force that answer. This is of course in the absence of a specific insight on the nature of the ‘Santa sequence’ which would allow me to calculate the valid outcomes without brute force, but more on that in Part 2.

Turns out that in Python you can write an expression to do that in a one-liner but for clarity I’ve split it onto 3 lines.

>> import itertools, numpy, math
>> santa = numpy.array([1,2,3,4,5,6])
>> sum([1 for z in itertools.permutations(santa) if math.prod(z-santa) != 0])
265

What this does is to produce a Python iterable object which has all the different permutations of this array [1, 2, 3, 4, 5, 6]. If we use the order of the array to indicate the pairing (i.e. the first item corresponds to the first santa, the second item to the second and so on) then [1, 2, 3, 4, 5, 6] is clearly the worst draw since every Santa is paired with themselves.

By using itertools permutations then, we can use this abstraction to represent all possible draws of six Secret Santas as a list of arrays (spoiler the length of this list is the same as 6! which is 720).

But how do we check for a valid draw? One cheap way to know if a draw is valid is to subtract the index of each Santa from the index in the permutation given. If the draw is invalid the subtraction will put one or more zeros in the array, and when you find the product of that array it will be zero if it is invalid and non-zero if it is valid.

Doing this gives the new information we need, that there are 265 valid draws with a group of six people. Which is a p of about 0.3681.

Solved it! Now We Have Two Fails.

So now we want to know how many draws are needed to be 99% confident that one of the draws will succeed. We could simply use our simplified Binomial ‘at least one success’ formula to calculate the probability of success after n trials and then stop when we hit 99% but where is the fun in that?

The ScipPy optimiser is the sledge-hammer we need for this nut, it just needs us to rearrange our equation so that the answer we’re looking for would give zero when put into a Python function.

>> from scipy.optimize import fsolve
>> fsolve(lambda x: 1-(1-265/720.)**x-0.99, 1)
array([10.03406063])

So we need about 10 draws to be confident that it will only go wrong one time in a 100. But, when you think about the mechanics of putting together 10 independent Santa draws and stuffing them in the Royal Mail that feels like a lot of work.

Not only that, there is a much bigger chance that I will mess something up when putting the draws together and it results in a failed draw for a purely human reason (as opposed to a mathematical one).

Another Approach

Then I hit on a slightly different approach which is just a step further towards a generalised approach that could allow us to do as many draws as we want, but probably only when the number of Santas is low.

I now know how to generate all the permutations, and how to identify valid drawings easily, and also how many valid drawings there are (i.e. 265). This number is small enough that perhaps the easiest thing to do, in this case, is to send all Santas a personalised list that gives them their pairing in one of the 265 possible valid outcomes. Then all I need to do is to send out those personalised lists via post and email to all the Santas.

On the date of the draw then someone simply selects a random number from 1-265 and everyone reads their Santa off the list. Sure, as the organiser, I could find out who has which pairing but that’s not really a concern because I really don’t want to know. So I just won’t look.

Conclusion

I ran this process as a trial (to make sure everyone ‘got it’) and then we did it for real on 31st October 2021. I think it worked fine, but we might not really know until the gifts are given. Lol.

This puzzle raised a lot of other questions that I haven’t delved into in this post. In the next post I’ll provide a link to the code that generated the draw sheets that I sent out to my fellow Santas (in-case someone wants to use what I’ve done to run their own Secret Santa) but I’ll also delve into the sequence for p to see if we can answer questions like:

What does p converge to?
What if the number of Santas is bigger than 10?

But until then I’ve got some shopping to do. 🙂

Ok, so this is a little bit technical but it’s an intriguing puzzle that got me thinking quite hard. So here’s the problem. Sometimes investors want to be able to judge what the absolute worst case scenario would have been if they’d invested in something. Look at the following random graph of pretend asset prices:

Peak-To-Trough

You’ll see that there are two points on the graph (marked in red) where if you had invested at the first point and pulled out on the second point you would have the worst-case loss. This is the point of this analysis and is a way for investors in the asset to see how bad, ‘bad’ has really been in the past. Clearly past prices are not an indicator of future losses. 🙂

The upper one is the ‘peak’ and the lower one is the ‘trough’. Well, finding these two babys by eye is trivial. To do it reliably (and quickly) on a computer, is not that straight forward. Part of the problem is coming up with a consistent natural language of what you want your peak and trough to be. This took me some time. I believe what I really want is: the largest positive difference of high minus low where the low occurs after the high in time-order. This was the best I could do. This led to the first solution (in Python):


def drawdown(prices):
	maxi = 0
	mini = 0
	for i in range(len(prices))[1:]:
	   maxj = 0
	   max = 0
	   for j in range(i+1, len(prices)):
		if prices[i] - prices[j] > max:
		    maxj = j
		    max = prices[i] - prices[j]
	   if max > prices[maxi] - prices[mini]:
	   	maxi = i
		mini = maxj
	return (prices[maxi], navs[mini])

Now this solution is easy to explain. It’s what I have come to know as a ‘between’ analysis. I don’t know if that’s the proper term but it harks back to the days when I used to be a number-cruncher for some statisticians. The deal is relatively straight-forward: compare the fist item against every item after it in the list and store the largest positive difference. If this difference is also the largest seen in the data-set so far then make it the largest positive difference of all points. At the end you just return the two points you found. This is a natural way to solve the problem because it looks at all possible start points and assesses what the worst outcome would be.

The problem with this solution is that it has quadratic complexity. That is for any data-series of size N the best and worst case will result in N * N-1 iterations, in shorthand this is O(N^2). For small n this doesn’t really matter, but for any decently sized data-series this baby will be slow-as-molasses. The challenge then is to find an O(N) solution to the problem and to save-those-much-needed-cycles for something really important:


def drawdown(prices):
  prevmaxi = 0
  prevmini = 0
  maxi = 0

  for i in range(len(prices))[1:]:
    if prices[i] >= prices[maxi]:
      maxi = i
    else:
      # You can only determine the largest drawdown on a downward price!
      if (prices[maxi] - prices[i]) > (prices[prevmaxi] - prices[prevmini]):
	prevmaxi = maxi
	prevmini = i
      return (prices[prevmaxi], prices[prevmini])

This solution is a bit harder to explain. We move through the prices and the first part of the ‘if’ will find the highest part of the peak so far. However, the second part of the ‘if’ is where the magic happens. If the next value is less than the maximum then we see if this difference is larger than any previously encountered difference, if it is then this is our new peak-to-trough.

The purist in me likes that fact that the O(N) solution looks like easier code to understand than the O(N^2) solution. Although the O(N^2) solution is, I think, an easier concept to grapple with, when it’s translated into code it just doesn’t grok.

Recent Posts

Recent Comments

Archives

Categories