So I encountered a problem at work today that I thought was interesting.
It's this: generate a set of size n of integers, such that every subset of the set has a distinct sum. This way, given a sum you can know unambiguously which subset of integers of that set were added together to create that sum.
For example, [1,2,4] has this property: given any sum, you know which subset of [1,2,4] were added to create that sum.
The set [1,1,1] does not have this property: given the sum 1, you don't know which element created that sum.
The set [1,2,3] does not have this property: given the sum 3, it could be the sum of [1,2] or it could be the sum of [3]. So you can't tell unambiguously which subset was added up to get that sum.
So obviously one solution to generate a working set is just powers of 10: [1,10,100,1000,10000,..] So from any sum you can immediately work out which elements were added together to create that sum.
In a similar vein you can use powers of 2, since the numbers 1,2,4,8,16... are just 1,10,100,1000,10000 in binary. So from any sum you can work out which numbers were summed together to create it. I was not the one who came up with this solution, instead my colleague suggested it to me (I asked him the question and he answered with it immediately).
But we can actually do better than powers of 2. That was a surprising revelation for me. This problem is an open problem and it is described here: http://garden.irmacs.sfu.ca/op/sets_with_distinct_subset_sums
The OEIS sequence is here: https://oeis.org/A276661
|
a(0) = 0: {} a(1) = 1: {1} a(2) = 2: {1, 2} a(3) = 4: {1, 2, 4} a(4) = 7: {3, 5, 6, 7} a(5) = 13: {3, 6, 11, 12, 13} a(6) = 24: {11, 17, 20, 22, 23, 24} a(7) = 44: {20, 31, 37, 40, 42, 43, 44} a(8) = 84: {40, 60, 71, 77, 80, 82, 83, 84} a(9) = 161: {77, 117, 137, 148, 154, 157, 159, 160, 161} |
So for the set of size 4, you don't need [1,2,4,8] but in fact [3,5,6,7] also works.
And for the set of size 5 you don't need [1,2,4,8,16], [3,6,11,12,13] also works.
And for the set of size 6 you don't need [1,2,4,8,16,32], [,11,17,20,22,23,24] also works.
It is interesting that the last 3 digits are consecutive: [5,6,7], [11,12,13], [22,23,24], [42,43,44], [82,83,84], [159,160,161] and so on.
In fact the 4th largest is also only 2 away from the 3rd largest.
And the 5th largest is only 3 away from the 4th largest.
Anyway maybe that's only true of this list, I don't know, I haven't really looked into this.
But according to that Open Problem Garden the best lower bound is still exponential (around 0.2 * 2^n), so this method is clearly not that scalable if we're dealing with a large number of elements.
13 is not that much smaller than 16 and 24 is not that much smaller than 32. It's still exponential in the end.
But anyway, for small number of elements it is a tractable problem and powers of 2 are quite easy to work with.
Overall I am pleased to have learned about this problem and to have found a solution for it (even though I didn't come up with the solution myself)!!
I also wrote a Python program to find all the subsets of a given length:
https://github.com/1f604/questions/blob/master/programming%20questions/subset_distinct-sum.py
Output:
$ python subset_distinct-sum.py
[1, 2, 3] Fail. Duplicate: (3,) == (1, 2)
[1, 2, 4] Success!
[1, 2, 4, 7, 9] Fail. Duplicate: (9,) == (2, 7)
[1, 2, 4, 8, 16] Success!
[6, 9, 11, 12, 13] Success!
(11, 17, 20, 22, 23, 24)
No comments:
Post a Comment