summaryrefslogtreecommitdiff
path: root/src/pages/blog/2021-08-18.md
blob: 941a51bc6a00cf273c1f24a6fafe0657df9c22cc (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
---
layout: ../../layouts/BlogLayout.astro
---
# Python :python: in real life
Recently I had a chance to solve an accounting problem. And as always, Python saved me some time and nerve cells.

## The problem
In accounting you sum things up a lot. And sometimes, you forget what were the terms you used to compute the sum, especially when there are a lot of them. This is exactly the scenario I faced recently:
 - There's a price list for certain items
 - There's a total sum that was made up from **some** of these prices
 - You have to find out what prices were used to calculate total sum

And this problem appears many times, same price list but different totals. I immediately knew I am not gonna do this manually, so first thing I did was creating a new script and writing my first two lines with the input data:
```python
prices = [561.6, 134.24, 561.6, 345.6, 2292, 22.08, 61.2, 87.41, 37.08, 37.8, 25.44, 246]
totals = [271.01, 1078.61, 3333.44]
```

## Thinking
Now, let's analyze the problem a bit. There are **12** prices in the list. Let's suppose our total was made up from 2 prices. It could be any 2 numbers from the list, in fact there are a lot of possible pairs. To be precise, there are **66** of them! Luckily Python has an amazing package `itertools` which allows you to work with combinatorics. We can now list all these pairs:

```python
from itertools import combinations


prices = [561.6, 134.24, 561.6, 345.6, 2292, 22.08, 61.2, 87.41, 37.08, 37.8, 25.44, 246]
pairs = list(combinations(prices, 2))
print(pairs)
```

Output:
```python
[(561.6, 134.24), (561.6, 561.6), (561.6, 345.6), (561.6, 2292), (561.6, 22.08), (561.6, 61.2), (561.6, 87.41), (561.6, 37.08), (561.6, 37.8), (561.6, 25.44), (561.6, 246), (134.24, 561.6), (134.24, 345.6), (134.24, 2292), (134.24, 22.08), (134.24, 61.2), (134.24, 87.41), (134.24, 37.08), (134.24, 37.8), (134.24, 25.44), (134.24, 246), (561.6, 345.6), (561.6, 2292), (561.6, 22.08), (561.6, 61.2), (561.6, 87.41), (561.6, 37.08), (561.6, 37.8), (561.6, 25.44), (561.6, 246), (345.6, 2292), (345.6, 22.08), (345.6, 61.2), (345.6, 87.41), (345.6, 37.08), (345.6, 37.8), (345.6, 25.44), (345.6, 246), (2292, 22.08), (2292, 61.2), (2292, 87.41), (2292, 37.08), (2292, 37.8), (2292, 25.44), (2292, 246), (22.08, 61.2), (22.08, 87.41), (22.08, 37.08), (22.08, 37.8), (22.08, 25.44), (22.08, 246), (61.2, 87.41), (61.2, 37.08), (61.2, 37.8), (61.2, 25.44), (61.2, 246), (87.41, 37.08), (87.41, 37.8), (87.41, 25.44), (87.41, 246), (37.08, 37.8), (37.08, 25.44), (37.08, 246), (37.8, 25.44), (37.8, 246), (25.44, 246)]
```

Here we converted the result of `combinations(prices, 2)` to a `list`, but that's actually only to be able to `print` it. In fact, the original result was a [generator](https://wiki.python.org/moin/Generators) - it wasn't storing the whole list in memory, only the instructions how you should iterate over it. And that's exactly what we want to do - iterate over this list, taking the sum and see if it matches the result, something like this:

```python
for prices_combination in combinations(prices, 2):
    if sum(prices_combination) == total:
        return prices_combination
```

# The solution
Of course, our `total` doesn't have to be a sum of 2 terms, it can be 3, 4, 5 or more. Also, `total` might not exactly equal the sum, there can be a slight difference due to rounding or human error. Let's say this difference is no more than `0.01`. This will be the final solution:
```python
#!/usr/bin/python
from itertools import combinations


prices = [561.6, 134.24, 561.6, 345.6, 2292, 22.08, 61.2, 87.41, 37.08, 37.8, 25.44, 246]
totals = [271.01, 1078.61, 3333.44]

def decompose(total, terms):
    for count in range(1, len(terms)):
        for combination in combinations(terms, count):
            if abs(total - sum(combination)) < 0.01:
                return combination

for total in totals:
    combination = decompose(total, prices)
    print(f'{total} = sum{combination}')
```

Which will produce a nice output in less then **0.03** seconds:
```python
271.01 = sum(22.08, 61.2, 87.41, 37.08, 37.8, 25.44)
1078.61 = sum(561.6, 22.08, 61.2, 87.41, 37.08, 37.8, 25.44, 246)
3333.44 = sum(561.6, 134.24, 345.6, 2292)
```

Notice that the only list stored in memory was the original one. Each combination only appeared in memory when it was needed to calculate the sum, and immediately disappeared after that. That's why it's (relatively) fast!

# Takeaway
Of course this script isn't by any means optimal! Essentially it's a brute force, we are just checking all the possible combinations. But we are doing it in a **clean and efficient way** (credit to generators).

When you are solving real-life problems, you are not in algorithms class and you don't have to write a hardcore algo, you just have to correctly utilize amazing toolkit that this language has. I've spend no more than 5 minutes writing this and I already have the answer to my problem - that's the beauty of Python :python:!

## Now imagine writing and compiling a C++ program for that!
I can only describe it with this image:

![meme](/brainlet-dreams-big-brain.png)

Programming languages are tools, and different situations require different tools.