Originally published February 28, 2021 @ 1:23 pm

In a nutshell, TOPSIS – the Technique for Order of Preference by Similarity to Ideal Solution – seeks out one of many options that is the closest to the ideal option while at the same time is the farthest from the worst possible option.

If you like math then you’re probably already familiar with the concept. If not, then a quick read on Wikipedia should get you up to speed. Or here’s a summary for a second-grader from OpenAI, if you’re feeling particularly lazy today:

A way to decide which of several things is better is to see how close each one is to what you want, and how far away it is from what you don‘t want.OpenAI, 'Summarize for a 2nd grader' model

But we’re not interested in complicated math. What we will be looking at is the Python3 TOPSIS module:

pip3 install topsis-jamesfallon

Let’s say you’re buying a house and you have several options to consider:

So you are looking to get a bigger house for less money, in a safe neighborhood with a decent school and a comfortable ride to work. You also have a dog and certain shopping preferences that are somewhat important to you.

Whatever you see in the “Importance” row of this table is purely subjective. On a decimal scale from 0 to 1, you estimate just how much you think you really care about this or that requirement.

The last row shows whether you’re looking to maximize or minimize the value of a particular requirement. Let’s say, 0 for the Price column means you would like to pay less for the house, obviously, while 1 for the School column shows that you would like to maximize the quality of education for your kids.

Converted to CSV after dropping the column headers and row descriptions, we have a file that looks like this:

248000,2412,1250,14,80,70,65,1.2,40
272000,2623,1430,12,43,75,68,0.6,33
302000,2744,1650,13,30,77,70,1.4,70
340000,3250,1812,14,23,80,73,1.2,67
380000,4200,2100,10,10,84,82,2.1,80
402000,5200,2400,6,20,88,84,0.3,55
0.8,0.6,0.65,0.5,0.6,0.75,0.8,0.6,0.5
0,1,0,0,0,1,0,0,1

The rest of the process is a bit of shell and Python scripting. We convert the CSV into a Shell array and then into an associative Python array. Why not just import the CSV directly into Python? Because I like Bash.

f=/var/tmp/spreadsheet.csv
c=$(
python3 << EOF
from topsis import topsis

$(for i in a w I; do
  case $i in
    a) unset a; q="["; cq="]"; mapfile -t a <<< $(head -n -2 "${f}");;
    w) unset a; unset q cq; mapfile -t a <<< $(sed 'x;$!d' "${f}");;
    I) unset a; unset q cq; mapfile -t a <<< $(tail -n1 "${f}");;
  esac
  echo -n "${i} = $q"
  for j in "${a[@]}"; do
    echo -n "[${j}], "
  done | sed -r "s/, $/$cq\n/g"
done)
decision = topsis(a, w, I)
decision.calc()
print (decision.optimum_choice)
EOF
)
echo -ne "Optimal choice is #${c}: $(sed -n ${c}p "${f}")\n"

The best option according to the TOPSIS algorithm is:

Optimal choice is #5: 380000,4200,2100,10,10,84,82,2.1,80

Not the most expensive option, a quick drive to work, good local shopping. A dog park is a bit of a hike, but you can get someone else to walk the dog. And if you disagree with the solution, you can always fiddle with the “Importance” ratings for your particular requirements.

Good luck with your house hunt!