To classify something, find things that are similar and label it with the same class as the most similar thing.
The feature space is \(m\)-dimensional, where \(m\) is the number of descriptive features. Each instance is mapped to a point, with the descriptive features as the axes.
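The idea above can be sketched in a few lines of Python. This is a minimal 1-nearest-neighbor classifier using Euclidean distance; the points and class labels are hypothetical toy data, not from the text.

```python
import math

def nearest_neighbor(query, instances, labels):
    """Return the label of the instance closest to the query (1-NN)."""
    def euclidean(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Index of the training instance nearest to the query point.
    best = min(range(len(instances)), key=lambda i: euclidean(query, instances[i]))
    return labels[best]

# Toy 2-d feature space: two clusters, one per class.
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.5, 4.5)]
classes = ["a", "a", "b", "b"]
print(nearest_neighbor((1.1, 0.9), points, classes))  # -> a
```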
The Similarity Metric
Mathematically, it must satisfy these four criteria:
 Nonnegativity: \(f(a, b) \ge 0\)
 Identity of Indiscernibles: \( f(a, b) = 0 \iff a = b \)
 Symmetry: \( f(a, b) = f(b, a) \)
 Subadditivity (Triangle inequality): \( f(a, b) \le f(a, c) + f(c, b) \)
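The four criteria can be spot-checked numerically. The sketch below verifies them for the Manhattan distance on a handful of sample points (the points are arbitrary, chosen only for illustration); it is a sanity check over a sample, not a proof.

```python
import itertools

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

pts = [(0.0, 0.0), (1.0, 2.0), (-3.0, 4.0), (2.5, -1.5)]
for a, b in itertools.product(pts, repeat=2):
    assert manhattan(a, b) >= 0                 # nonnegativity
    assert (manhattan(a, b) == 0) == (a == b)   # identity of indiscernibles
    assert manhattan(a, b) == manhattan(b, a)   # symmetry
for a, b, c in itertools.product(pts, repeat=3):
    # triangle inequality: going through c is never shorter
    assert manhattan(a, b) <= manhattan(a, c) + manhattan(c, b)
print("all four criteria hold on the sample")
```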
The Minkowski Distance
Where \(a\) and \(b\) are the instances being compared. \(m\) is the number of descriptive features:
$$ f(a, b) = \left( \sum_{i=1}^{m} \left| a[i] - b[i] \right|^{p} \right)^{1/p} $$
Because the differences are raised to the \(p\)th power, higher \(p\) emphasizes features with large differences.
As \( p \to \infty \), the largest absolute difference dominates the sum, so the distance reduces to that single difference. \( p = \infty \) is referred to as the Chebyshev (or Chessboard) Distance.
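A short sketch of the Minkowski distance as a function of \(p\), illustrating how large \(p\) approaches the Chebyshev distance (function names and test points are my own, for illustration):

```python
def minkowski(a, b, p):
    """Minkowski distance with parameter p (p >= 1)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def chebyshev(a, b):
    """The p -> infinity limit: the largest absolute per-feature difference."""
    return max(abs(x - y) for x, y in zip(a, b))

a, b = (0.0, 0.0), (3.0, 4.0)
print(minkowski(a, b, 1))    # 7.0 (Manhattan)
print(minkowski(a, b, 2))    # 5.0 (Euclidean)
print(minkowski(a, b, 100))  # already very close to max(3, 4) = 4
print(chebyshev(a, b))       # 4.0
```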
The Manhattan (or Taxicab) Distance (\( p = 1 \))
$$ f(a, b) = \sum_{i=1}^{m} \left| a[i] - b[i] \right| $$
The Euclidean Distance (\( p = 2\))
It is interesting to note that “Euclidean Distance” is something of a misnomer. All Minkowski distances satisfy the metric criteria and are defined in a flat (Euclidean) geometry; in that sense, the Manhattan Distance is not, strictly speaking, a non-Euclidean distance.
Euclidean space is the de facto geometry in ML by convenience and tradition rather than by design, which can be limiting. For example, hyperbolic space offers a better model for embedding a tree while preserving the distances between nodes, since the number of nodes in a tree grows exponentially with its depth.
$$ f(a, b) = \sqrt{ \sum_{i=1}^{m} \left( a[i] - b[i] \right)^2 } $$
Euclidean is the default, unless one is constrained by computational resources.
Note that the Euclidean distance is more influenced by a single large difference in one feature than a lot of small differences across a set of features. The opposite is true of the Manhattan Distance.
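This sensitivity difference is easy to demonstrate with two hypothetical points at the same Euclidean distance of 3 from the origin being compared against a point whose differences are spread evenly across four features:

```python
import math

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

origin  = (0.0, 0.0, 0.0, 0.0)
one_big = (3.0, 0.0, 0.0, 0.0)  # one large difference in a single feature
spread  = (1.0, 1.0, 1.0, 1.0)  # many small differences across features

print(manhattan(origin, one_big), manhattan(origin, spread))  # 3.0 4.0
print(euclidean(origin, one_big), euclidean(origin, spread))  # 3.0 2.0
```

Manhattan ranks the spread-out point as farther (4.0 vs 3.0), while Euclidean ranks the point with the single large difference as farther (3.0 vs 2.0), so the two metrics can disagree on which neighbor is nearest.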
References

Into the Wild: Machine Learning in Non-Euclidean Spaces. Fred Sala, Ines Chami, Adva Wolf, Albert Gu, Beliz Gunel, Chris Ré. dawn.cs.stanford.edu. Oct 10, 2019. Accessed Dec 18, 2020.
Why are nonnegativity and the triangle inequality important? The similarity metric is best thought of as a measure of “distance”, and distance between two instances obeys the nonnegativity and triangle-inequality conditions.