Similarity-based learning is intuitive and gives people confidence in the model.
Its inductive bias is that instances with similar descriptive features belong to the same class.
Similarity-based learning makes a stationarity assumption, i.e. the joint PDF of the data does not change over time (no new classes appear after training). This assumption is shared with supervised ML in general.
Furthermore, a nearest-neighbour (NN) model can only predict labels that are present in the training set. Ergo, the key question becomes: is your training set representative?
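The point above can be seen in a minimal k-NN sketch (names and data here are illustrative, not from any particular library beyond NumPy): the prediction is always a majority vote over training labels, so no label outside the training set can ever be returned.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify query x by majority vote among its k nearest training instances."""
    # Euclidean distance from the query to every training instance
    dists = np.linalg.norm(X_train - x, axis=1)
    # Labels of the k nearest neighbours
    nearest = y_train[np.argsort(dists)[:k]]
    # Majority vote; the output is necessarily a label seen in training
    values, counts = np.unique(nearest, return_counts=True)
    return values[np.argmax(counts)]

# Toy data: two clusters with labels "a" and "b"
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
y_train = np.array(["a", "a", "b", "b"])
print(knn_predict(X_train, y_train, np.array([0.2, 0.1])))  # → a
```

A query far from both clusters would still be forced into "a" or "b", which is exactly why representativeness of the training set matters.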
Remarkably so. When I think of classifying things, my mind immediately goes to nearest neighbours.