A single embedding of 'n' numbers is enough to describe a human face in such detail that it can be distinguished from the faces of all the other people living on our planet. Surprisingly, 'n' can be relatively small and still preserve identity information well enough that a haircut, face mask, or glasses do not fool a face recognition system built on top of it. The question is how to train a model that produces such universal embeddings. We will explain historical approaches, show hidden pitfalls of metric learning, and present the principles of current SOTA methods.