Efficient Large Dimensional Self-Organising Maps with PyTorch | by Mathieu d’Aquin | Dec, 2024


Because it’s fun to self-organise

Towards Data Science

Self-organising maps (or Kohonen maps) are an interesting kind of neural network: they don’t follow the same kind of architecture as other networks, and they are trained quite differently from the usual backpropagation methods. There is a good reason for this: they are meant to be used for unsupervised learning. They are to the usual multi-layer neural networks what K-Means is to SVM. They create clusters; they discretise the data space. But one thing makes them different from other clustering methods: the clusters they create form a map of the data (a grid of clusters), where the distance between two clusters on that map reflects the distance between the average members of those clusters in the data space.
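To make the "grid of clusters" idea concrete, here is a minimal sketch of the classic online Kohonen update in plain PyTorch. It is not the ksom API: the function names `train_som` and `best_matching_units`, the linear decay schedule, the Gaussian neighbourhood and all default values are assumptions chosen for illustration.

```python
import torch


def train_som(data, rows=4, cols=4, epochs=5, lr=0.5, sigma=1.5, seed=0):
    """Train a tiny SOM one sample at a time (illustrative sketch only)."""
    gen = torch.Generator().manual_seed(seed)
    n, _ = data.shape
    # One prototype vector per map cell, initialised from random data points.
    idx = torch.randint(0, n, (rows * cols,), generator=gen)
    weights = data[idx].clone()
    # Grid position (row, col) of every cell, shape (rows * cols, 2).
    ys, xs = torch.meshgrid(torch.arange(rows), torch.arange(cols), indexing="ij")
    grid = torch.stack([ys.flatten(), xs.flatten()], dim=1).float()

    total_steps = epochs * n
    step = 0
    for _ in range(epochs):
        for i in torch.randperm(n, generator=gen):
            x = data[i]
            # Assumed schedule: learning rate and radius shrink linearly.
            frac = 1.0 - step / total_steps
            # Best matching unit: the cell whose prototype is closest to x.
            bmu = torch.argmin(((weights - x) ** 2).sum(dim=1))
            # Gaussian neighbourhood measured on the grid, not in data space.
            grid_dist2 = ((grid - grid[bmu]) ** 2).sum(dim=1)
            radius = sigma * frac + 1e-3
            h = torch.exp(-grid_dist2 / (2 * radius ** 2))
            # Pull the BMU and its grid neighbours towards x.
            weights += (lr * frac) * h.unsqueeze(1) * (x - weights)
            step += 1
    return weights, grid


def best_matching_units(data, weights):
    """Index of the closest map cell for every row of data."""
    return torch.cdist(data, weights).argmin(dim=1)


# Toy data: two well-separated blobs in 50 dimensions.
torch.manual_seed(0)
blob_a = torch.randn(100, 50) * 0.1
blob_b = torch.randn(100, 50) * 0.1 + 5.0
data = torch.cat([blob_a, blob_b])

weights, grid = train_som(data)
cells = best_matching_units(data, weights)
```

Because each update also moves the grid neighbours of the winning cell, cells that are close on the map end up with similar prototypes, which is what turns the clustering into a map. `cells` gives the index of the map cell assigned to each sample, and `grid[cells]` gives its coordinates on the map.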

Because they are slightly atypical, less work has gone into creating efficient implementations of self-organising maps (SOMs) than for other forms of neural networks, in particular when it comes to handling high-dimensional data on GPUs (they are typically used on data with no more than a few dozen features). Too bad, since that is exactly what I needed for a project: fast SOM training on data with thousands of features. I tried existing libraries, including those based on PyTorch, and was not quite satisfied, so I made my own: ksom (admittedly also because it is fun to do, especially as a way to get better at using PyTorch).
