One of the common demonstrations for k-means clustering is as a pre-processing step for image segmentation, or as an automatic way to perform color quantization.
The natural next step seemed like it would involve using the same techniques on video. I did not find anything online which currently demonstrated how to do this this, so I made my own.
We can start by converting a gif to a series of images, running k-means on each of them, then converting the results back into a gif.
The k-means algorithm starts by randomly picking a fixed number of points as centroids. But when the images are stitched back together into an animation, this has an annoying side-effect where the images appear to flicker.
There’s a fairly easy improvement we can make to reduce this. Instead of starting from random points on each image: start the first image randomly, then seed subsequent iterations with the clusters found in the previous image.
Since the position of items in the frames tend to change more than the items themselves, this makes for a good approximation for where clusters may also be at the next time step. Colors which are similar between frames should remain roughly where they are, whereas less common colors may be updated accordingly.
Making this adjustment resolves the problem, and makes the animation much easier to watch.
I thought it would also be interesting to try different distance metrics (e.g. Manhattan Distance), but in this situation using Manhattan distance does not appear to produce substantially different results.
Gandalf in Green
Let’s see a different wizard. Gandalf the Grey is unusually green here, likely due to the prolific foliage in the background which got mixed with the foreground somewhere between initialization and convergence. By the time that ten centers are used, some of the richer shades of violet start to appear alongside the green.
Thoughts on Transferring Color Palettes
Now that we’ve seen some methods to perform color quantization in an unsupervised way, we might also wonder how easily we can transfer the clusters from one image to another.
One idea might be to run clustering on a target image to find a set of centers, determine their relative frequency in the original image, then use the colors and their relative frequencies to swap out the color palette in a new image (or gif).
This turns out to be a harder problem.
Just because two colors appear with similar relative frequencies in two different images, it does not mean that those shades will make sense in another image. However, this leaves an open problem to try out in the future.