I've recently been researching the Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) algorithm.  It extends the very popular DBSCAN algorithm.

It was introduced in a conference paper titled "Density-Based Clustering Based on Hierarchical Density Estimates" by Campello, Moulavi, and Sander (PAKDD 2013).

The hdbscan library's readthedocs.io page does a great job of explaining how the algorithm works.


HDBSCAN is able to extract clusters from noisy data.
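
To give a feel for why noise is handled well, here is a minimal sketch of the "mutual reachability distance" that the documentation describes as the first step of the algorithm. It is my own illustration, not the library's code: the function names, the toy points, and the choice of k = 2 are all assumptions, and I count the k-th nearest neighbour excluding the point itself (conventions differ on this).

```python
import math

def core_distance(points, i, k):
    """Distance from points[i] to its k-th nearest neighbour (itself excluded)."""
    dists = sorted(math.dist(points[i], p) for j, p in enumerate(points) if j != i)
    return dists[k - 1]

def mutual_reachability(points, i, j, k):
    """max(core_k(i), core_k(j), d(i, j)): sparse points get pushed further away."""
    return max(
        core_distance(points, i, k),
        core_distance(points, j, k),
        math.dist(points[i], points[j]),
    )

# Toy data, made up for illustration: a tight square plus one stray point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 0.0)]
k = 2  # hypothetical neighbourhood size

# Two points inside the dense square keep their ordinary distance.
print(mutual_reachability(points, 0, 1, k))  # 1.0

# The stray point's large core distance stretches its distance to the square
# from 4.0 to about 4.12.
print(mutual_reachability(points, 2, 4, k))
```

Points in dense regions keep their usual distances, while isolated points end up at least their own core distance away from everything else, so they tend to fall out as noise when the hierarchy is built on top of these distances.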

[Figure: _images/comparing_clustering_algorithms_6_0.png]

The figures on the right show how well the algorithm works on a difficult clustering problem.  Other algorithms are typically less effective on data like this.


HDBSCAN has numerous machine learning applications, and the following gives a rough idea of what the algorithm can be used for.

[Figure: _images/comparing_clustering_algorithms_31_0.png]

HDBSCAN is used to detect anomalies in various domains, including financial accounting.

One of the newer applications of HDBSCAN is de-anonymizing the Bitcoin blockchain.

The paper by Bharath Srivatsan demonstrates how HDBSCAN was applied to trace the theft of 50 BTC.

A figure from the linked paper: