I've recently been researching HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise), a clustering algorithm that extends the very popular DBSCAN algorithm.

It was introduced in the conference paper "Density-Based Clustering Based on Hierarchical Density Estimates" by Campello, Moulavi, and Sander (PAKDD 2013).

The readthedocs.io page does a great job of explaining how the algorithm works.

## Performance

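HDBSCAN can extract clusters from noisy data: points that do not fall in any sufficiently dense region are labelled as noise rather than forced into a cluster.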

The figures on the right show how well the algorithm works on a difficult clustering problem. Other algorithms are typically not as effective.

## Applications

HDBSCAN has numerous machine learning applications; the examples below give a rough idea of what it can be used for.

HDBSCAN is used for anomaly detection in various domains, including financial accounting.

One of the newer applications of HDBSCAN is de-anonymizing the Bitcoin blockchain.

The paper by Bharath Srivatsan demonstrates how HDBSCAN was applied to trace the theft of 50 BTC.

A figure from the linked paper:

There are many newer anonymization techniques today, such as layer 2 payments (the Lightning Network, or LN), CoinJoin, and the privacy techniques built into some other blockchains. I think LN would defeat this method of de-anonymization, but I'm not sure whether CoinJoin would.