SCAR: spectral clustering accelerated and robustified

Published in Proceedings of the VLDB Endowment - Volume 15 - Issue 11 - July 2022, 2022

Recommended citation: Ellen Hohma*, Christian M. M. Frey*, Anna Beer*, and Thomas Seidl. 2022. SCAR: spectral clustering accelerated and robustified. Proc. VLDB Endow. 15, 11 (July 2022), 3031–3044. https://doi.org/10.14778/3551793.3551850
https://dl.acm.org/doi/10.14778/3551793.3551850

Abstract

Spectral clustering is one of the most advantageous clustering approaches. However, standard Spectral Clustering is sensitive to noisy input data and has a high runtime complexity. Tackling one of these problems often exacerbates the other. As real-world datasets are often large and compromised by noise, we need to improve both robustness and runtime at once. Thus, we propose Spectral Clustering - Accelerated and Robust (SCAR), an accelerated, robustified spectral clustering method. In an iterative approach, we achieve robustness by separating the data into two latent components: cleansed and noisy data. We accelerate the eigendecomposition - the most time-consuming step - based on the Nyström method. We compare SCAR to related recent state-of-the-art algorithms in extensive experiments. SCAR surpasses its competitors in terms of speed and clustering quality on highly noisy data.