文档介绍:A Tutorial on Spectral Clustering
Ulrike von Luxburg
Max Planck Institute for Biological ics
Spemannstr. 38, 72076 T¨ubingen, Germany
ulrike.******@
This article appears in Statistics puting, 17 (4), 2007.
The original publication is available at .
Abstract
In recent years, spectral clustering has e one of the most popular modern clustering
algorithms. It is simple to implement, can be solved efficiently by standard linear algebra software,
and very often outperforms traditional clustering algorithms such as the k-means algorithm. On
the first glance spectral clustering appears slightly mysterious, and it is not obvious to see why
it works at all and what it really does. The goal of this tutorial is to give some intuition on
those questions. We describe different graph Laplacians and their basic properties, present the
mon spectral clustering algorithms, and derive those algorithms from scratch by several
different approaches. Advantages and disadvantages of the different spectral clustering algorithms
are discussed.
Keywords: spectral clustering; graph Laplacian
1 Introduction
Clustering is one of the most widely used techniques for exploratory data analysis, with applications
ranging from statistics, computer science, biology to social sciences or psychology. In virtually every
scientific field dealing with empirical data, people attempt to get a first impression on their data by
trying to identify groups of “similar behavior” in their data. In this article we would like to introduce
the reader to the family of spectral clustering algorithms. Compared to the “traditional algorithms”
such as k-means or single linkage, spectral clustering has many fundamental advantages. Results ob-
tained by spectral clustering often outperform the traditional approaches, spectral clustering is very
simple to implement and can be solved efficiently by standard linear algebra methods.
This tutorial is set up as a self-contained introd