Clustering datasets

Image data
Bridge (256x256) [bridge.pgm]
4096 vectors, 16-d
4x4 pixel blocks  ts  txt
4x4 binarized pixel blocks  ts  txt
4x4 pixel blocks: 25% randomly sampled (for training)  ts  txt
4x4 pixel blocks: 75% randomly sampled (for testing)  ts  txt
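
The vector count follows from the block size: tiling a 256x256 image with non-overlapping 4x4 blocks gives (256/4)^2 = 4096 blocks of 16 pixels each. The sketch below illustrates that tiling and a 25%/75% random split. The row-by-row flattening order, the disjoint split, the fixed seeds, and the function names are assumptions made for illustration, not a description of how the published files were produced. The image is random stand-in data.

```python
import random

def image_to_block_vectors(image, block=4):
    """Tile a grayscale image (list of pixel rows) into non-overlapping
    block x block squares and flatten each square row by row."""
    height, width = len(image), len(image[0])
    vectors = []
    for top in range(0, height - block + 1, block):
        for left in range(0, width - block + 1, block):
            vectors.append([image[top + dy][left + dx]
                            for dy in range(block)
                            for dx in range(block)])
    return vectors

def random_split(vectors, train_fraction=0.25, seed=0):
    """Shuffle indices and cut them into disjoint train/test subsets
    (assumed scheme; the source only states the 25%/75% proportions)."""
    indices = list(range(len(vectors)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * train_fraction)
    train = [vectors[i] for i in indices[:cut]]
    test = [vectors[i] for i in indices[cut:]]
    return train, test

# Random stand-in for a 256x256 8-bit image; NOT the real Bridge data.
rng = random.Random(1)
image = [[rng.randrange(256) for _ in range(256)] for _ in range(256)]

vectors = image_to_block_vectors(image)
train, test = random_split(vectors)
print(len(vectors), len(vectors[0]), len(train), len(test))  # 4096 16 1024 3072
```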
House (256x256) [house.ppm]
34112 vectors, 3-d
RGB-values, quantized to 5 bits per color  ts  txt
RGB-values, 8 bits per color  ts  txt
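
Quantizing to 5 bits leaves 32 levels per channel, so at most 32^3 = 32768 distinct colors remain. The sketch below shows one way to do this, by dropping the three least significant bits of each 8-bit channel. Truncation rather than rounding is an assumption here, as are the function name and the sample pixel values.

```python
def quantize_rgb(pixel, bits=5):
    """Keep the `bits` most significant bits of each 8-bit channel."""
    shift = 8 - bits
    return tuple(channel >> shift for channel in pixel)

pixels = [(255, 128, 7), (254, 131, 0), (0, 8, 16)]  # made-up 8-bit RGB values
quantized = [quantize_rgb(p) for p in pixels]
print(quantized)            # [(31, 16, 0), (31, 16, 0), (0, 1, 2)]
print(len(set(quantized)))  # 2
```

Colors that differ only in their low bits collapse to the same quantized vector, as the first two sample pixels do.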
Miss America (360x288) [missa001.pgm]
6480 vectors, 16-d
4x4 pixel blocks from the difference image of frame 1 and 2  ts  txt
4x4 pixel blocks from the difference image of frame 2 and 3  ts  txt
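
A 360x288 frame holds (360/4) x (288/4) = 90 x 72 = 6480 non-overlapping 4x4 blocks, which matches the vector count above. The sketch below illustrates the idea on stand-in frames. The sign convention (later frame minus earlier frame), the absence of any offset or clipping, the row-by-row flattening, and the function names are all assumptions, and the frames are random data rather than the real sequence.

```python
import random

def difference_image(frame_a, frame_b):
    """Pixel-wise difference frame_b - frame_a (sign convention is an assumption)."""
    return [[b - a for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]

def block_vectors(image, block=4):
    """Flatten non-overlapping block x block tiles row by row."""
    height, width = len(image), len(image[0])
    return [[image[top + dy][left + dx]
             for dy in range(block) for dx in range(block)]
            for top in range(0, height - block + 1, block)
            for left in range(0, width - block + 1, block)]

WIDTH, HEIGHT = 360, 288
rng = random.Random(0)
# Stand-in frames: random pixels, then the same pixels with small changes.
frame1 = [[rng.randrange(256) for _ in range(WIDTH)] for _ in range(HEIGHT)]
frame2 = [[min(255, max(0, p + rng.randint(-3, 3))) for p in row]
          for row in frame1]

diff = difference_image(frame1, frame2)
vectors = block_vectors(diff)
print(len(vectors), len(vectors[0]))  # 6480 16
```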
 
Birch-sets

Synthetic 2-d data with 100 000 vectors and 100 clusters.

Zhang et al., "BIRCH: A new data clustering algorithm and its applications", Data Mining and Knowledge Discovery, 1 (2), 141-182, 1997.

 
Birch1: Clusters in regular grid structure  ts  txt
Birch2: Clusters at a sine curve  ts  txt
Birch3: Random sized clusters in random locations  ts  txt
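
To give a feel for the Birch1 layout, the sketch below generates 100 clusters on a regular grid with 100 000 points in total. It is not the original generator: the 10x10 grid, the equal cluster sizes, the Gaussian shape, and the `spacing` and `sigma` values are assumptions chosen only for illustration.

```python
import random

def grid_clusters(grid=10, points_per_cluster=1000,
                  spacing=100.0, sigma=10.0, seed=0):
    """Draw 2-d Gaussian clusters whose centers lie on a grid x grid lattice."""
    rng = random.Random(seed)
    data = []
    for gy in range(grid):
        for gx in range(grid):
            # Center of the current grid cell.
            cx, cy = (gx + 0.5) * spacing, (gy + 0.5) * spacing
            for _ in range(points_per_cluster):
                data.append((rng.gauss(cx, sigma), rng.gauss(cy, sigma)))
    return data

data = grid_clusters()
print(len(data))  # 100000
```

Moving the centers along a sine curve, or drawing random centers and cluster sizes, would give layouts in the spirit of Birch2 and Birch3.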
 
S-sets
Synthetic 2-d data with 5000 vectors and 15 Gaussian clusters with varying degrees of cluster overlap.

P. Fränti and O. Virmajoki, "Iterative shrinking method for clustering problems", Pattern Recognition, 39 (5), 761-765, May 2006.

S1:  ts  txt
S2:  ts  txt
S3:  ts  txt
S4:  ts  txt
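
A minimal sketch of reading one of the txt files into memory is given below. The file layout is an assumption (one vector per line, values separated by whitespace) and should be checked against the downloaded file. The function name is hypothetical and the sample values are made up.

```python
import io

def load_vectors(lines):
    """Parse one vector per line, values separated by whitespace (assumed format)."""
    vectors = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        vectors.append([float(field) for field in fields])
    return vectors

# Made-up sample in the assumed format, standing in for a real file.
sample = io.StringIO("10 20\n30 40\n\n50 60\n")
vectors = load_vectors(sample)
print(len(vectors), vectors[0])  # 3 [10.0, 20.0]
```

The function accepts any iterable of text lines, so an opened file object can be passed in place of the in-memory sample.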

Source and labels:  zip
 
A-sets
Synthetic 2-d data with varying number of clusters and vectors.

A1: 3000 vectors, 20 clusters  ts  txt
A2: 5250 vectors, 35 clusters  ts  txt
A3: 7500 vectors, 50 clusters  ts  txt
   
 
Dim-sets
Synthetic data with Gaussian clusters in multi-dimensional space.
1351 to 10126 vectors, 2-d to 15-d

ts  txt

Related links