Clustering datasets

| Image data | Size | Contents | Files |
|---|---|---|---|
| Bridge (256x256) | 4096 vectors, 16-d | 4x4 pixel blocks | ts, txt |
| | | 4x4 binarized pixel blocks | ts, txt |
| | | 4x4 pixel blocks: 25% randomly sampled (for training) | ts, txt |
| | | 4x4 pixel blocks: 75% randomly sampled (for testing) | ts, txt |
| House (256x256) | 34112 vectors, 3-d | RGB-values, quantized to 5 bits per color | ts, txt |
| | | RGB-values, 8 bits per color | ts, txt |
| Miss America (360x288) | 6480 vectors, 16-d | 4x4 pixel blocks from the difference image of frame 1 and 2 | ts, txt |
| | | 4x4 pixel blocks from the difference image of frame 2 and 3 | ts, txt |

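
The vector counts for the block-based image sets follow from tiling each image with non-overlapping 4x4 blocks: 256x256 pixels give 64 x 64 = 4096 blocks, and 360x288 pixels give 90 x 72 = 6480 blocks, each flattened to 16 values. The sketch below illustrates that arithmetic, along with one plausible way to quantize 8-bit color channels to 5 bits. The function names `image_to_blocks` and `quantize_rgb` are hypothetical, the stand-in images are synthetic, and the row-major block order and the bit-shift quantization are assumptions rather than the procedure documented for these files.

```python
def image_to_blocks(img, block=4):
    """Split a 2-d image (list of pixel rows) into non-overlapping
    block x block tiles and flatten each tile into one vector."""
    h, w = len(img), len(img[0])
    if h % block or w % block:
        raise ValueError("image dimensions must be multiples of the block size")
    vectors = []
    for top in range(0, h, block):
        for left in range(0, w, block):
            vectors.append([img[top + dy][left + dx]
                            for dy in range(block)
                            for dx in range(block)])
    return vectors


def quantize_rgb(rgb, bits=5):
    """Keep the `bits` most significant bits of each 8-bit channel
    (assumed quantization scheme, for illustration only)."""
    return tuple(channel >> (8 - bits) for channel in rgb)


def make_stand_in(width, height):
    """Synthetic grayscale image; only its dimensions matter here."""
    return [[(x + y) % 256 for x in range(width)] for y in range(height)]


bridge_vectors = image_to_blocks(make_stand_in(256, 256))
missa_vectors = image_to_blocks(make_stand_in(360, 288))

print(len(bridge_vectors), len(bridge_vectors[0]))  # 4096 16
print(len(missa_vectors), len(missa_vectors[0]))    # 6480 16
print(quantize_rgb((255, 128, 7)))                  # (31, 16, 0)
```
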
| Birch-sets | Description | Files |
|---|---|---|
| Birch1 | Clusters in regular grid structure | ts, txt |
| Birch2 | Clusters at a sine curve | ts, txt |
| Birch3 | Random sized clusters in random locations | ts, txt |

Synthetic 2-d data with 100 000 vectors and 100 clusters. Zhang et al., "BIRCH: A new data clustering algorithm and its applications", Data Mining and Knowledge Discovery, 1 (2), 141-182, 1997.

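
To make the difference between the Birch1 and Birch2 layouts concrete, the following sketch places 100 cluster centers either on a regular 10x10 grid or along a sine curve. It is only an illustration of the two layouts, not the generator used for the published sets: the function names and the `spacing`, `amplitude`, and `periods` parameters are hypothetical.

```python
import math


def grid_centers(side=10, spacing=1.0):
    """Cluster centers on a regular side x side grid (Birch1-like layout)."""
    return [(col * spacing, row * spacing)
            for row in range(side)
            for col in range(side)]


def sine_centers(n=100, spacing=1.0, amplitude=5.0, periods=1.0):
    """Cluster centers spread along a sine curve (Birch2-like layout)."""
    return [(i * spacing,
             amplitude * math.sin(2 * math.pi * periods * i / n))
            for i in range(n)]


birch1_like = grid_centers()
birch2_like = sine_centers()
print(len(birch1_like), len(birch2_like))  # 100 100
```
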
| S-sets | Files |
|---|---|
| S1 | ts, txt |
| S2 | ts, txt |
| S3 | ts, txt |
| S4 | ts, txt |
| Source and labels | zip |

Synthetic 2-d data with 5000 vectors and 15 Gaussian clusters with different degrees of cluster overlap. P. Fränti and O. Virmajoki, "Iterative shrinking method for clustering problems", Pattern Recognition, 39 (5), 761-765, May 2006.

| A-sets | Size | Files |
|---|---|---|
| A1 | 3000 vectors, 20 clusters | ts, txt |
| A2 | 5250 vectors, 35 clusters | ts, txt |
| A3 | 7500 vectors, 50 clusters | ts, txt |

Synthetic 2-d data with varying number of clusters and vectors.

| Dim-sets | Size | Files |
|---|---|---|
| Dim2 and higher | 1351-10126 vectors, 2-d - 15-d | ts, txt |

Synthetic data with Gaussian clusters in multi-dimensional space.

| DIM-sets (other) | Vectors | Clusters | Dimensions | Files |
|---|---|---|---|---|
| DIM032 | 1024 | 16 | 32 | ts, txt |
| DIM064 | 1024 | 16 | 64 | ts, txt |
| DIM128 | 1024 | 16 | 128 | ts, txt |
| DIM256 | 1024 | 16 | 256 | ts, txt |
| DIM512 | 1024 | 16 | 512 | ts, txt |
| DIM1024 | 1024 | 16 | 1024 | ts, txt |

| Set | Vectors | Clusters | Dimensions | Description | Files |
|---|---|---|---|---|---|
| KDDCUP04Bio | 145751 | 2000 | 74 | KDDCUP04Bio biology dataset | ts, txt |
| Thyroid | 215 | 2 | 5 | Thyroid dataset | ts, txt |
| Wine | 178 | 3 | 13 | Wine dataset | ts, txt |
| Yeast | 1484 | 10 | 8 | Yeast dataset | Yeast: txt; Yeast_times100: ts, txt |
| Breast | 699 | 2 | 9 | Breast-cancer-Wisconsin dataset | ts, txt |

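
The name Yeast_times100 suggests that the Yeast values were multiplied by 100, presumably to obtain integer-valued vectors. The sketch below shows how such a conversion might look, assuming the txt files hold one vector per line with whitespace-separated values. That file layout, the helper names `parse_vectors` and `scale_to_int`, and the sample values are all assumptions for illustration and are not taken from the actual files.

```python
def parse_vectors(text):
    """Parse one vector per non-empty line, values separated by whitespace
    (assumed layout of the txt files)."""
    return [[float(value) for value in line.split()]
            for line in text.splitlines()
            if line.strip()]


def scale_to_int(vectors, factor=100):
    """Multiply every component by `factor` and round to the nearest integer."""
    return [[round(value * factor) for value in row] for row in vectors]


# Made-up sample rows, not real Yeast data.
sample = "0.58 0.61 0.47\n0.43 0.67 0.48\n"

vectors = parse_vectors(sample)
scaled = scale_to_int(vectors)
print(scaled)  # [[58, 61, 47], [43, 67, 48]]
```
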
| g2 sets | Size | Files |
|---|---|---|
| g2 | 1024 vectors per cluster, 2 clusters, 1-1024 dimensions, variance 10-100 | ts files in zip file (53MB) |

Gaussian clusters dataset. Example set: g2-2-30.

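
As a rough picture of what a g2-style set contains, the sketch below draws two Gaussian clusters of 1024 vectors each in a chosen number of dimensions. It is not the generator behind the published files: the function name `make_g2`, the cluster centers (500 and 600 on every coordinate), and the treatment of the spread parameter as a variance are assumptions made for this example.

```python
import math
import random


def make_g2(dim=2, variance=30.0, per_cluster=1024,
            centers=(500.0, 600.0), seed=0):
    """Draw two isotropic Gaussian clusters in `dim` dimensions.

    Each cluster is centered at the same value on every coordinate
    (hypothetical centers), with the given per-coordinate variance.
    """
    rng = random.Random(seed)
    sd = math.sqrt(variance)
    points, labels = [], []
    for label, center in enumerate(centers):
        for _ in range(per_cluster):
            points.append([rng.gauss(center, sd) for _ in range(dim)])
            labels.append(label)
    return points, labels


points, labels = make_g2(dim=2, variance=30.0)
print(len(points), len(points[0]))  # 2048 2
```

Raising `variance` or `dim` in this sketch changes how strongly the two clusters overlap, which is the property the g2 sets are listed by.
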
![[bridge.pgm]](bridge.png)

![[house.ppm]](house.png)

![[missa001.pgm]](missa.png)