Alexander Kolesnikov

Polygonal approximation

  1. Introduction
  2. Heuristic algorithms
  3. Optimal algorithms for min-e problem
  4. Iterative reduced search
  5. Optimal algorithms for min-# problem
  6. Approximation of closed contours
  7. Multiple-object polygonal approximation
  8. References
  9. List of publications

3.2 Heuristic algorithms

3.2.1 Survey of solutions

For the last 30 years many heuristic algorithms have been considered for approximation of polygonal curves. We can account about dozen of different heuristic approaches to the problem, and the number of algorithms exceeded one hundred items. In some extent, the existence of big amount and variety of the heuristic approximation algorithms can be explained by the variety o f tasks, curves types, and error measures in use. But more likely the real reason for existence of numerous approximation methods is low fidelity and/or efficiency of the proposed heuristic algorithms [224, 225], which leaves room for improvement.

Let us briefly consider the following approaches proposed for solving approximation problems. Some of the presented methods can be used for solving both problems, but some of the algorithms are designed for min-# or min-e problem only. With algorithm for one of the problems (min-# or min-e) we can get solution for alternative problem in O(logN) steps of binary search [37].

a) Sequential tracing approach

The algorithms [249, 216, 290, 250, 147, 94, 121, 284, 218, 54, 12, 213, 143, 316] use a linear scan to evaluate error conditions, if the conditions are not satisfied a new segment search is started. The main problem of the methods is that the nodes sometimes do not correspond to the corners of the curve because a new vector is defined only when the criterion is violated.

b) Split method

The most widely used high-quality approximation algorithm is a heuristic method called the Douglas-Peucker algorithm [71]. It has been independently invented by many people [207, 72, 24, 21]. The iterative procedure repeatedly splits the curve into smaller and smaller curves until the maximum of the perpendicular distances of the points on the curve from the line segment is smaller than the error tolerance e. The complexity of the method is O(N2) in the worst case, and O(NlogN) on average. The algorithm works in any dimension since it only depends on computing the distance between points and lines. The main disadvantage of this approach is the dependency on the starting point. It also suffers from stressing outliers, see more critics in [279, 280]. Modified version of Douglas-Peucker algorithm was proposed [104, 105] complexity of O(N logN) in the worst case, based on construction of convex hull of 2D point set. Later the result has been improved to O(N log*N) [106]. Unfortunately, the faster algorithm is not general, as it only works with simple 2-D planar curves, and not in higher dimensions.

c) Merge method

A common idea in some algorithms such as Sequential Tracing, Split, Dominant point detection, is to choose curve points to be vertices of the polygonal approximation. It can be done in opposite direction by choosing, at each stage, a curve point that will not be a vertex [157, 78, 294, 33, 281, 146, 198, 112, 150]. A reasonable choice is a point of which elimination will cause minimal increase in the approximation error. The procedure is halted when the desired number of linear segments M, or approximation error, is reached. Result is independent on the starting point. The complexity of the algorithm by Pikaz and Dinstein [198] is O(N logN).

d) Split-and-Merge method

According to the technique, lines are fitted to an initial segmentation of the boundary and the least squares is computed. The procedure then iteratively splits a line if the error is too large and merges two points if the error is too small [192, 10, 294, 189, 103, 215, 300]. This combines the split and merge methods with the same algorithm.

e) Dominant point detection

Attneave [18] indicates that most shape information is contained in the corners (high curvature points), which are able to characterize the contour. To approximate curves using straight lines, high curvature points are the best place at which to break the lines. A large number of heuristic algorithms have been designed on the basis of the idea [220, 223, 86, 59, 236, 35, 9, 284, 17, 79, 263, 62, 10, 76, 299, 210, 211, 237, 296, 14, 319, 56, 115, 89, 120, 234, 235, 251, 23, 173, 293, 162].

f) Relaxation labeling

According to the approach, the left and right slopes and curvature value is measured at every point of the input digital curve. Each point on the curve is associated with an attribute list containing slopes and curvature at the point. The attributes will determine the initial probability of the point pi being a ‘side, and probability of being an ‘angle’. The relaxation process will change the probabilities. As the relaxation process is iterated, certain points became more certain that they are ‘angles’, while other points became more certain that they are ‘side’. Finally, the probabilities converge to some values [60, 61, 227, 166].

g) K-means method

Phillips and Rosenfeld [196] proposed k-means based clustering method to partition the contours into subsets of points such that each subset can be fitted by a straight line. They started with an initial partition and then evaluate the principal axis. In [304] three algorithms have been proposed based on Phillips and Rosenfeld approach for min-e problem with two error measures (L2 and L¥ ). A simpler line fitting method was used instead of the principal axis approach.

i) Genetic (evolutional) algorithms

Genetic algorithms [305-307, 116, 215, 63, 259, 270, 108, 312] are based on stochastic search, which simulates the biological model of evolution [93]. A population is set of chromosomes and an initial one is randomly generated. During each generation, the fitness (approximation error) of each chromosome is evaluated, and chromosomes are selected for reproduction based on their fitness values. Selection operator reproduces some chromosomes with smaller approximation error and eliminates chromosomes with bad approximation. The selected solutions the undergo reproduction under action of the crossover and mutation operations.

j) Ant colony optimization method

Ant Colony Optimization (ACO) is an optimization paradigm that mimics the exploration strategy of a colony of ants [67]: ants can construct the shortest path from their colony to the feeding source and back using pheromone trails. An ant leaves some quantities of pheromone on the ground and marks the path by a trail of this substances. The next ant choose path depending on the amount of pheromone on it and leaves its own pheromone. Vallone [277] considered min-e problem with maximum perimeter of polygon as an optimization criteria. Yin [308] considered min-# problem with error metrics L2, including approximation of closed contours.

k) Tabu search

Tabu search, developed by Glover [91], is one of the meta-heuristic methods that can be used to solve combinatorial optimization problems. It is different from the local search in the sense that tabu search allows moves to a new solution, which makes the objective function worse in hope that it will achieve a better solution in a longer term [Yin’00, Zhang-Guo’01].

l) Vertex adjustment method

The main idea of the vertex adjustment method is to improve preliminary result by a local search [44, 132, 153, 49, 311, 110, 173]. The local search can be applied to vertices successively (vertex-by-vertex) [132, 49], or simultaneously to all vertices using optimization technique [44, 153, 110]. Solution of the optimization task can be found by Viterbi algorithm for the shortest path in a graph [153] or by dynamic programming [44, 110, 173].

In [153], the adjustment of approximation points is performed by Viterbi algorithm for the shortest path in graph. The search is performed among 4-neigbours of the initial approximation points. This technique, however, can be extended to the case where approximation points belong the input curve only.

Chen et al. [44] studied min-eproblem for approximation by circular arcs and line segments they introduced modification to the dynamic programming algorithm for the problem in question. To reduce processing time, they search all possible combinations of the points that are within a given range with respect to an initial set of detected break points, instead of performing complete enumeration. The initial set of break points, {ui, i=1, 2, …, N}, is the solution obtained with heuristic algorithm for dominant (break) point detection [44]. The possible sets of solutions are generated by varying each ui within a range of (uih, ui+h), where h=3. Although this method has been proposed for approximation by circular arcs and lines, the approach can be applied in the case of piecewise linear approximation as well.

In [153] the adjustment of approximation points is performed by Viterbi algorithm for the shortest path in graph. The search is performed among 4-neigbours of the initial approximation points. However this technique can be extended to the case when approximation points belong to the input curve only.

Neumann and Teisseron [173] and Horng [110] used the same approach as Chen et al. [44] to reduce the processing time of dynamic programming algorithm. They performed search of the optimal location of approximation vertices within given range around the current position. Neumann and Teisseron defined the window where the dominant point can be located. Horng defined the window size as 1/3 of the number of the vertices in the curve segment between two correspondent dominant points. In both cases, the set of the initial approximation points {q1, …, qM+1} is defined by algorithms for the dominant points detection.

3.2.2 Summary

The presented heuristic approaches can be divided into two classes from optimality point of view:

  1. Classical algorithms, namely sequential algorithms, split, merge, split-and-merge, dominant points detection, relaxation labeling;
  2. Optimization algorithms: K-means, tabu search, genetic algorithms, ant colony optimization methods, and vertex adjustment methods.

The classical algorithms are mostly based on heuristic methods or approaches. The fidelity of these algorithms is usually low [224] because the global optimization error is not subject to control during the process of the approximation. Some heuristic algorithms can be used as part of optimal algorithms. For example, the cone-intersection method was used in graph theory based algorithms to reduce the complexity of the graph construction procedure. Sallotti [230, 231] applied heuristic (Split) algorithm of Pavlidis [190] to estimate upper bound for approximation error to reduce A*-search in graph.

In optimization algorithms the approximation problem is considered as optimization task where the global approximation error is the main criterion to be controlled. The search of solution that provides minimal approximation error can be performed by stochastic optimization methods (as genetic algorithms and ant colony method) or by local optimization methods (as tabu search and vertex adjustment methods). The initial solution for starting the search can be obtained with some heuristic algorithm for approximation, or any random approximation can be used. Then the initial approximation (or approximations) is improved to find minimum of the global approximation error . Algorithms of this class can provide near-optimal or sometimes optimal results , but the global optimality cannot be guaranteed even in the case of iterative approaches.


Copyright (c) 2003, All rights reserved, Alexander Kolesnikov