Alexander Kolesnikov

Polygonal approximation

  1. Introduction
  2. Heuristic algorithms
  3. Optimal algorithms for min-e problem
  4. Iterative reduced search
  5. Optimal algorithms for min-# problem
  6. Approximation of closed contours
  7. Multiple-object polygonal approximation
  8. References
  9. List of publications

3.1 Introduction

3.1.1 Problem formulation

The general problem of approximation a given two-dimensional piecewise linear curve by another coarser one is of fundamental importance in computer graphics, vectorization tasks, vector map processing (see Figs. 3.1.3-3.1.7).

An open N-vertex polygonal curve P in 2-dimensional space is represented as the ordered set of vertices P={p1, …, pN}={(x1, y1), …, (xNyN)}. The output coarser curve Q consists of (M+1) vertices: Q={q1, …, qM+1}, where the set of vertices qm is a subset of P and M<N. The end points of Q are the end points of P: q1 = p1, qM+= pN. The approximation linear segment (qmqm+1) of Q for curve segment {pi, …, pj} of P is defined by the end points pi and pj: qm = pi and qm+= pj. Thus, (qmqm+1) = (pipj).

As it was stated in [147, 119] there are two types of optimization problems connected with polygonal approximation problems:

Min-e problem: Given a polygonal curve P, approximate it by another polygonal curve Q with a given number of line segments M so that the approximation error E(P) is minimized.

Min-# problem: Given a polygonal curve P, approximate it by another polygonal curve Q with the minimum number of segments M so that the approximation error E(P) does not exceed a given maximum tolerance e.

3.1.2 Error measures

An approximation curve must satisfy some error criterion, which is specified appropriately for each application. In practice, the most of practical error measures in use are based on distance between vertices of the input curve and the approximation linear segments.


Figure 3.1.1: Distance dk from the point pk to the linear segment (pi, pj).

The distance dk(ij) from curve vertex pk=(xkyk) to the corresponding approximation linear segments (pipj) is defined as follows (see Fig. 3.1.1):

(3.1.1)

where the coefficients ai,j and bi,j are defined from the parameters of the linear segment (pipj):

(3.1.2)

The additive error measure Lp for curve segment {pipj} is defined by the sum of distances d(kij) for all vertices in the segment as follows:

(3.1.3)

For L¥ the approximation error i s defined as the maximum deviation of input curves from approximation linear segment:

(3.1.4)

Approximation error Ep(P) of the input curve P by Q with additive error measure Lp (where p<¥) is defined as the sum of approximation errors for all segments:

(3.1.5)

For error measure L¥ , the approximation error for the curve P is defined as the maximum of distances for all segments:

(3.1.6)

The error measure L2 (integral square error, or ISE) is perhaps among the most widely used criteria for approximation min-eproblem. The error e2(pi, pj) with measure L2 can be calculated in O(1) time using stored arrays of coordinates cumulatives.

a) b)
Figure 3.1.2: Error criteria with measure L¥: the parallel-strip (or infinite beam) criterion (a) , and the segment distance (or tolerance zone) criterion (b).

In most cases the error measure L¥ is used in algorithms for min-# problem for practical reasons. In [119] different error criteria with measure L¥ are given, including the most popular parallel-strip (or infinite beam) criterion [269], that is maximum distance between the line connecting pi and pj and the points of the curve segment {pi, …, pj} (see Fig. 3.1.2, left). The segment distance (or tolerance zone) criterion is defined as maximum distance between the line segment (pi, pj) and the points of the curve segment {pi, …, pj} (see Fig. 3.1.2, right). In some algorithms, other error measures are in use for min-# problem: integral square error L2 [233, 243], and local integral square error (LISE) [52], or more complicated objective functions [212]. However, using error measure L2 for min-# problem is not efficient in practice, especially in the case of multiple objects, because of its additive nature.

Since optimal algorithms are computationally expensive to be used (usually between O(N2) and O(N3)), faster sub-optimal algorithms have been developed, often running in linear time. In order to evaluate the quality of sub-optimal algorithms, Rosin [224] introduced two measures. Fidelity (F) measures how well the suboptimal polygon fits the curve relative to the optimal polygon in terms of the approximation error. Efficiency measures how compact the suboptimal polygonal presentation of the curve is. They are defined as follows:

where M is the number of segments for an algorithm under question, Mmin is the number of segments for min-# solution, E is approximation error for an approximation algorithm, and Emin is the approximation error of the optimal solution for the min-e problem.

3.1.3 Motivation

Important problems such as polygonal approximation have been explored very intensively for the last 30 years. The min-# and min-e problems can be solved as an optimization task by dynamic programming or methods based on graph theory. Moreover, many heuristic algorithms have proposed. Optimal algorithms of complexity O(N2)- O(N3) are very slow, whereas faster heuristic algorithms lack of optimality. Thus, development of efficient (fast and optimal or near-optimal) algorithms for large input data is still an open problem.

Introducing into practice more efficient algorithms for min-# problem we can reduce storage demands or transmission time for the same tolerance level. With more efficient algorithms for min-e problem, we can reduce approximation error for the same amount of stored or transmitted data. Taking into account that nowadays polygonal approximation in vectorization, map service, CAD and GIS applications, the efficient solution of this problem is still of great practical importance (see Figs. 3.1.3-3.1.7).

At first we provide a short survey of heuristic algorithms for min-# and min-e approximation problems in Section 3.2. Then we explore optimal algorithms for min-e problems in Section 3.3. To bridge the gap between slow optimal and fast heuristic non-optimal algorithms we introduce paradigms of bounding corridor and iterated reduced search [P4]. Then we study optimal algorithms for min-# problem and present algorithm with joint using of different error measures based on the reduced search [P5]. Furthermore, we consider min-e and min-# problems for the case of closed contours and provide cyclical DP algorithm with analysis of the state space [P6]. Finally, we extend the proposed iterative reduced search approach to the case of min-e problem for multiple objects [P7].

a) b)
Figure 3.1.3: Min-# approximation of digitized curves (skeletons) for raster-to-vector conversion: input raster binary image (a); vector image for error tolerance e=1.5 (b). Vectors are labeled by dots.

a) b)
Figure 3.1.4: Min-# approximation of digitized curves: segmentation data (a); result of approximation for error tolerance e =1.5 (b).

a) b)
Figure 3.1.5: Min-e approximation of multiple-object vector data: vector map of "Europe"; 365 objects with 160,000 vertices (a); fragment of the vector map after 20:1 data reduction (b).

Figure 3.1.6: Min-e approximation of closed 2900-vertex contour "Australia" by M=18 linear segments.


Figure 3.1.7: Min-e approximation of 5000-vertex digitized path in 3-D space by M=100 linear segments.


Copyright (c) 2003, All rights reserved, Alexander Kolesnikov