An open N-vertex polygonal curve P in 2-dimensional space is represented as the ordered set of vertices P={p1, …, pN}={(x1, y1), …, (xN, yN)}. The output coarser curve Q consists of (M+1) vertices: Q={q1, …, qM+1}, where the set of vertices qm is a subset of P and M<N. The end points of Q are the end points of P: q1 = p1, qM+1 = pN. The approximation linear segment (qm, qm+1) of Q for curve segment {pi, …, pj} of P is defined by the end points pi and pj: qm = pi and qm+1 = pj. Thus, (qm, qm+1) = (pi, pj).
As it was stated in [147, 119] there are two types of optimization problems connected with polygonal approximation problems:
Min-e problem: Given a polygonal curve P, approximate it by another polygonal curve Q with a given number of line segments M so that the approximation error E(P) is minimized.
Min-# problem: Given a polygonal curve P, approximate it by another polygonal curve Q with the minimum number of segments M so that the approximation error E(P) does not exceed a given maximum tolerance e.
Figure 3.1.1: Distance dk from the point pk to the linear segment (pi, pj).
The distance dk(i, j) from curve vertex pk=(xk, yk) to the corresponding approximation linear segments (pi, pj) is defined as follows (see Fig. 3.1.1):
(3.1.1)
(3.1.2)
The additive error measure Lp for curve segment {pi, pj} is defined by the sum of distances d(k; i, j) for all vertices in the segment as follows:
(3.1.3)
For L¥ the approximation error i s defined as the maximum deviation of input curves from approximation linear segment:
(3.1.4)
Approximation error Ep(P) of the input curve P by Q with additive error measure Lp (where p<¥) is defined as the sum of approximation errors for all segments:
(3.1.5)
For error measure L¥ , the approximation error for the curve P is defined as the maximum of distances for all segments:
(3.1.6)
The error measure L2 (integral square error, or ISE) is perhaps among the most widely used criteria for approximation min-eproblem. The error e2(pi, pj) with measure L2 can be calculated in O(1) time using stored arrays of coordinates cumulatives.
a)
b)In most cases the error measure L¥ is used in algorithms for min-# problem for practical reasons. In [119] different error criteria with measure L¥ are given, including the most popular parallel-strip (or infinite beam) criterion [269], that is maximum distance between the line connecting pi and pj and the points of the curve segment {pi, …, pj} (see Fig. 3.1.2, left). The segment distance (or tolerance zone) criterion is defined as maximum distance between the line segment (pi, pj) and the points of the curve segment {pi, …, pj} (see Fig. 3.1.2, right). In some algorithms, other error measures are in use for min-# problem: integral square error L2 [233, 243], and local integral square error (LISE) [52], or more complicated objective functions [212]. However, using error measure L2 for min-# problem is not efficient in practice, especially in the case of multiple objects, because of its additive nature.
Since optimal algorithms are computationally expensive to be used (usually between O(N2) and O(N3)), faster sub-optimal algorithms have been developed, often running in linear time. In order to evaluate the quality of sub-optimal algorithms, Rosin [224] introduced two measures. Fidelity (F) measures how well the suboptimal polygon fits the curve relative to the optimal polygon in terms of the approximation error. Efficiency measures how compact the suboptimal polygonal presentation of the curve is. They are defined as follows:
where M is the number of segments for an algorithm under question, Mmin is the number of segments for min-# solution, E is approximation error for an approximation algorithm, and Emin is the approximation error of the optimal solution for the min-e problem.
Important problems such as polygonal approximation have been explored very intensively for the last 30 years. The min-# and min-e problems can be solved as an optimization task by dynamic programming or methods based on graph theory. Moreover, many heuristic algorithms have proposed. Optimal algorithms of complexity O(N2)- O(N3) are very slow, whereas faster heuristic algorithms lack of optimality. Thus, development of efficient (fast and optimal or near-optimal) algorithms for large input data is still an open problem.
Introducing into practice more efficient algorithms for min-# problem we can reduce storage demands or transmission time for the same tolerance level. With more efficient algorithms for min-e problem, we can reduce approximation error for the same amount of stored or transmitted data. Taking into account that nowadays polygonal approximation in vectorization, map service, CAD and GIS applications, the efficient solution of this problem is still of great practical importance (see Figs. 3.1.3-3.1.7).
At first we provide a short survey of heuristic algorithms for min-# and min-e approximation problems in Section 3.2. Then we explore optimal algorithms for min-e problems in Section 3.3. To bridge the gap between slow optimal and fast heuristic non-optimal algorithms we introduce paradigms of bounding corridor and iterated reduced search [P4]. Then we study optimal algorithms for min-# problem and present algorithm with joint using of different error measures based on the reduced search [P5]. Furthermore, we consider min-e and min-# problems for the case of closed contours and provide cyclical DP algorithm with analysis of the state space [P6]. Finally, we extend the proposed iterative reduced search approach to the case of min-e problem for multiple objects [P7].
a)
b)
a)
b)
a)
b)
Figure 3.1.5: Min-e approximation
of multiple-object vector data: vector map of "Europe"; 365 objects with 160,000 vertices (a); fragment of the vector map after 20:1 data reduction (b).

Figure 3.1.7: Min-e approximation
of 5000-vertex digitized path in 3-D space by M=100 linear segments.