Numerical Algorithms 5 (1993) 325-337
Nonlinear Chebyshev fitting from the solution of ordinary differential equations

Jack Williams* and Z. Kalogiratou

Department of Mathematics, University of Manchester, Manchester, M13 9PL, England
The nonlinear Chebyshev approximation of real-valued data is considered where the approximating functions are generated from the solution of parameter dependent initial value problems in ordinary differential equations. A theory for this process applied to the approximation of continuous functions on a continuum is developed by the authors in [17]. This is briefly described and extended to approximation on a discrete set. A much simplified proof of the local Haar condition is given. Some algorithmic details are described along with numerical examples of best approximations computed by the Exchange algorithm and a Gauss-Newton type method.

Keywords: Chebyshev approximation, differential equations, initial value problems, exchange algorithm.

AMS(MOS) subject classifications: primary 41A50, 65D10, 65L05.
1. Introduction

Developments in the theory and practice of nonlinear Chebyshev approximation [3,5,10,11,13] have in practice resulted in essentially only two useful classes of approximating functions being in general use, namely, rationals and exponential sums. Both can be invaluable for obtaining explicit approximations to continuous functions and reasonably accurate data sets (i.e. no wildpoints). In attempting to extend the useful classes of nonlinear approximating functions, the authors in [17] consider best Chebyshev approximation by solutions of parameter dependent initial value problems in ordinary differential equations (ODEs). The approximations are implicitly defined and in general may only be evaluated by some auxiliary numerical scheme for solving the differential equation, for example, a Runge-Kutta scheme. This aspect is not discussed here, since the form of this recovery scheme is application dependent, for example, depending on whether the approximation is needed at a single point or at a dense set of points. The motivation for the ODE approximation process is twofold. Firstly, some
[email protected] 9 J.C. Baltzer AG, Science Publishers
single species processes, for example in the biological sciences, are modelled by time dependent initial value problems. It therefore seems meaningful to approximate data arising from such processes by solutions of differential equations. The standard example here is approximation by exponential sums, regarded as solutions of linear differential equations of appropriate order. Secondly, the exponential case points the way to exploring broader classes of differential equations whose solutions may provide useful nonlinear Chebyshev approximation classes, possibly with desirable shape preserving properties such as monotonicity, convexity or special asymptotic behaviour. In [17] approximation was on $[a, b]$ and approximation on subsets could only be justified for subsets which were sufficiently dense in $[a, b]$ according to the discretization theory of Burke [4]. With the above objectives in mind this paper concentrates on the Chebyshev approximation of arbitrary data sets. In section 2 the Meinardus and Schwedt theory of nonlinear discrete Chebyshev approximation is described. Section 3 describes the ODE approximating functions, and conditions are given on the defining differential equation which ensure that the best approximation is unique and is characterized by the Chebyshev equi-oscillation property (thus enabling the Exchange algorithm to be used for computing a best approximation). A new and much simplified proof of the local Haar condition is given. Section 4 considers particular examples of ODEs and some computational details; numerical results are given in section 5. We observe that in practice there may not be any special need to fit the ODE in the Chebyshev sense; in fact a smoothing ODE, that is, an ODE fitted in the least squares sense, may be more appropriate. The difficulty here is that for general nonlinear $l_2$ fitting there can be several local and global minima [3,7].
For the theory described in this paper the $l_\infty$ fit has the advantage of global uniqueness. However, being with respect to the stronger norm, the $l_\infty$ parameters could serve as starting approximations for fitting a smoothing ODE.
2. Discrete Chebyshev approximation

Let $X = \{a \le x_1 < x_2 < \cdots < x_N \le b\}$ be a discrete set of points, let $Y(x)$ be a real-valued function defined on $X$, and let $V = \{y(p, x) : p \in P \subset \mathbb{R}^n\}$ denote the set of approximating functions. For $r$ defined on $X$ let

$$\|r\|_X = \max_{x \in X} |r(x)|.$$

Then $y(p^*, x)$ is a best Chebyshev approximation to $Y$ on $X$ if

$$\|Y(x) - y(p^*, x)\|_X \le \|Y(x) - y(p, x)\|_X, \quad \forall\, y(p, x) \in V.$$
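To make the discrete setting concrete, the Chebyshev error over $X$ can be evaluated directly; a minimal Python sketch (the helper name `cheb_error` and the two-parameter exponential family below are illustrative choices of ours, not taken from the paper):

```python
import numpy as np

def cheb_error(Y, y, p, X):
    """Discrete Chebyshev error ||Y - y(p, .)||_X = max over x in X of |Y(x) - y(p, x)|."""
    return max(abs(Y[i] - y(p, x)) for i, x in enumerate(X))

# Toy illustration with a two-parameter family y(p, x) = p1 * exp(p2 * x).
X = [0.0, 0.5, 1.0, 1.5, 2.0]
Y = [1.0, 1.6, 2.7, 4.4, 7.5]
y = lambda p, x: p[0] * np.exp(p[1] * x)
err = cheb_error(Y, y, (1.0, 1.0), X)   # largest deviation over the data set
```

The best approximation problem then amounts to minimising `err` over the parameters $p$.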
3". Williams,Z. Kalogiratou / Nonlinear Chebyshevfitting
327
Throughout we assume that $y(p^*, x)$ exists. The theory of Meinardus and Schwedt [10,14] can be used to characterise a best approximation $y(p^*, x)$. We require that $V$ satisfies the following conditions [10] (the conditions may be weakened to apply on $X$ only, but there is no real practical value in this).

Smoothness. Both $y(p, x)$ and $\partial y(p, x)/\partial p_i$, $i = 1, 2, \ldots, n$, are assumed to be continuous for $p \in P$, $x \in [a, b]$, so that for fixed $x \in X$, $p \in P$, $q \in P$,

$$y(p + tq, x) = y(p, x) + t \sum_{i=1}^{n} \frac{\partial y(p, x)}{\partial p_i}\, q_i + o(t)$$
as $t \to 0$.

Local Haar condition. For each $p \in P$, the functions $\partial y(p, x)/\partial p_i$, $i = 1, 2, \ldots, n$, generate a Haar subspace of dimension $d(p) \ge 1$. That is, if the linear span of the functions $\partial y(p, x)/\partial p_i$, $i = 1, 2, \ldots, n$, has dimension $d(p) \ge 1$, then every nonzero element in the span has at most $d(p) - 1$ zeros in $[a, b]$.

Global Haar condition. For each $p \in P$, $y(p, x) \not\equiv y(q, x)$, $q \in P$, implies that $y(p, x) - y(q, x)$ has at most $d(p) - 1$ zeros in $[a, b]$.

With these conditions on $V$, theorem 85 [10, p. 134] (a general sufficient condition for a best approximation on compact $X$) and theorem 89 [10, p. 140] (similarly, a necessary condition which does not require asymptotic convexity on $V$) can now be applied by appropriate modification of the arguments in theorem 91 [10, p. 144] (characterisation on $[a, b]$) to establish the following.

THEOREM 2.1
The approximant $y(p^*, x) \in V$ is the best approximation to $Y \in C[X]$ if and only if there exists an alternant of $s = d(p^*) + 1$ points $t_i \in X$, $a \le t_1 < t_2 < \cdots < t_s \le b$, such that, with $e(x) = Y(x) - y(p^*, x)$,
$$e(t_i) = -e(t_{i+1}), \quad i = 1, 2, \ldots, s - 1, \qquad |e(t_1)| = \|e(x)\|_X.$$

The local and global Haar conditions can now be used as in theorem 92 [10, p. 146] to show the uniqueness of $y(p^*, x)$. The following de la Vallée Poussin type result is useful in practice.

LEMMA 2.1
Let $V$ satisfy the global Haar condition and let $Y(x) - y(p, x)$ alternate in sign on the $s = d(p) + 1$ points $t_i \in X$ arranged in increasing order. Then for any $y(q, x) \in V$, $q \ne p$,
$$\max_i |Y(t_i) - y(q, t_i)| \ge \min_i |Y(t_i) - y(p, t_i)|.$$

Proof
Assume the contrary. Then

$$\max_i |Y(t_i) - y(q, t_i)| < \min_i |Y(t_i) - y(p, t_i)|.$$

Now $Y(t_i) - y(p, t_i) = (-1)^i u_i$, $i = 1, 2, \ldots, s$, for some nonzero $u_i$ of one sign, so that at each $t_i$ the difference $y(q, t_i) - y(p, t_i)$ takes the sign of $(-1)^i u_i$ and therefore alternates in sign on the $s$ points. Hence $y(q, x) - y(p, x)$ has at least $d(p)$ zeros in $[a, b]$, contradicting the global Haar condition. □

With $q = p^*$ there follows the inclusion interval

$$\min_i |Y(t_i) - y(p, t_i)| \le \|Y(x) - y(p^*, x)\|_X \le \|Y(x) - y(p, x)\|_X.$$
This was used to monitor the convergence of the Exchange algorithm (section 5) where y(p, x) is the current approximation.
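The bounds are cheap to evaluate from the current residuals; a small Python sketch of this monitoring test (the function name and sample residual values are ours, for illustration):

```python
import numpy as np

def inclusion_interval(e_ref, e_all):
    """De la Vallee Poussin bounds on the best Chebyshev error:
    min_i |e(t_i)| <= ||Y - y(p*, .)||_X <= ||Y - y(p, .)||_X,
    valid when the residuals e_ref alternate in sign on the reference points."""
    signs = np.sign(e_ref)
    assert np.all(signs[1:] == -signs[:-1]), "reference residuals must alternate"
    return np.min(np.abs(e_ref)), np.max(np.abs(e_all))

# Residuals of a current (non-optimal) approximation on the reference points
# and on the whole data set; convergence is declared when upper/lower is near 1.
lower, upper = inclusion_interval(np.array([0.9, -1.0, 0.95, -0.85]),
                                  np.array([0.9, -1.0, 0.95, -0.85, 0.3, -1.1]))
```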
3. Chebyshev approximation by ODEs

The approximating functions $V$ are defined as the solutions $y(p, x)$ of the initial value problem
$$\frac{dy}{dx} = f(x, y, p), \quad x \in [a, b], \qquad y(a) = p_1, \quad p = (p_1, p_2, \ldots, p_n) \in P \subset \mathbb{R}^n. \tag{3.1}$$
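Since $y(p, x)$ is only available implicitly through (3.1), evaluating it requires a numerical integrator; a sketch using SciPy's Runge-Kutta solver (the linear right-hand side $f$ below is an illustrative choice of ours, picked so the result can be checked against $\exp(x)$, and is not one of the paper's families):

```python
import numpy as np
from scipy.integrate import solve_ivp

def y_of(p, xs, f, a):
    """Evaluate the implicitly defined approximant y(p, x) on the points xs
    by integrating dy/dx = f(x, y, p), y(a) = p[0], with a Runge-Kutta scheme."""
    sol = solve_ivp(lambda x, y: [f(x, y[0], p)], (a, xs[-1]), [p[0]],
                    t_eval=xs, rtol=1e-9, atol=1e-12)
    return sol.y[0]

# Illustrative right-hand side f = p2 + p3*y; with p = (1, 0, 1) this is
# dy/dx = y, y(0) = 1, so the exact solution is exp(x).
f = lambda x, y, p: p[1] + p[2] * y
xs = np.linspace(0.0, 1.0, 5)
vals = y_of((1.0, 0.0, 1.0), xs, f, 0.0)
```

In the paper's computations the integration was done by the general purpose code LSODE; any sufficiently accurate solver plays the same role.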
The case of ordinary polynomial approximation of degree $n - 1$ is $f = \sum_{r=2}^{n} p_r x^{r-2}$. In [17] best approximation from $V$ on $[a, b]$ is treated and sufficient conditions on $f(x, y, p)$ are given which ensure that $V$ satisfies the smoothness and local and global Haar conditions of section 2.

THEOREM 3.1 [17]
Let the initial value problem (3.1) satisfy the following conditions.
(1) $P$ is an open subset of $\mathbb{R}^n$ and for $p \in P$ a unique solution $y(p, x)$ exists which is sufficiently differentiable with respect to $x$.
(2) For $p \in P$, $x \in [a, b]$, the variational equations for $i = 1, 2, \ldots, n$,
$$\frac{d}{dx}\left(\frac{\partial y(p, x)}{\partial p_i}\right) = \frac{\partial f(x, y, p)}{\partial y}\,\frac{\partial y(p, x)}{\partial p_i} + \frac{\partial f(x, y, p)}{\partial p_i},$$

with the initial conditions
$$\frac{\partial y(p, a)}{\partial p_1} = 1, \qquad \frac{\partial y(p, a)}{\partial p_i} = 0, \quad i \ne 1,$$
have a unique solution which is sufficiently differentiable on $[a, b]$.
(3) For $p \in P$, $x \in [a, b]$, $\partial f/\partial p_1 = 0$ and $U(p)$, the space spanned by the functions $\partial f(x, y(p, x), p)/\partial p_i$, $i = 2, \ldots, n$, is a Haar space of dimension $d(p) - 1$.
(4) For $p, q \in P$, $f(x, y(p, x), q) \in C[a, b]$, and for $p \in P$, there exist at most $d(p) - 2$ distinct points $x_j \in [a, b]$ such that

$$f(x_j, y(p, x_j), p) - f(x_j, y(p, x_j), q) = 0,$$
for $q \in P$, where double interior zeros are counted twice.
Then the approximating functions $V$ satisfy the local and global Haar conditions with dimension $d(p)$.

In the proof, given conditions 1 and 2, condition 3 implies the local Haar condition and condition 4 the global Haar condition.

3.1. THE LOCAL HAAR CONDITION
In [17] the proof of the local Haar condition involves a close examination of simple and double zeros; a more straightforward proof is now given with slightly relaxed conditions.

LEMMA 3.1
Let the solution of the variational equations be unique and continuous on $[a, b]$. Let condition 3 of theorem 3.1 be satisfied. Then $V$ satisfies the local Haar condition.
Proof
For $p \in P$, let

$$W(p) = \mathrm{span}\left\{\frac{\partial y(p, x)}{\partial p_i},\ i = 1, 2, \ldots, n\right\}, \qquad U(p) = \mathrm{span}\left\{\frac{\partial f(x, y(p, x), p)}{\partial p_i},\ i = 2, 3, \ldots, n\right\}.$$
From condition 3, $U(p)$ has dimension $d(p) - 1$, from which it can be shown that $W(p)$ has dimension $d(p)$ [17, theorem 3.1]. It remains to be shown that, for $p \in P$, any element (not identically zero) in $W(p)$ given by the continuous function

$$L(p, x) = \sum_{i=1}^{n} \alpha_i\, \frac{\partial y(p, x)}{\partial p_i}$$

has at most $d(p) - 1$ zeros in $[a, b]$. Now for $p \in P$ fixed, let, for $x \in [a, b]$,
$$c(t) = \frac{\partial f(t, y(p, t), p)}{\partial y}, \qquad g(x) = \exp\left(\int_a^x c(t)\, dt\right), \qquad h(x) = \exp\left(-\int_a^x c(t)\, dt\right).$$

Applying an integrating factor to the variational equations gives

$$\frac{\partial y(p, x)}{\partial p_1} = g(x), \qquad \frac{\partial y(p, x)}{\partial p_i} = g(x) \int_a^x h(\tau)\, \frac{\partial f(\tau, y(p, \tau), p)}{\partial p_i}\, d\tau, \quad i = 2, 3, \ldots, n.$$
Defining the continuous function

$$M(p, x) = \sum_{i=2}^{n} \alpha_i\, \frac{\partial f(x, y(p, x), p)}{\partial p_i},$$

and forming linear combinations of these equations yields the explicit expression

$$L(p, x) = g(x)\left(\alpha_1 + \int_a^x h(\tau)\, M(p, \tau)\, d\tau\right).$$

Assume now that $L(p, x)$ has $k \ge d(p)$ zeros in $[a, b]$,

$$a \le x_1 < x_2 < \cdots < x_k \le b.$$

Since $g(x) \ne 0$, at each zero

$$\alpha_1 + \int_a^{x_j} h(\tau)\, M(p, \tau)\, d\tau = L(p, x_j)/g(x_j) = 0, \quad j = 1, 2, \ldots, k.$$

Subtracting successive equations gives

$$\int_{x_j}^{x_{j+1}} h(\tau)\, M(p, \tau)\, d\tau = 0, \quad j = 1, 2, \ldots, k - 1,$$

and since $h(\tau) \ne 0$ for $\tau \in (x_j, x_{j+1})$, there exists $\eta_j \in (x_j, x_{j+1})$ such that $M(p, \eta_j) = 0$. Therefore $M(p, x)$ has $k - 1 \ge d(p) - 1$ distinct zeros in $[a, b]$; this contradicts condition 3. □
Since $V$ satisfies the local and global Haar conditions, it follows from theorem 2.1 that the Exchange algorithm can be used to compute the best approximation. However, the local Haar condition is important in its own right, since it plays an important role in other methods, for example the more generally applicable Gauss-Newton type methods (trust region methods) and Madsen's method [9]. In this approach the discrete Chebyshev problem is treated as the $l_\infty$ solution of an overdetermined system of nonlinear equations,

$$\min_p \|r(p)\|_\infty, \qquad r : \mathbb{R}^n \to \mathbb{R}^N, \qquad r_i(p) = Y(x_i) - y(p, x_i), \quad i = 1, 2, \ldots, N.$$
These methods can only be expected to converge to a local minimum. Their convergence behaviour is dependent on the solution of local linear subproblems [16, p. 214], whose $N \times n$ matrices are given by the Jacobian of the mapping $r$, namely,

$$J(p) = \left(\frac{\partial r_i(p)}{\partial p_j}\right) = \left(-\frac{\partial y(p, x_i)}{\partial p_j}\right).$$
If $y(p, x) \in V$ is such that $d(p) = n$, then for distinct $x_i \in [a, b]$ and $V$ satisfying the local Haar condition (according to lemma 3.1), it follows that the matrix $J(p)$ satisfies the Haar condition. That is, every $n \times n$ submatrix of $J(p)$ is nonsingular. This implies that the linear subproblems have unique solutions and the methods are locally quadratically convergent [16]. A reduction in the dimension of the space $W(p)$ in a neighbourhood of a stationary point $p^*$ degrades the convergence rate to linear, and the methods can converge very slowly.
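The entries of $J(p)$ can be obtained by integrating the variational equations of theorem 3.1 alongside the ODE itself. A hedged Python sketch for the illustrative right-hand side $f = p_2 + p_3 y$ (our choice, picked so the sensitivities have closed forms; the names `augmented` and `jacobian_row` are ours):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Sensitivities for the illustrative family dy/dx = p2 + p3*y, y(a) = p1.
# Augmented state z = (y, dy/dp1, dy/dp2, dy/dp3); the variational equations are
#   (dy/dp_i)' = (df/dy)*(dy/dp_i) + df/dp_i,  dy/dp1(a) = 1, dy/dp_i(a) = 0, i > 1.
def augmented(x, z, p):
    y, s1, s2, s3 = z
    df_dy = p[2]                      # df/dy for f = p2 + p3*y
    return [p[1] + p[2] * y,          # y'
            df_dy * s1,               # (dy/dp1)'  (df/dp1 = 0)
            df_dy * s2 + 1.0,         # (dy/dp2)'  (df/dp2 = 1)
            df_dy * s3 + y]           # (dy/dp3)'  (df/dp3 = y)

def jacobian_row(p, x, a=0.0):
    """Row of J(p) at a data point x: -(dy/dp1, dy/dp2, dy/dp3)."""
    sol = solve_ivp(augmented, (a, x), [p[0], 1.0, 0.0, 0.0],
                    t_eval=[x], args=(p,), rtol=1e-10, atol=1e-12)
    return -sol.y[1:, -1]
```

For $p = (1, 0, 1)$ the exact sensitivities at $x$ are $e^x$, $e^x - 1$ and $x e^x$, which gives a direct check on the integration.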
4. Examples of approximating ODEs

In [17], three forms of ODEs (3.1) are defined which, subject to certain conditions, satisfy theorem 3.1. The derivative forms are:

$$f(x, y, p) = \chi(x) + \sum_{i=2}^{n} p_i \phi_i(y),$$

$$f(x, y, p) = \chi(x) + w(y)\, \frac{p_2 + p_3 y + \cdots + p_{r+2} y^r}{1 + p_{r+3} y + \cdots + p_n y^{n-r-2}},$$

$$f(x, y, p) = \chi(x) + w(y) \sum_{i=1}^{r} p_{i+1} \exp(p_{i+r+1}\, y).$$
The functions $\chi(x)$ and $w(y)$ are specified in each case. The key condition for each ODE is that for $p \in P$, $x \in [a, b]$, the solutions $y(p, x)$ are one-to-one. In practice this essentially restricts the use to the approximation of monotonic data sets. Full
details are in [17], along with each value of $d(p)$, which determines the length $d(p) + 1$ of the alternant in theorem 2.1; numerical examples on dense subsets are also given. Here we consider further the first and computationally simplest class of ODEs,

$$\frac{dy}{dx} = \chi(x) + \sum_{i=2}^{n} p_i \phi_i(y), \quad x \in [a, b], \qquad y(a) = p_1. \tag{4.1}$$
From theorem 3.1, in addition to the solution $y(p, x)$ being one-to-one, we need that for $z \in R$, the range of $y(p, x)$, the functions $\phi_2(z), \phi_3(z), \ldots$ are continuously differentiable and satisfy the Haar condition with dimension $n - 1$; then $d(p) = n$. Hence the best approximation is characterised by an alternant of known length, $n + 1$ (for each of the above rational and exponential forms of $f(x, y, p)$ the value of $d(p)$ is dependent on the actual parameter values $p$). The Haar condition is satisfied with $\phi_i(z) = z^{i-2}$.

4.1. CONSTRAINED APPROXIMATION
By reparameterising (4.1) the signs of $dy/dx$ and $d^2y/dx^2$ can be controlled. Rewriting as

$$\frac{dy}{dx} = \chi(x) + \sum_{i=2}^{n} p_i^2\, \phi_i(y), \quad x \in [a, b], \qquad y(a) = p_1, \tag{4.2}$$

then
$$U(p) = \mathrm{span}\left\{\frac{\partial f(x, y(p, x), p)}{\partial p_i},\ i = 2, 3, \ldots, n\right\} = \mathrm{span}\{2 p_i \phi_i(y),\ i = 2, 3, \ldots, n\}.$$

In practice the simplest case arises for $P$ restricted to $p_i \ne 0$, $i = 2, 3, \ldots, n$. Again, condition 3 of theorem 3.1 is then satisfied with dimension $n - 1$; condition 4 follows easily. Now, let $\chi(x) \equiv 0$. Then

$$\frac{d^2 y}{dx^2} = \frac{\partial f(x, y, p)}{\partial y}\, f(x, y, p), \qquad \frac{\partial f(x, y, p)}{\partial y} = \sum_{i=2}^{n} p_i^2\, \frac{d\phi_i(y)}{dy}.$$

Now, for $p_i \ne 0$, $\phi_i(y) \ne 0$, $i = 2, 3, \ldots, n$, a solution of (4.2) is one-to-one. Two cases of practical interest are as follows:
(1) For $y(x) > 0$, $x \in [a, b]$, $\phi_i(y) = y^{i-2}$, $i = 2, 3, \ldots, n$ $\Rightarrow$ $dy/dx > 0$, $d^2y/dx^2 > 0$.
(2) For $y(x) > 0$, $x \in [a, b]$, $\phi_i(y) = -y^{i-2}$, $i = 2, 3, \ldots, n$ $\Rightarrow$ $dy/dx < 0$, $d^2y/dx^2 > 0$.
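These sign constraints are easy to check numerically: integrating a case (1) instance and differencing the solution should give positive first and second differences. A small sketch (the parameter values are arbitrary illustrations of ours):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Case (1) of the reparameterised family (4.2) with chi(x) = 0 and
# phi_i(y) = y^(i-2), here n = 3:  dy/dx = p2^2 + p3^2 * y,  y(a) = p1 > 0.
# Squaring the coefficients forces dy/dx > 0 and d2y/dx2 > 0 for y > 0.
p = (1.0, 0.7, 0.4)                        # illustrative parameter values
xs = np.linspace(0.0, 2.0, 201)
y = solve_ivp(lambda x, y: [p[1] ** 2 + p[2] ** 2 * y[0]], (0.0, 2.0),
              [p[0]], t_eval=xs, rtol=1e-10, atol=1e-12).y[0]

monotone = bool(np.all(np.diff(y) > 0))    # positive first differences
convex = bool(np.all(np.diff(y, 2) > 0))   # positive second differences
```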
5. Numerical examples
We present three numerical examples. The first two relate to fitting data by solutions of (4.1) in the case

$$\frac{dy}{dx} = \sum_{i=2}^{n} p_i y^{i-2}, \quad x \in [a, b], \qquad y(a) = p_1, \tag{5.1}$$
for which it is known that for one-to-one solutions, the unique best approximation is characterised by an alternant of $n + 1$ points. Two methods were used, the multiple point Exchange algorithm [10,1] and the trust region method of Madsen [9] (conveniently available in the Harwell code VG02A [12]). For the more generally applicable Madsen's method, no assumption is necessary concerning the alternant and so it serves as a method for checking results. However, the fact that the local Haar condition is satisfied is important (see section 3.1). The ODEs were solved using the general purpose code LSODE, available in Fortran source form from [8]. In the following examples the same starting values $p^1$ for the parameters were used in the Exchange algorithm and Madsen's method.

BRAESS' DATA
We approximate the reactor experiment data treated by Braess [2], where
$$X = \{x_i = i - 1,\ i = 1, 2, \ldots, 16\};$$
$$\{10\, Y(x_i)\} = \{860, 603, 491, 407, 341, 288, 242, 205, 174, 148, 128, 109, 96, 82, 71, 62\}.$$

In the $k$th iteration of the Exchange algorithm the levelling equations

$$Y(\xi_i) - y(p, \xi_i) = (-1)^i \lambda, \quad i = 1, 2, \ldots, n + 1,$$

are solved for $p$ and $\lambda$, where $X^k = \{a \le \xi_1 < \xi_2 < \cdots < \xi_{n+1} \le b\}$ is the current reference set.
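The levelling equations form a square nonlinear system in $(p, \lambda)$ which can be handled by a general root finder; a hedged Python sketch for the family (5.1) with $n = 3$ (the reference points and data values below are illustrative, not the paper's, and `fsolve` stands in for whatever Newton-type solver is preferred):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import fsolve

# One levelling step of the Exchange algorithm for the family (5.1) with n = 3:
# dy/dx = p2 + p3*y, y(a) = p1.  Unknowns are (p1, p2, p3, lam).
def y_at(p, xs, a=0.0):
    f = lambda x, y: [p[1] + p[2] * y[0]]
    sol = solve_ivp(f, (a, xs[-1]), [p[0]], t_eval=xs, rtol=1e-12, atol=1e-12)
    return sol.y[0]

def levelling(unknowns, xi, Y_xi):
    *p, lam = unknowns
    y = y_at(np.asarray(p), xi)
    # Levelling equations: Y(xi_i) - y(p, xi_i) = (-1)^i * lam, i = 0, ..., n.
    return [Y_xi[i] - y[i] - (-1) ** i * lam for i in range(len(xi))]

xi = np.array([0.0, 1.0, 2.0, 3.0])      # n + 1 = 4 reference points
Y_xi = np.array([1.1, 2.6, 7.5, 20.0])   # illustrative data on the reference set
sol = fsolve(levelling, x0=[1.0, 0.0, 1.0, 0.1], args=(xi, Y_xi))
p_lev, lam = sol[:3], sol[3]
```

The levelled residual magnitude $|\lambda|$ then supplies the lower bound in the de la Vallée Poussin inclusion interval of lemma 2.1.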
The starting values were:

n = 3: $p^1 = (80, 1, -1)$, $X^1 = (0, 3, 8, 15)$, $\lambda^1 = 5$.
n = 4: $p^1 = (82, -2, 0.1, -0.01)$, $X^1 = (0, 2, 5, 11, 15)$, $\lambda^1 = 1$.
n = 5: $p^1 = (85, 2, -0.5, 0.011, -0.00012)$, $X^1 = (0, 2, 5, 8, 12, 15)$, $\lambda^1 = 0.5$.
Results are given in table 1. The number of Exchange algorithm iterations was 3 for n = 3 and 5, and 4 for n = 4. For n = 3 a plot of the fit and error curve are given in fig. 1.

FRITSCH-CARLSON DATA

This provides an example of monotonic fitting as used by Delbourgo and Gregory [6].

$$X = \{7.99, 8.09, 8.19, 8.7, 9.2, 10, 12, 15, 20\};$$
$$\{Y(x_i)\} = \{0,\ 2.76\mathrm{E}{-5},\ 4.37\mathrm{E}{-2},\ 0.169183,\ 0.469428,\ 0.943740,\ 0.998636,\ 0.999919,\ 0.999994\}.$$

For $n = 4$, and with $p^1 = (0.01, 0.1, 4, -4)$, $X^1 = (7.99, 8.19, 9.2, 12, 20)$, $\lambda^1 = 0.1$, the best approximation has parameter values $p^* = (0.0080041, 0.0485168, 2.9404610, -2.9436238)$, with error $= 0.0163041$ and with extremal set $X^* = (8.09, 8.19, 9.2, 10, 12)$. The fit is shown in fig. 2.

The third example relates to the general form (4.1).

BELLMAN-VARAH DATA [15]

Here, the fitting is from the form

$$\frac{dy}{dx} = p_2 (126.2 - y)(91.9 - y)^2 - p_3 y^2, \quad x \in [1, 40], \qquad y(1) = p_1,$$

for the data
Table 1
Results for Braess' data.

            n = 3          n = 4          n = 5
  p1*       82.6496386     84.7275841     85.6094186
  p2*       2.5800333      -1.7777888     1.7195240
  p3*       -0.2969065     0.0450479      -0.4064946
  p4*                      -0.0047177     0.0101351
  p5*                                     -0.0001338
  error     3.3503614      1.2724159      0.3905814
Fig. 1. Braess' data.
$$X = \{1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 15, 20, 25, 30, 40\};$$
$$\{Y(x_i)\} = \{0, 1.4, 6.3, 10.4, 14.2, 17.6, 21.4, 23, 27, 30.4, 34.4, 38.8, 41.6, 43.5, 45.3\}.$$

The Exchange algorithm and Madsen's method gave identical results, $p^* = (-1.7004409,\ 0.4605147\mathrm{E}{-5},\ 0.2403066\mathrm{E}{-3})$, $X^* = (1, 2, 7, 40)$, with 4 points of equi-oscillation and error 1.7004. In the Exchange algorithm
Fig. 2. Fritsch-Carlson data.
$p^1 = (0.01, 0.00001, 0.002)$, $X^1 = (1, 4, 8, 40)$ and $\lambda^1 = 2$. The plot of the fit and error curve are given in fig. 3.

Interestingly, the theory cannot be shown to be exactly satisfied in this case. With

$$\phi_1(y) = (126.2 - y)(91.9 - y)^2, \qquad \phi_2(y) = -y^2,$$

we consider if the Haar condition is satisfied. That is, the number of zeros of the linear combination $h(y) = \lambda \phi_1(y) + \mu \phi_2(y)$, for $y \in R$, the range of $y$, must not exceed one (counting interior double zeros twice). A simple analysis shows that for $R = [0, 50]$ (which includes the range of the data), $h(y)$ has at most one such zero, but for $R = [-k, 50]$, $k > 0$, it has at most two. The computed approximation $y(p^*, x)$ has range $[-1.7004, 48]$.
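This zero count can be probed numerically by sampling $h(y) = \lambda \phi_1(y) + \mu \phi_2(y)$ on a fine grid and counting sign changes; a rough Python check (a sketch only: grid sign changes lower-bound the true zero count, they do not prove the claim, and the function names are ours):

```python
import numpy as np

# Probe the Haar condition for h(y) = lam*phi1(y) + mu*phi2(y) on R = [0, 50].
phi1 = lambda y: (126.2 - y) * (91.9 - y) ** 2
phi2 = lambda y: -y ** 2

def sign_changes(lam, mu, R=(0.0, 50.0), m=20001):
    ys = np.linspace(R[0], R[1], m)
    s = np.sign(lam * phi1(ys) + mu * phi2(ys))
    s = s[s != 0]                     # ignore exact zeros on the grid
    return int(np.sum(s[1:] != s[:-1]))

rng = np.random.default_rng(0)
worst = max(sign_changes(*rng.standard_normal(2)) for _ in range(200))
```

On $[0, 50]$ the ratio $y^2/\phi_1(y)$ is strictly increasing, so no choice of coefficients should give more than one sign change, consistent with the analysis above.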
6. Concluding remarks

Subject to the conditions of the above theory the Exchange algorithm can be used to compute approximations from ODEs (4.1). Trust region methods can also be used and sometimes may be more effective when only poor starting approximations for the parameters are available. A final computed approximation can be checked via the characterisation theorem to ensure that it is the unique global best approximation and not some other form of stationary point. However, given good starting approximations, the Exchange algorithm performs well, requiring only a few iterations.
Fig. 3. Bellman-Varah data.
References

[1] R.B. Barrar and H.L. Loeb, On the Remez algorithm for nonlinear families, Numer. Math. 15 (1970) 382-391.
[2] D. Braess, Analysis of decay processes and approximation by exponentials, in: Optimal Estimation in Approximation Theory, eds. C.A. Micchelli and T.J. Rivlin (Plenum Press, New York, 1977) pp. 259-266.
[3] D. Braess, Nonlinear Approximation Theory, Springer Series in Computational Mathematics 7 (Springer, Berlin, 1986).
[4] M.E. Burke, Nonlinear best approximation on discrete sets, J. Approx. Theory 16 (1976) 133-141.
[5] E.W. Cheney, Introduction to Approximation Theory (McGraw-Hill, New York, 1966).
[6] R. Delbourgo and J.A. Gregory, The determination of derivative parameters for a monotonic rational quadratic interpolant, IMA J. Numer. Anal. 5 (1985) 397-406.
[7] I. Diener, On nonuniqueness in nonlinear $L_2$ approximation, J. Approx. Theory 51 (1987) 54-67.
[8] J. Dongarra and E. Grosse, Distribution of mathematical software via electronic mail, Commun. ACM 30 (1987) 403-407.
[9] K. Madsen, An algorithm for minimax solution of overdetermined systems of nonlinear equations, J. Inst. Math. Appl. 16 (1975) 321-328.
[10] G. Meinardus, Approximation of Functions: Theory and Numerical Methods (Springer, New York, 1967).
[11] M.J.D. Powell, Approximation Theory and Methods (Cambridge University Press, Cambridge, 1981).
[12] VG02A, Harwell subroutine library, United Kingdom Atomic Energy Authority, Harwell Laboratory, Oxfordshire, England (1988).
[13] J.R. Rice, The Approximation of Functions, Vol. 2: Nonlinear and Multivariate Theory (Addison-Wesley, Reading, MA, 1969).
[14] J. Spies, Uniqueness theorems for nonlinear $L_2$ approximation problems, Computing 11 (1973) 327-355.
[15] J.M. Varah, A spline least squares method for numerical parameter estimation in differential equations, SIAM J. Sci. Statist. Comput. 3 (1982) 28-46.
[16] G.A. Watson, Approximation Theory and Numerical Methods (Wiley, Chichester, 1980).
[17] J. Williams and Z. Kalogiratou, Best Chebyshev approximation from families of ordinary differential equations, IMA J. Numer. Anal. 13 (1993) 383-395.