JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 46, No. 4, AUGUST 1985
Impulsive Optimal Control with Finite or Infinite Time Horizon

A. Blaquière¹
Dedicated to G. Leitmann
Abstract. We consider a dynamical system subjected to feedback optimal control in such a way that the evolution of the state exhibits both sudden jumps and continuous changes. Previously obtained necessary conditions (Ref. 1) for such impulsive optimal feedback controls are generalized to admit the case of infinite time horizon; this generalization permits application to a wider class of problems. The results are illustrated by application to a version of the innkeeper's problem.
Key Words. State-variable discontinuities, minimum principle, optimal control, impulsive control, optimal maintenance.
1. Introduction

In Ref. 1, we consider a dynamical system under the control of an agent J_0 who influences the evolution of the state x = (x_1, ..., x_n) ∈ R^n in some planning period through his choice of a feedback control s in a prescribed control set S_0. The control law is such that the evolution of the state exhibits both sudden jumps and continuous changes. Here, we shall extend the statement of that problem and the corresponding theorem of Ref. 1 so as to include the case where the planning period extends to infinity.
2. Problem Statement

We shall suppose (i) that the state lies in some open subset X of R^n, whose closure we denote by clos X, and (ii) that one of its components, say x_n, is the time t. We shall denote by I an interval of time closed on the
¹ Professor, University of Paris VII, Laboratoire d'Automatique Théorique, Paris, France.
© 1985 Plenum Publishing Corporation
left, and by int I its interior. Let U ⊆ R^{d_1} and M ⊆ R^{d_2} be prescribed nonempty, open sets of points u and μ, respectively. Let K_u and K_μ be prescribed nonempty subsets of U and M, respectively. Let P and Π be the sets of all functions defined on X with range in K_u and K_μ, respectively. Let A be the collection of all closed subsets of X, in the topology of R^n.

Definition 2.1. The control set is S_0 ≜ A × P × Π. In other words, J_0 will influence the evolution of the state through his choice of a closed subset of X, say Y, and a pair of functions defined on X, say (p(·), π(·)) ∈ P × Π.

From now on, we shall consider the two cases below.

Case 1. We shall suppose that J_0 desires to steer the state from a given initial state x^0 ∈ X to a state belonging to a prescribed target set θ ⊂ ∂X.

Case 2. We shall suppose that there is no such target set and that the planning period extends to infinity.

Let f(·): X × U → R^n and g(·): X × M → R^n be prescribed functions of class C^1, with f ≜ (f_1, ..., f_n), g ≜ (g_1, ..., g_n),

f_n(x, u) ≡ 1,   g_n(x, μ) ≡ 0.
Definition 2.2. A feedback control s = (Y, p(·), π(·)) ∈ S_0 is admissible if and only if

x ∈ Y ⇒ x + g(x, π(x)) ∈ X − Y.
Let S be the set of all admissible feedback controls.

Definition 2.3. A function x(·): I → clos X, defined on some interval I ⊂ R, say I = [t_0, t_1] in Case 1 or I = [t_0, +∞) in Case 2, is a path generated by s = (Y, p(·), π(·)) ∈ S from the initial state x^0 ∈ X if and only if:

(i) x(t_0) = x^0;
(ii) x(·) is piecewise continuous on I; let T(I) denote the set of its discontinuity points;
(iii) x(t) = x(t − 0) for t ∈ I, t ≠ t_0;
(iv) t ∈ T(I) ⇒ x(t) ∈ Y and x(t + 0) = x(t) + g(x(t), π(x(t)));
(v) for all t ∈ I − T(I), except possibly at t = t_1 in Case 1, x(t) ∈ X − Y;
(vi) x(·) is differentiable and dx(t)/dt = f(x(t), p(x(t))), for t ∈ I, except on a subset of I at most denumerable.
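Definition 2.3 can be illustrated numerically. The sketch below generates a path by forward-Euler integration of condition (vi), applying the jump of condition (iv) whenever the state enters the jump set Y; the dynamics f, jump map g, control pair, and jump set used here are hypothetical stand-ins, not data from the paper:

```python
# Sketch of path generation per Definition 2.3. The Euler discretization
# and all the concrete functions below are illustrative assumptions.

def generate_path(x0, t0, t1, f, p, g, pi, in_Y, dt=1e-3):
    """Integrate dx/dt = f(x, p(x)); whenever the state enters Y,
    apply the jump x(t+0) = x(t) + g(x(t), pi(x(t)))."""
    t, x = t0, x0
    path, jumps = [(t, x)], []
    while t < t1:
        if in_Y(x):                      # condition (iv): jump out of Y
            x = x + g(x, pi(x))
            jumps.append(t)
        x = x + dt * f(x, p(x))          # condition (vi): continuous motion
        t += dt
        path.append((t, x))
    return path, jumps

# Illustrative data: exponential decay with a reset jump whenever x <= 0.5.
path, jumps = generate_path(
    x0=1.0, t0=0.0, t1=5.0,
    f=lambda x, u: -x, p=lambda x: None,
    g=lambda x, m: m * (1.0 - x), pi=lambda x: 1.0,
    in_Y=lambda x: x <= 0.5)
```

The resulting path is piecewise continuous with a finite set of jump times on [0, 5], as conditions (ii) and (iv) require.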
Definition 2.4. A feedback control s ∈ S is playable at x^0 ∈ X if and only if it generates at least one path x(·): I → clos X from x^0 such that, in Case 1, I is a compact interval, say I = [t_0, t_1], and x(t_1) ∈ θ, or, in Case 2, I = [t_0, +∞). A corresponding triplet {x^0, s, x(·)} is also called playable.

Let f_0(·): X × U → R and g_0(·): X × M → R be prescribed functions of class C^1, and let F(·) ≜ (f_0(·), f(·)) and G(·) ≜ (g_0(·), g(·)). We shall suppose that the cost

V(x^0, s, x(·)) ≜ ∫_I f_0(x(t), p(x(t))) dt + Σ_{t ∈ T(I)} g_0(x(t), π(x(t)))

is defined for all playable triplets {x^0, s, x(·)}.
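As a concrete reading of this cost functional, the sketch below evaluates V numerically for a hypothetical one-jump path: the running cost f_0 is summed along the continuous arc and the jump cost g_0 is added at the single discontinuity (the path, f_0 = exp(−t), and the jump cost 0.2 are illustrative assumptions, not data from the paper):

```python
import math

# Cost V = integral of f0 along the path + sum of g0 over jump times,
# mirroring the definition above. All numerical data are hypothetical.

def cost(f0_samples, dt, jump_costs):
    """Left Riemann sum for the running cost, plus the jump-cost sum."""
    return sum(f0_samples) * dt + sum(jump_costs)

T, n = 3.0, 300_000
dt = T / n
f0_samples = [math.exp(-i * dt) for i in range(n)]   # f0 along the arc
V = cost(f0_samples, dt, jump_costs=[0.2])           # one jump, cost g0 = 0.2
# analytic reference value: (1 - exp(-3)) + 0.2
```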
Definition 2.5. A feedback control s* ∈ S is optimal if and only if, for all x^0 ∈ X, it is playable at x^0 and

V(x^0, s*, x*(·)) ≤ V(x^0, s, x(·)),

for all playable {x^0, s*, x*(·)} and {x^0, s, x(·)}. In view of the uniqueness of the minimum, we can define a function V*(·): X → R such that

V*(x^0) = V(x^0, s*, x*(·)), for all x^0 ∈ X, for all playable {x^0, s*, x*(·)}.
3. Necessary Conditions for an Optimal Feedback Control

Theorem 1.2 of Ref. 1 gives necessary conditions for a feedback control to be optimal in Case 1. Its proof is based on the geometric properties of a family of surfaces in augmented state space {(x_0, x): x_0 ∈ R, x ∈ R^n}. We shall restate that theorem here in a more general form, including Case 2, as Theorem 3.1 below. We first introduce the following assumptions.
Assumption 3.1. There exists an optimal feedback control s* = (Y*, p*(·), π*(·)).

Assumption 3.2. p*(·) and V*(·) are of class C^1 on X − Y*.
Assumption 3.3. If x ∈ ∂Y*, then there exists an open ball B(x) ⊂ X with center x such that p*(·) agrees on (X − Y*) ∩ B(x) with a function, say p̂(·), which is of class C^1 on B(x).
Assumption 3.4. The set Ω(x) ≜ {x + G(x, μ): μ ∈ K_μ} is x_0-directionally convex (see Ref. 1) for all x ∈ Y*.
Assumption 3.5. The matrix

M = M_0 + [∂g_α(x, μ)/∂x_β],   α, β = 0, 1, ..., n,

where M_0 is the (n + 1) × (n + 1) unit matrix, is defined and nonsingular for all x ∈ Y* and μ = π*(x). Indeed, we have

∂g_α(x, μ)/∂x_0 = 0,   ∂g_n(x, μ)/∂x_β = 0,   α, β = 0, 1, ..., n.
Let x*(·): I → clos X be a path generated by the optimal feedback control s* = (Y*, p*(·), π*(·)) on the interval I ⊂ R, and let λ(·): I → R^{n+1} be a piecewise continuous function, with λ(t) ≜ λ(t − 0) for t ∈ I, except at the initial point of I. Let

H(λ, x, u) ≜ Σ_{α=0}^{n−1} λ_α f_α(x, u),   (1)

H_c(x, μ) ≜ Σ_{α=0}^{n−1} λ_α(t_c + 0) g_α(x, μ),   (2)

with λ = (λ_0, λ_1, ..., λ_n) ≜ λ(t), t ∈ I, t_c ∈ T(I).
We say that λ(·) corresponds to s* and x*(·) if and only if, on any subinterval [t_i, t_j] ⊂ I on which x*(·) is continuous, λ(·) is a solution of the set of differential equations

λ̇_α = −∂H(λ, x, u)/∂x_α,   x = x*(t), u = p*(x),   (3)

and, at any point of discontinuity of x*(·), say t_c,

λ_α(t_c) = λ_α(t_c + 0) + ∂H_c(x, μ)/∂x_α,   x = x*(t_c), μ = π*(x).   (4)
Theorem 3.1. If x*(·): I → clos X is a path generated by the optimal feedback control s* = (Y*, p*(·), π*(·)), satisfying Assumptions 3.2-3.5, then there exists a nonzero piecewise continuous vector function λ(·): I → R^{n+1}, corresponding to s* and x*(·), such that:

(i) on any subinterval [t_i, t_j] ⊂ I on which x*(·) is continuous,

min_{u ∈ K_u} H(λ(t), x*(t), u) = H(λ(t), x*(t), p*(x*(t)));

(ii) at any discontinuity point, say t_c, of x*(·),

min_{μ ∈ K_μ} H_c(x*(t_c), μ) = H_c(x*(t_c), π*(x*(t_c)));

(iii) min_{u ∈ K_u} H(λ(t_c + 0), x*(t_c + 0), u) − min_{u ∈ K_u} H(λ(t_c), x*(t_c), u) ≥ ∂H_c(x, μ)/∂x_n,   x = x*(t_c), μ = π*(x);

this inequality becomes an equality for t_c ∈ int I;

(iv) λ_0(t) = 1, for all t ∈ I.
The proof of Theorem 3.1 would go beyond the scope of the present paper. It is based on a straightforward extension of Theorem 1.1 of Ref. 1 to the case of an infinite time horizon. Otherwise, it is analogous to the proof of Theorem 1.2 of Ref. 1. Also, one can easily prove the following theorem.
Theorem 3.2. For x ∈ Y* and for all μ ∈ K̂_μ(x), where

K̂_μ(x) ≜ {μ: μ ∈ K_μ, x + g(x, μ) ∈ X − Y*},

we have

V*(x) ≤ h(x, μ) ≜ g_0(x, μ) + V*(x + g(x, μ)).

Theorem 3.2 is a direct consequence of the definition of optimality of a feedback control and of the additivity of cost. Theorem 3.2 has the following corollary.
Corollary 3.1. For x ∈ Y*, if K̂_μ(x) is convex and

∂H_c(x, μ)/∂μ_α = 0,   α = 1, ..., d_2, for all μ ∈ K̂_μ(x),

then

V*(x) = g_0(x, μ) + V*(x + g(x, μ)),   for all μ ∈ K̂_μ(x).
The proof of Corollary 3.1 is based on the fact that

∂H_c(x, μ)/∂μ_α = ∂h(x, μ)/∂μ_α,   α = 1, ..., d_2, for x ∈ Y* and μ ∈ K̂_μ(x),

and on the fact that, for given x ∈ Y*,

h(x, μ) = const on K̂_μ(x),

due to the convexity of K̂_μ(x).
4. Example

To illustrate the use of Theorems 3.1, 3.2 and of Corollary 3.1, we consider a version of the innkeeper's problem discussed by Case in Ref. 2, pages 281-285. The equations of motion of the state x ≜ (x_1, x_2), with x_2 ≜ t and 0 ≤ x_1 ≤ 1, are

ẋ_1(t) = −k x_1(t),   k > 0,   (5)

x_1(t + 0) = x_1(t) + μ(1 − x_1(t)),   (6)

with μ = π(x(t)) ∈ [0, 1].
The payoff (opposite of the cost) is taken to be

A ∫_{t_0}^{+∞} exp(−pt) x_1(t) dt − C Σ_{i≥1} μ^i exp(−p t^i),   (7)

where p, A, C > 0, where t^1, t^2, ... are the discontinuity points of x_1(·), and where μ^i is the control value at t^i. From (5) and (6), it follows readily that, if 0 ≤ x_1(t_0) ≤ 1, then

0 ≤ x_1(t) ≤ 1,   for all t ∈ I.   (8)
Here, we have

H_c(x, μ) = μ[C exp(−pt) + λ_1(t_c + 0)(1 − x_1)],   (9)

λ̇_1 = A exp(−pt) + k λ_1,   (10)

λ_1(t_c) = (1 − μ) λ_1(t_c + 0),   μ = π*(x).   (11)

Note that (11) follows from (4) and (9), since ∂H_c(x, μ)/∂x_1 = −μ λ_1(t_c + 0).
Here, we shall limit the discussion to the case where there is no jump at the initial time t_0. Let {τ^1, τ^2, ...} denote the set of discontinuity points of x_1*(·). Let t_c ∈ {τ^1, τ^2, ...} and x_1(t_c) = x_1^c. Condition (ii) of Theorem 3.1 becomes

C exp(−p t_c) + λ_1(t_c + 0)(1 − x_1^c) ≤ 0,   (12)

since the converse would imply μ = 0; consequently, on account of (6), one would have x_1*(t_c + 0) = x_1*(t_c), in contradiction with the fact that t_c is a discontinuity point of x_1*(·). Taking account of (6), Condition (iii) of Theorem 3.1 can be written in the simplified form

μ[pC − A(1 − x_1^c) − k λ_1(t_c + 0) exp(p t_c)] = 0.   (13)
Since μ ≠ 0, we deduce from (12) and (13) that

A(1 − x_1^c)² − pC(1 − x_1^c) − kC ≥ 0.   (14)
Since 0 ≤ x_1^c ≤ 1, it follows from (14) that

1 ≥ 1 − x_1^c ≥ 1 − x̄_1,   (15)

where x̄_1 is the smallest root of the left-hand trinomial of (14),

x̄_1 = 1 − (2A)^{−1}[pC + √(p²C² + 4AkC)].   (16)

Equation (16) and the inequality 1 ≥ 1 − x̄_1 in expression (15) impose the condition

A ≥ C(p + k).   (17)
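Relations (14)-(17) are easy to check numerically. The sketch below, with illustrative parameter values A, C, p, k (not taken from the paper), verifies that x̄_1 from (16) annihilates the trinomial in (14) and that condition (17) holds, so that x̄_1 lies in [0, 1):

```python
import math

# Numerical check of Eqs. (14), (16), (17). The parameter values are
# illustrative assumptions chosen to satisfy condition (17).
A, C, p, k = 2.0, 0.5, 0.1, 0.3

# Eq. (16): smallest root of the trinomial in (14)
xbar1 = 1 - (p * C + math.sqrt(p**2 * C**2 + 4 * A * k * C)) / (2 * A)

def trinomial(x):
    # left-hand side of (14), as a function of x1^c
    return A * (1 - x)**2 - p * C * (1 - x) - k * C
```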
In the present example, we have

∂h(x^c, μ)/∂μ = ∂H_c(x^c, μ)/∂μ = C exp(−p t_c) + λ_1(t_c + 0)(1 − x_1^c).

From Condition (ii) of Theorem 3.1, we have

∂h(x^c, μ)/∂μ < 0 ⇒ μ = 1,

∂h(x^c, μ)/∂μ = 0 ⇒ μ undetermined.

Since, in the latter case, K̂_μ(x^c) = (0, 1] is convex and

∂H_c(x^c, μ)/∂μ = 0,   for all μ ∈ (0, 1],

the conclusion of Corollary 3.1 applies. It follows that, in that case, μ = π*(x^c) can be arbitrarily redefined, provided that 0 < μ ≤ 1. Accordingly, we shall let μ = 1 in that case. From (6) and (11), we have
λ_1(t_c) = 0,   μ = π*(x^c) = 1 ⇒ x_1*(t_c + 0) = 1,   t_c = τ^1, τ^2, ....
The integration of (5) and (10) on an interval (τ^i, τ^{i+1}), i ∈ {1, 2, ...}, with the boundary conditions

λ_1(τ^{i+1}) = 0   and   x_1*(τ^i + 0) = 1,

respectively, yields

λ_1(τ^i + 0) = −A(k + p)^{−1} exp(−p τ^i){1 − exp[−(k + p)(τ^{i+1} − τ^i)]},   (18)

x_1*(τ^{i+1}) = exp[−k(τ^{i+1} − τ^i)].   (19)
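The closed form (18) can be verified against the adjoint equation (10). The sketch below fits the general solution of (10) to the boundary condition λ_1(τ^{i+1}) = 0, compares its value at τ^i with (18), and checks the ODE residual by a central difference (parameter values and interval endpoints are illustrative assumptions):

```python
import math

# Verification of Eq. (18): closed-form solution of the adjoint equation
# (10), d(lam1)/dt = A exp(-p t) + k lam1, with lam1(tau_{i+1}) = 0.
# Parameters and interval endpoints below are illustrative.
A, p, k = 2.0, 0.1, 0.3
tau_i, tau_ip1 = 1.0, 2.5

def lam(t):
    # general solution of (10) fitted to lam1(tau_{i+1}) = 0
    return (A / (k + p)) * (math.exp(-(p + k) * tau_ip1 + k * t)
                            - math.exp(-p * t))

# Eq. (18), evaluated at tau_i + 0
lam18 = (-A / (k + p)) * math.exp(-p * tau_i) * (
    1 - math.exp(-(k + p) * (tau_ip1 - tau_i)))

# central-difference check that lam satisfies (10) at an interior point
t, h = 1.7, 1e-5
residual = ((lam(t + h) - lam(t - h)) / (2 * h)
            - (A * math.exp(-p * t) + k * lam(t)))
```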
The substitution of (18) in (13), for t_c = τ^i, results in

x_1*(τ^i) = −pCA^{−1} + (k + p)^{−1}{p + k exp[−(k + p)(τ^{i+1} − τ^i)]},   (20)

where τ^i = τ^1, τ^2, ....
If, as in the problem discussed by Case, one restricts the class of optimal paths to those which satisfy the additional conditions

x_1*(τ^1) = x_1*(τ^2) = ... = x_1^c,

then x_1^c is given by conditions (19) and (20), which result in the equation

x_1^c = −pCA^{−1} + (k + p)^{−1}[p + k(x_1^c)^{(k+p)/k}].   (21)
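Equation (21) is a scalar fixed-point equation for x_1^c. For illustrative parameter values satisfying (17) (assumptions, not data from the paper), the right-hand side has derivative x^{p/k} < 1 on (0, 1), so direct iteration converges; the result can then be checked for consistency against (19) and (20):

```python
import math

# Fixed-point solution of Eq. (21) for the common pre-jump level x1^c,
# plus a consistency check against Eqs. (19)-(20). Parameter values are
# illustrative assumptions satisfying condition (17): A >= C(p + k).
A, C, p, k = 2.0, 0.5, 0.1, 0.3

def rhs(x):
    # right-hand side of Eq. (21)
    return -p * C / A + (p + k * x ** ((k + p) / k)) / (k + p)

x1c = 0.5
for _ in range(200):          # contraction on (0, 1): |d(rhs)/dx| = x**(p/k) < 1
    x1c = rhs(x1c)

# recover the inter-jump interval from (19): x1^c = exp(-k * delta)
delta = -math.log(x1c) / k
# Eq. (20) evaluated with this interval must reproduce x1^c
x20 = -p * C / A + (p + k * math.exp(-(k + p) * delta)) / (k + p)
```

With these numbers the iteration settles near x_1^c ≈ 0.63, inside (0, 1) as required by (8), and (20) reproduces the same value.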
One can easily verify that the approach of Case leads to the same conclusions.
5. Conclusions
This paper is an application of the geometrical approach to optimal control developed by Blaquière and Leitmann. It deals with dynamical systems which are subjected to feedback control by an agent, in such a way that the controlled evolution of the state exhibits both sudden jumps and continuous changes. The agent seeks to maximize (resp., minimize) a given payoff (resp., cost) in some planning period, whence the name impulsive optimal control. This class of problems, discussed from time to time in the literature (see the references below), is met in a number of optimal control models in economics and management. Theorem 3.1 gives necessary conditions for an impulsive feedback control to be optimal when the time horizon of the planning period is finite or infinite. It is complemented by Theorem 3.2 and Corollary 3.1. The use of Theorems 3.1, 3.2 and Corollary 3.1 is illustrated by an application to the innkeeper's problem.
References
1. Blaquière, A., Differential Games with Piecewise Continuous Trajectories, Differential Games and Applications, Edited by P. Hagedorn, H. W. Knobloch, and G. J. Olsder, Springer-Verlag, Berlin, Germany, 1977.
2. Case, J. H., Economics and the Competitive Process, New York University Press, New York, New York, 1979.
3. Vincent, T. L., and Mason, J. D., Disconnected Optimal Trajectories, Journal of Optimization Theory and Applications, Vol. 3, pp. 263-281, 1969.
4. Case, J. H., Impulsively Controlled Differential Games, The Theory and Application of Differential Games, Edited by J. D. Grote, D. Reidel Publishing Company, Dordrecht, The Netherlands, 1975.
5. Blaquière, A., Jeux Différentiels à Deux Joueurs, Somme Nulle, avec Trajectoires Discontinues, Comptes Rendus de l'Académie des Sciences, Série A, Vol. 282, pp. 1047-1049, 1976.
6. Geering, H. P., Continuous-Time Optimal Control Theory for Cost Functionals Including Discrete State Penalty Terms, IEEE Transactions on Automatic Control, Vol. AC-21, pp. 866-869, 1976.
7. Blaquière, A., Necessary and Sufficient Conditions for Optimal Strategies in Impulsive Control, Differential Games and Control Theory III, Edited by P. T. Liu and E. Roxin, Marcel Dekker, New York, New York, 1979.
8. Blaquière, A., Necessary and Sufficient Conditions for Optimal Strategies in Impulsive Control and Applications, New Trends in Dynamic System Theory and Economics, Edited by M. Aoki and A. Marzollo, Academic Press, New York, New York, 1979.
9. Getz, W. M., and Martin, D. H., Optimal Control Systems with State Variable Jump Discontinuities, Journal of Optimization Theory and Applications, Vol. 31, pp. 195-205, 1980.
10. Blaquière, A., and Leitmann, G., On the Geometry of Optimal Processes, Parts I, II, III, University of California, Berkeley, IER Reports Nos. AM-64-10, 1964; AM-65-11, 1965; and AM-66-1, 1966.
11. Blaquière, A., and Leitmann, G., On the Geometry of Optimal Processes, Topics in Optimization, Edited by G. Leitmann, Academic Press, New York, New York, 1967.