ISSN 0012-2661, Differential Equations, 2012, Vol. 48, No. 11, pp. 1502–1509. © Pleiades Publishing, Ltd., 2012. Original Russian Text © A.B. Kurzhanski, A.I. Mesyats, 2012, published in Differentsial'nye Uravneniya, 2012, Vol. 48, No. 11, pp. 1525–1532.
CONTROL THEORY
Optimal Control of Ellipsoidal Motions

A. B. Kurzhanski and A. I. Mesyats

Moscow State University, Moscow, Russia

Received May 28, 2012
Abstract—We study the target control problem for systems with ellipsoid-valued trajectories admitting reconfiguration of the ellipsoids in the course of motion. We present solutions for linear-convex systems in the class of positional (synthesized) controls under integral-quadratic motion performance criteria. We use Hamiltonian formalism methods, including the dynamic programming equations for such systems.

DOI: 10.1134/S0012266112110080
1. INTRODUCTION

In the present paper, we consider controlled differential systems with set-valued trajectories whose values are nondegenerate ellipsoids. For such systems, we pose target control problems that admit solutions in the class of synthesized (positional) strategies. In these problems, not only the motions of the centers of the ellipsoids but also their configuration, under possible constraints on their size, should be controlled. In particular, we consider systems of linear original structure with integral ("soft") motion performance criteria. We suggest criteria imposing exterior and interior bounds on the size of the ellipsoidal trajectory tubes in the form of a requirement that the tubes lie between given exterior and interior constraining cylinders whose bases are ellipsoids or disks. These problem statements necessitate matrix equations of motion [1] and the corresponding optimality conditions; part of the controls may then be represented in matrix form as well. The solution methods are based on computing the cost function with the use of Hamiltonian dynamic programming equations of the corresponding dimension. Such statements are motivated by control problems with genuinely uncertain information on the motion parameters [2, 3]. They are also important in problems of collective ("group") control of a given set of isolated motions, where the ellipsoids serve as a virtual container for the elements of the set.

2. STATEMENT OF THE PROBLEMS

On the time interval $[t_0, \theta]$, consider the system
$$\dot q(t) = A(t)q(t) + B_q(t)u(t), \quad q(t_0) = q_0, \qquad \dot Q(t) = T(t)Q(t) + Q(t)T'(t) + B(t)U(t)B'(t), \quad Q(t_0) = Q_0, \tag{1}$$
where $q \in \mathbb{R}^n$ and $Q \in \mathbb{R}^{n \times n}$ are the state variables, $u \in \mathbb{R}^{m_1}$ and $U \in \mathbb{R}^{m_2 \times m_2}$ are the controls, and the matrix parameters $A, T \in \mathbb{R}^{n \times n}$, $B_q \in \mathbb{R}^{n \times m_1}$, and $B \in \mathbb{R}^{n \times m_2}$ are assumed to be continuously differentiable. The prime stands for transposition. The problem is to find positional controls $u(t,q)$ and $U(t,Q)$ minimizing the functional
$$\Psi(u(\cdot), U(\cdot)) = \int_{t_0}^{\theta} \big( \langle U(t,Q(t)), U(t,Q(t)) \rangle + \langle u(t,q(t)), u(t,q(t)) \rangle \big)\,dt + \langle q(\theta)-m, D(q(\theta)-m) \rangle + \langle Q(\theta)-M, \mathbf{D}(Q(\theta)-M) \rangle \tag{2}$$
on the trajectories of system (1). Here $D = D' > 0$ and $\mathbf{D} = \mathbf{D}' > 0$. The inner product of matrices $A$ and $B$ is defined as $\langle A, B \rangle = \operatorname{tr}(B'A)$ and induces the Frobenius norm on $\mathbb{R}^{n \times n}$ [4, pp. 239–242]. Problem (1), (2) is also considered under the additional state constraints
$$\langle Q, Q \rangle \le \lambda_+^2, \quad \lambda_+ > 0, \tag{3}$$
$$\langle Q, Q \rangle \ge \lambda_-^2, \quad \lambda_+ > \lambda_- > 0. \tag{4}$$
Note that inequality (3) defines a convex constraint, while inequality (4) defines a constraint complementary to a convex one. These inequalities restrict the possible size of an ellipsoid with configuration matrix $Q$. Condition (3) implies that the ellipsoid lies in a ball of radius $\lambda_+/n$, and condition (4) implies that no ball of radius less than $\lambda_-/n$ contains the ellipsoid. These radii can be determined from the equivalence of the Frobenius and spectral norms.

3. SOLUTION IN THE ABSENCE OF STATE CONSTRAINTS

To solve the problem, we rewrite the second equation in (1) in vector form by using the Kronecker product [5, p. 264 of the Russian translation]. By $\mathbf{Q}$ we denote the arrangement of a matrix $Q$ row by row into a column vector; for $A = [a_{ij}]_{i,j=1}^{n}$,
$$\mathbf{A} = [a_{11}, a_{12}, \ldots, a_{1n}, a_{21}, \ldots, a_{n1}, a_{n2}, \ldots, a_{nn}]'.$$
By using the identity $\mathbf{AXB} = (A \otimes B')\mathbf{X}$, where $\otimes$ is the Kronecker (tensor) product of matrices, we obtain the relation
$$\dot{\mathbf{Q}} = (T \otimes I)\mathbf{Q} + (I \otimes T)\mathbf{Q} + (B \otimes B)\mathbf{U} \equiv \mathcal{A}\mathbf{Q} + \mathcal{B}\mathbf{U}.$$
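The vectorization identity can be checked numerically. The following sketch (not part of the paper; it uses NumPy's C-order `flatten`, which coincides with the row-by-row vectorization used here) verifies the identity and the resulting vector form of the $Q$-equation on random matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
I = np.eye(n)
A, X, B = (rng.standard_normal((n, n)) for _ in range(3))

# vec(A X B) = (A x B') vec(X) for the row-by-row vectorization,
# which coincides with NumPy's default C-order flatten().
assert np.allclose((A @ X @ B).flatten(), np.kron(A, B.T) @ X.flatten())

# The same identity vectorizes the Q-equation in (1):
# vec(TQ + QT' + BUB') = (T x I + I x T) vec(Q) + (B x B) vec(U).
T, Q, U = (rng.standard_normal((n, n)) for _ in range(3))
lhs = (T @ Q + Q @ T.T + B @ U @ B.T).flatten()
rhs = (np.kron(T, I) + np.kron(I, T)) @ Q.flatten() + np.kron(B, B) @ U.flatten()
assert np.allclose(lhs, rhs)
```

Note that for column-wise vectorization the roles of the factors would be reversed; the row-wise convention is what makes $\mathcal{A} = T \otimes I + I \otimes T$ and $\mathcal{B} = B \otimes B$.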
Then the functional (2) can be represented in the form
$$\Psi(u(\cdot), U(\cdot)) = \int_{t_0}^{\theta} \big( \langle \mathbf{U}(t,\mathbf{Q}(t)), \mathbf{U}(t,\mathbf{Q}(t)) \rangle + \langle u(t,q(t)), u(t,q(t)) \rangle \big)\,dt + \langle q(\theta)-m, D(q(\theta)-m) \rangle + \langle \mathbf{Q}(\theta), (\mathbf{D} \otimes I)\mathbf{Q}(\theta) \rangle - 2\langle \mathbf{Q}(\theta), (\mathbf{D} \otimes I)\mathbf{M} \rangle + \langle M, \mathbf{D}M \rangle.$$
We solve the problem by the dynamic programming method. A position of the system is understood as a triple $\{t, q, Q\}$ in the space $[t_0, \theta] \times \mathbb{R}^n \times \mathbb{R}^{n \times n}$. We introduce the cost function
$$V(t_0, q_0, Q_0) = \min_{u(\cdot),\,U(\cdot)} \{\Psi(u(\cdot), U(\cdot)) \mid q(t_0) = q_0,\ Q(t_0) = Q_0\}.$$
This function is a solution of the Hamilton–Jacobi–Bellman equation [2]
$$\frac{\partial V}{\partial t} + \min_{u,U}\left\{ \left\langle \frac{\partial V}{\partial q}, Aq + B_q u \right\rangle + \left\langle \frac{\partial V}{\partial \mathbf{Q}}, \mathcal{A}\mathbf{Q} + \mathcal{B}\mathbf{U} \right\rangle + \langle \mathbf{U}, \mathbf{U} \rangle + \langle u, u \rangle \right\} = 0 \tag{5}$$
with the terminal condition
$$V(\theta, q, \mathbf{Q}) = \langle q - m, D(q-m) \rangle + \langle \mathbf{Q}, (\mathbf{D} \otimes I)\mathbf{Q} \rangle - 2\langle \mathbf{Q}, (\mathbf{D} \otimes I)\mathbf{M} \rangle + \langle M, \mathbf{D}M \rangle. \tag{6}$$
The minimum in (5) is attained at the optimal controls
$$u = -\frac{1}{2} B_q' \frac{\partial V}{\partial q}, \qquad \mathbf{U} = -\frac{1}{2} \mathcal{B}' \frac{\partial V}{\partial \mathbf{Q}}. \tag{7}$$
By substituting the controls (7) into Eq. (5), we obtain the relation
$$\frac{\partial V}{\partial t} + \left\langle \frac{\partial V}{\partial q}, Aq \right\rangle + \left\langle \frac{\partial V}{\partial \mathbf{Q}}, \mathcal{A}\mathbf{Q} \right\rangle - \frac{1}{4}\left\langle \frac{\partial V}{\partial q}, B_q B_q' \frac{\partial V}{\partial q} \right\rangle - \frac{1}{4}\left\langle \frac{\partial V}{\partial \mathbf{Q}}, \mathcal{B}\mathcal{B}' \frac{\partial V}{\partial \mathbf{Q}} \right\rangle = 0.$$
The resulting expression is a quadratic form in the state coordinates and the spatial derivatives of the cost function. This permits one to seek the cost function in quadratic form as well,
$$V(t, q, \mathbf{Q}) = \langle q, P(t)q \rangle + \langle \mathbf{Q}, \mathcal{P}(t)\mathbf{Q} \rangle + \langle q, k(t) \rangle + \langle \mathbf{Q}, \varkappa(t) \rangle + \gamma(t). \tag{8}$$
By substituting this quadratic form into the equation and matching the coefficients of like powers of $q$ and $\mathbf{Q}$, we obtain the system of equations
$$\dot P + A'P + PA - P B_q B_q' P = 0, \quad P(\theta) = D, \tag{9}$$
$$\dot k + A'k - P B_q B_q' k = 0, \quad k(\theta) = -2Dm, \tag{10}$$
$$\dot{\mathcal{P}} + \mathcal{A}'\mathcal{P} + \mathcal{P}\mathcal{A} - \mathcal{P}\mathcal{B}\mathcal{B}'\mathcal{P} = 0, \quad \mathcal{P}(\theta) = \mathbf{D} \otimes I, \tag{11}$$
$$\dot\varkappa + \mathcal{A}'\varkappa - \mathcal{P}\mathcal{B}\mathcal{B}'\varkappa = 0, \quad \varkappa(\theta) = -2(\mathbf{D} \otimes I)\mathbf{M}, \tag{12}$$
$$\dot\gamma - \frac{1}{4}\big( \langle k, B_q B_q' k \rangle + \langle \varkappa, \mathcal{B}\mathcal{B}'\varkappa \rangle \big) = 0, \quad \gamma(\theta) = \langle m, Dm \rangle + \langle M, \mathbf{D}M \rangle. \tag{13}$$
Returning to the original notation, we rewrite Eqs. (11)–(13) in the form
$$\dot{\mathcal{P}} + (T' \otimes I + I \otimes T')\mathcal{P} + \mathcal{P}(T \otimes I + I \otimes T) - \mathcal{P}(BB' \otimes BB')\mathcal{P} = 0, \quad \mathcal{P}(\theta) = \mathbf{D} \otimes I,$$
$$\dot\varkappa + (T' \otimes I + I \otimes T')\varkappa - \mathcal{P}(BB' \otimes BB')\varkappa = 0, \quad \varkappa(\theta) = -2(\mathbf{D} \otimes I)\mathbf{M}, \tag{14}$$
$$\dot\gamma - \frac{1}{4}\big( \langle k, B_q B_q' k \rangle + \langle \varkappa, (BB' \otimes BB')\varkappa \rangle \big) = 0, \quad \gamma(\theta) = \langle m, Dm \rangle + \langle M, \mathbf{D}M \rangle.$$
We have thereby obtained equations for the parameters of the form (8). Under the assumed continuous differentiability of the system parameters, these equations determine a classical solution of problem (5), (6), and this solution is unique. We have thereby proved the following assertion.

Theorem 1. The cost function (8), with parameters determined by system (14), specifies the solution of the optimization problem (1), (2). The synthesizing optimal controls $u(t,q)$ and $\mathbf{U}(t,\mathbf{Q})$ are then given by (7).
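For illustration, the matrix Riccati equation (9) can be integrated backward from its terminal condition by a standard Runge–Kutta scheme. The sketch below uses hypothetical parameters $A$, $B_q$, $D$ (chosen for the test, not taken from the paper) and is in no way the authors' Matlab implementation; for the skew-symmetric $A$ and identity weights chosen here the flow stays proportional to $I$, namely $P(t) = I/(2 - t)$.

```python
import numpy as np

def riccati_rhs(P, A, BBq):
    # dP/dt expressed from (9): P' = -(A'P + PA) + P Bq Bq' P
    return -(A.T @ P + P @ A) + P @ BBq @ P

def integrate_backward(A, Bq, D, t0=0.0, theta=1.0, steps=200):
    """RK4 march from t = theta down to t = t0 with terminal value P(theta) = D."""
    P = D.copy()
    h = (theta - t0) / steps
    BBq = Bq @ Bq.T
    for _ in range(steps):
        k1 = riccati_rhs(P, A, BBq)
        k2 = riccati_rhs(P - 0.5 * h * k1, A, BBq)
        k3 = riccati_rhs(P - 0.5 * h * k2, A, BBq)
        k4 = riccati_rhs(P - h * k3, A, BBq)
        P = P - (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return P

A = np.array([[0.0, 1.0], [-1.0, 0.0]])   # hypothetical skew-symmetric drift
Bq = np.eye(2)
D = np.eye(2)
P0 = integrate_backward(A, Bq, D)
assert np.allclose(P0, P0.T, atol=1e-8)   # the Riccati flow preserves symmetry
```

With these data the skew part cancels and the equation reduces to the scalar ODE $\dot p = p^2$, $p(1) = 1$, so `P0` is close to $0.5\,I$.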
4. RETURNING TO MATRIX NOTATION

Equations (14) have been written out by rearranging matrices row by row into vectors containing all of their entries. Next, let us find out whether it is possible to obtain matrices $P(t) \in \mathbb{R}^{n \times n}$ and $K(t) \in \mathbb{R}^{n \times n}$ from the parameters $\mathcal{P}(t) \in \mathbb{R}^{n^2 \times n^2}$ and $\varkappa(t) \in \mathbb{R}^{n^2}$ so as to satisfy the relations
$$\langle \mathbf{Q}, \mathcal{P}(t)\mathbf{Q} \rangle = \langle Q, P(t)Q \rangle, \qquad \langle \mathbf{Q}, \varkappa(t) \rangle = \langle Q, K(t) \rangle \quad \text{for all } Q, t, \tag{15}$$
i.e., whether it is possible to find matrix solutions without first rearranging the solution into a column vector. If so, how are the parameters $P(t)$ and $K(t)$ related to Eqs. (14)? This would permit one not only to remain in the class of matrices when solving the problem but also to reduce the dimension of the equations in system (14) for the parameters of the quadratic form from $n^4$ to $n^2$. By using the relation $\langle \mathbf{A}, \mathbf{B} \rangle = \langle A, B \rangle$, we conclude that the first relation in (15) is equivalent to the representation
$$\mathcal{P}(t) = P(t) \otimes I \quad \text{for all } t. \tag{16}$$
If this relation holds, then for the parameter $K(t)$ defined by $\mathbf{K}(t) = \varkappa(t)$ it follows from (14) that
$$\dot K + T'K + KT - PBB'KBB' = 0, \quad K(\theta) = -2\mathbf{D}M. \tag{17}$$
Then the equation for $\gamma(t)$ acquires the form
$$\dot\gamma - \frac{1}{4}\big( \langle k, B_q B_q' k \rangle + \langle K, BB'KBB' \rangle \big) = 0, \quad \gamma(\theta) = \langle m, Dm \rangle + \langle M, \mathbf{D}M \rangle. \tag{18}$$
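The passage from the vectorized parameters to the matrix equation for $K$ rests on two Kronecker identities, which can be checked numerically. This is a sketch with random matrices (not from the paper), assuming the row-wise vectorization of Section 3:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
I = np.eye(n)
T, B, P, K = (rng.standard_normal((n, n)) for _ in range(4))
P = P + P.T                      # symmetric, as a Riccati solution would be
BB = B @ B.T

# A' vec(K) = vec(T'K + KT), where A = T x I + I x T (row-wise vec)
Acal = np.kron(T, I) + np.kron(I, T)
assert np.allclose(Acal.T @ K.flatten(), (T.T @ K + K @ T).flatten())

# (P x I)(BB' x BB') vec(K) = vec(P BB' K BB')
lhs = np.kron(P, I) @ np.kron(BB, BB) @ K.flatten()
assert np.allclose(lhs, (P @ BB @ K @ BB).flatten())
```

Together the two identities convert the $n^2$-dimensional equation for $\varkappa$ into the $n \times n$ matrix equation for $K$, whatever sign convention is used for the coupling term.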
Therefore, we need to verify the possibility of the representation (16). It turns out that this representation does not hold in general. Indeed, if it were true for $\mathcal{P}$, we would have $\mathcal{P}^{-1} = P^{-1}(t) \otimes I$. Then it follows from (14) that
$$\frac{d(\mathcal{P}^{-1})}{dt} - \mathcal{P}^{-1}(T' \otimes I + I \otimes T') - (T \otimes I + I \otimes T)\mathcal{P}^{-1} + BB' \otimes BB' = 0.$$
If we take
$$T = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},$$
then, for arbitrary terminal data $\mathbf{D}$, we obtain $(\mathcal{P}^{-1}(t))_{12} \neq 0$ for $t \neq \theta$. Therefore, the representation (16) does not hold.

Let us present sufficient conditions for the representation (16) to be true. It is equivalent to the two conditions $\mathcal{P}(\theta) = P(\theta) \otimes I$ and $\dot{\mathcal{P}}(t) = \dot P(t) \otimes I$. Note that, by (14), the first condition is satisfied, with $P(\theta) = \mathbf{D}$. Let the representation (16) hold up to time $t$. Consider the derivative at time $t$:
$$\dot P \otimes I + (T'P + PT) \otimes I + P \otimes (T + T') - (PBB'P) \otimes (BB') = 0.$$

Lemma 1. The relation
$$P \otimes (T + T') - (PBB'P) \otimes (BB') = f(t, \mathbf{Q}) \otimes I$$
can hold for every $P$ if and only if
$$T(t) + T'(t) = \mu(t)I, \qquad B(t)B'(t) = \nu(t)I, \qquad t \in [t_0, \theta], \tag{19}$$
where $\mu(t)$ and $\nu(t)$ are scalar functions.

Proof. The sufficiency is obvious. The necessity follows from the fact that the matrix $f(t, \mathbf{Q}) \otimes I$ consists of diagonal blocks $(f(t, \mathbf{Q}))_{ij} I$, which are diagonal matrices with all diagonal entries equal. Suppose that either $T(t) + T'(t)$ or $B(t)B'(t)$ has a nonzero off-diagonal entry at position $(i,j)$. Take $P = kI$, where $k$ is a number. Then the sum of Kronecker products has block form with blocks $k(T + T') - k^2 (BB')_{pq} BB'$, and there exists a $k$ for which some block contains a nonzero entry at position $(i,j)$. In a similar way, one can show that the diagonal entries must be equal. The proof of the lemma is complete.

We have thereby shown that relation (16) holds for every possible $\mathcal{P}$ if and only if the matrices $T(t)$ and $B(t)$ satisfy relation (19). In this case, Eqs. (14) acquire the form
$$\dot P + PT + T'P + \mu P - \nu^2 P^2 = 0, \quad P(\theta) = \mathbf{D},$$
$$\dot K + T'K + KT - \nu^2 PK = 0, \quad K(\theta) = -2\mathbf{D}M, \tag{20}$$
$$\dot\gamma - \frac{1}{4}\big( \langle k, B_q B_q' k \rangle + \nu^2 \langle K, K \rangle \big) = 0, \quad \gamma(\theta) = \langle m, Dm \rangle + \langle M, \mathbf{D}M \rangle.$$
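Lemma 1 can be probed numerically: under condition (19) every $n \times n$ block of the Kronecker combination is a multiple of $I$, while a generic $T$ breaks this structure. The sketch below uses hypothetical matrices (not from the paper) and tests the $F \otimes I$ pattern blockwise:

```python
import numpy as np

def kron_expr(P, T, B):
    # The combination (T'P + PT) x I + P x (T + T') - (P BB' P) x BB'
    # arising when the flow of (14) is restricted to P(t) x I.
    n = P.shape[0]
    I = np.eye(n)
    BB = B @ B.T
    return (np.kron(T.T @ P + P @ T, I) + np.kron(P, T + T.T)
            - np.kron(P @ BB @ P, BB))

def is_kron_with_identity(M, n, tol=1e-10):
    # True iff M = F x I for some n x n matrix F, i.e. every block is F_ij * I
    R = M.reshape(n, n, n, n)          # R[i, k, j, l] = M[i*n + k, j*n + l]
    for i in range(n):
        for j in range(n):
            blk = R[i, :, j, :]
            if not np.allclose(blk, blk[0, 0] * np.eye(n), atol=tol):
                return False
    return True

rng = np.random.default_rng(2)
P = rng.standard_normal((2, 2)); P = P + P.T
T_good = np.array([[-1.0, -1.0], [1.0, -1.0]])   # T + T' = -2I, satisfies (19)
B_good = np.eye(2)                               # B B' = I, satisfies (19)
T_bad = np.array([[1.0, 1.0], [0.0, 1.0]])       # T + T' is not a multiple of I

assert is_kron_with_identity(kron_expr(P, T_good, B_good), 2)
assert not is_kron_with_identity(kron_expr(P, T_bad, B_good), 2)
```

When the check passes, the $n^2 \times n^2$ flow stays in the class $P(t) \otimes I$ and system (20) applies.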
It follows from condition (19) that, at each time $t$, the control in the system is either absent ($B = 0$) or affects each component of the system. (The dimension of the control should coincide with the dimension of the state variable, $m_1 = n$.)

5. SOLUTION IN THE CASE OF STATE CONSTRAINTS

Let us describe a solution taking into account the constraints (3) and (4). For definiteness, we consider only the second equation in (1). To this end, we construct a functional representing the violation of the state constraints by a trajectory in such a way that the corresponding cost function is still a quadratic form. Consider the expression
$$a\langle Q - \alpha I, Q - \alpha I \rangle - b\langle Q - \beta I, Q - \beta I \rangle,$$
where $a$, $b$, $\alpha$, and $\beta$ are positive constant parameters. Next, we have
$$a\langle Q - \alpha I, Q - \alpha I \rangle - b\langle Q - \beta I, Q - \beta I \rangle = (a-b)\langle Q, Q \rangle - 2(a\alpha - b\beta)\langle Q, I \rangle + (a\alpha^2 - b\beta^2)n$$
$$= (a-b)\sum_{k=1}^{n} \lambda_k^2(Q) - 2(a\alpha - b\beta)\sum_{k=1}^{n} \lambda_k(Q) + (a\alpha^2 - b\beta^2)n = \sum_{k=1}^{n} \big( (a-b)\lambda_k^2(Q) - 2(a\alpha - b\beta)\lambda_k(Q) + (a\alpha^2 - b\beta^2) \big),$$
where the $\lambda_k(Q)$ are the eigenvalues of $Q$.
Now we choose the parameters so as to ensure that the minimum of each term is attained in the range admitted by the state constraints. To this end, we require that
$$a - b > 0, \qquad \lambda_- \le \frac{a\alpha - b\beta}{a - b} \le \lambda_+. \tag{21}$$
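The choice of penalty weights thus amounts to placing the minimizer $(a\alpha - b\beta)/(a - b)$ of each per-eigenvalue term inside $[\lambda_-, \lambda_+]$. A quick check (a sketch, using the parameter values quoted for the second example of Section 6):

```python
import math

def penalty_minimizer(a, b, alpha, beta):
    # Vertex of the parabola (a-b)*l^2 - 2*(a*alpha - b*beta)*l + const in l
    assert a - b > 0, "condition (21) requires a > b"
    return (a * alpha - b * beta) / (a - b)

a, b, alpha, beta = 3.1, 1.0, 2.69, 2.69           # Example 2 parameters
lam_minus, lam_plus = 1.5 * math.sqrt(2), 9 * math.sqrt(2)
l_star = penalty_minimizer(a, b, alpha, beta)
assert lam_minus <= l_star <= lam_plus             # condition (21) holds
```

With $\alpha = \beta$ the minimizer reduces to $\alpha$ itself, which is how the example parameters were evidently chosen.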
Such parameters can always be chosen. We make the corresponding addition to the functional:
$$\Psi_{a,b,\alpha,\beta}(U(\cdot)) = \int_{t_0}^{\theta} \big( \langle U(t,Q(t)), U(t,Q(t)) \rangle + a\langle Q - \alpha I, Q - \alpha I \rangle - b\langle Q - \beta I, Q - \beta I \rangle \big)\,dt + \langle Q(\theta) - M, \mathbf{D}(Q(\theta) - M) \rangle. \tag{22}$$
By repeating the steps presented above, we find that the cost function can again be sought in the form (8), where the parameters $P(t)$, $K(t)$, and $\gamma(t)$ satisfy the equations
$$\dot P + PT + T'P + \mu P - \nu^2 P^2 + (a-b)I = 0, \quad P(\theta) = \mathbf{D},$$
$$\dot K + T'K + KT - \nu^2 PK - 2(a\alpha - b\beta)I = 0, \quad K(\theta) = -2\mathbf{D}M,$$
$$\dot\gamma - \frac{1}{4}\nu^2 \langle K, K \rangle + (a\alpha^2 - b\beta^2)n = 0, \quad \gamma(\theta) = \langle M, \mathbf{D}M \rangle.$$
Now let us find a solution under constraints of the form (3). To satisfy the constraints, we take them into account in integral form in the matrix part of the functional:
$$\min_{U(\cdot)} \max_{\alpha(\cdot)} \left\{ \int_{t_0}^{\theta} \big( \alpha(t)(\langle Q(t), Q(t) \rangle - \lambda_+^2)_+ + \langle U, U \rangle \big)\,dt + \langle Q(\theta) - M, \mathbf{D}(Q(\theta) - M) \rangle \right\},$$
where $\alpha(t) \ge 0$, $t \in [t_0, \theta]$. Here $(a)_+$ for $a \in \mathbb{R}$ denotes the truncated difference
$$(a)_+ = \begin{cases} a & \text{for } a \ge 0, \\ 0 & \text{for } a < 0. \end{cases}$$
Following [6], we can interchange the maximum and the minimum and obtain a Hamilton–Jacobi–Bellman equation similar to (5) with the same terminal condition (6). To solve the problem, one should then carry out the optimization with respect to $\alpha$,
$$V(t, q, Q) = \max_{\alpha(\cdot)} V_\alpha(t, q, Q).$$
Note that the maximizer $\alpha(\cdot)$ in general depends on the position $(t, Q)$. The cost function can be sought in a form similar to (8), where
$$V_\alpha(t, q, Q) = \begin{cases} \langle q, P(t)q \rangle + \langle Q, \mathcal{P}(t)Q \rangle + \langle q, k(t) \rangle + \langle Q, K(t) \rangle + \gamma(t) & \text{for } \langle Q(t), Q(t) \rangle - \lambda_+^2 \le 0, \\ \langle q, P(t)q \rangle + \langle Q, \mathcal{P}_\alpha(t)Q \rangle + \langle q, k(t) \rangle + \langle Q, K(t) \rangle + \gamma_\alpha(t) & \text{for } \langle Q(t), Q(t) \rangle - \lambda_+^2 > 0. \end{cases} \tag{23}$$
The new parameters $\mathcal{P}_\alpha(t)$ and $\gamma_\alpha(t)$ satisfy the equations
$$\dot{\mathcal{P}}_\alpha + \mathcal{P}_\alpha T + T'\mathcal{P}_\alpha + \mu \mathcal{P}_\alpha - \nu^2 \mathcal{P}_\alpha^2 + \alpha I = 0, \quad \mathcal{P}_\alpha(\theta) = \mathbf{D}, \tag{24}$$
$$\dot\gamma_\alpha - \frac{1}{4}\big( \langle k, B_q B_q' k \rangle + \nu^2 \langle K, K \rangle \big) + \alpha \lambda_+^2 = 0, \quad \gamma_\alpha(\theta) = \langle m, Dm \rangle + \langle M, \mathbf{D}M \rangle.$$
Note that in general the resulting solution is only a generalized solution of the Hamilton–Jacobi–Bellman equation [7, pp. 55–95; 8, Chaps. 1–2]. If it is unique, then, by virtue of the linearity of the system and the convexity of the functional [9, p. 52], one can understand the solution in the classical sense by replacing the inner product of the gradient of the cost function with the right-hand side by the directional derivative, for which the theorem on the differentiation of the maximum is valid.

In the previous solution, the parameter $\alpha$ was an analog of a Lagrange multiplier. Now let us describe a solution in which this parameter plays the role of a control. We introduce the cost function
$$V(t, Q) = \min_{U(\cdot),\,\alpha(\cdot),\,\beta(\cdot)} \left\{ \int_{t_0}^{\theta} \big( \alpha(\lambda_+^2 - \langle Q, Q \rangle) + \beta(\langle Q, Q \rangle - \lambda_-^2) + \langle U, U \rangle \big)\,dt + \langle Q(\theta) - M, \mathbf{D}(Q(\theta) - M) \rangle \right\},$$
where $\alpha(\cdot)$ and $\beta(\cdot)$ are positional controls satisfying the geometric constraints $\alpha(t,Q) \in [0, \alpha_{\max}]$, $\beta(t,Q) \in [0, \beta_{\max}]$. We obtain the Hamilton–Jacobi–Bellman equation
$$\frac{\partial V}{\partial t} + \min_{U,\alpha,\beta} \left\{ \left\langle \frac{\partial V}{\partial \mathbf{Q}}, \mathcal{A}\mathbf{Q} + \mathcal{B}\mathbf{U} \right\rangle + \alpha(\lambda_+^2 - \langle Q, Q \rangle) + \beta(\langle Q, Q \rangle - \lambda_-^2) + \langle \mathbf{U}, \mathbf{U} \rangle \right\} = 0 \tag{25}$$
with the terminal condition (6). Set $\Phi_+(Q) = \lambda_+^2 - \langle Q, Q \rangle$ and $\Phi_-(Q) = \langle Q, Q \rangle - \lambda_-^2$. The minimum in (25) is attained at the optimal controls
$$\mathbf{U}(t,\mathbf{Q}) = -\frac{1}{2}\mathcal{B}'\frac{\partial V}{\partial \mathbf{Q}}, \qquad \alpha(t,Q) = \begin{cases} 0 & \text{for } \Phi_+(Q) \ge 0, \\ \alpha_{\max} & \text{for } \Phi_+(Q) < 0, \end{cases} \qquad \beta(t,Q) = \begin{cases} 0 & \text{for } \Phi_-(Q) \le 0, \\ \beta_{\max} & \text{for } \Phi_-(Q) > 0. \end{cases} \tag{26}$$
The solution can again be sought in a form similar to (8) and (23); in this case, there are four (rather than two) sets of equations of the form (24) for the parameters, corresponding to the possible sign combinations of $\Phi_+(Q)$ and $\Phi_-(Q)$.

6. EXAMPLES

1. Consider the following parameters of the dynamics (1) of the configuration matrix for a solution using the functional (22):
$$T = \begin{pmatrix} -4t & 0 \\ 0 & -4t \end{pmatrix}, \qquad B = \begin{pmatrix} \cos(\pi t) & -\sin(\pi t) \\ \sin(\pi t) & \cos(\pi t) \end{pmatrix}, \qquad t_0 = 0, \quad \theta = 1.$$
These matrices satisfy condition (19). The matrices in the terminal part of the functional (22) and the initial condition have the form
$$\mathbf{D} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad M = \begin{pmatrix} 1 & -0.1 \\ -0.1 & 1 \end{pmatrix}, \qquad Q_0 = \sqrt{2}\begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}.$$
The state constraints are given by the constants $\lambda_- = \sqrt{2}$ and $\lambda_+ = 2\sqrt{2}$. The problem was solved for the functional (22) with parameters $a = 9$, $b = 5$, and $\alpha = \beta = 8.4853$. These values satisfy condition (21). The ellipsoidal tubes of trajectories with and without regard for the state constraints are shown in Fig. 1.
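As a sanity check on the dynamics of the first example (the uncontrolled case $U \equiv 0$ only, not the synthesized solution): with $T(t) = -4tI$ the second equation in (1) reduces to $\dot Q = -8tQ$, which integrates in closed form to $Q(t) = e^{-4t^2} Q_0$, and a forward Euler scheme reproduces this. The $Q_0$ below has the shape used in the example; its exact scale is immaterial for the check.

```python
import numpy as np

Q0 = np.sqrt(2) * np.array([[1.0, -1.0], [-1.0, 2.0]])   # Q0 as in the example
steps, theta = 10000, 1.0
h = theta / steps

Q = Q0.copy()
for k in range(steps):
    t = k * h
    T = -4.0 * t * np.eye(2)
    Q = Q + h * (T @ Q + Q @ T.T)           # Euler step for dQ/dt = TQ + QT'
Q_exact = np.exp(-4.0 * theta**2) * Q0
assert np.allclose(Q, Q_exact, rtol=1e-2)
```

The rotation matrix $B(t)$ does not enter this check, since it only multiplies the control term, which is set to zero here.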
Fig. 1. Ellipsoidal tubes for the ellipsoid E(0, Q(t)) without regard for the state constraints (a) and with regard for them (b). The separate sets of black disks represent the interior state constraint.

Fig. 2. Ellipsoidal tubes for the ellipsoid E(0, Q(t)) without regard for the state constraints (a) and with regard for them (b). The separate sets of black disks show the exterior state constraint.
2. Consider the parameters
$$T = \begin{pmatrix} -1 & -1 \\ 1 & -1 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad t_0 = 0, \quad \theta = 1,$$
with the matrix in the terminal part of the functional (22) and the initial condition
$$\mathbf{D} = \begin{pmatrix} 1 & 0.4 \\ 0.4 & 1 \end{pmatrix}, \qquad M = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad Q_0 = \sqrt{2}\begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}.$$
The state constraints are given by the constants $\lambda_- = 1.5\sqrt{2}$ and $\lambda_+ = 9\sqrt{2}$. The problem was solved for the functional (22) with parameters $a = 3.1$, $b = 1$, and $\alpha = \beta = 2.69$. The corresponding illustrations are shown in Fig. 2. All computations were carried out with the use of the Matlab software package.

7. CONCLUSION

In the present paper, we have presented the solution of the optimization problem (1), (2) in the class of synthesizing positional controls. We have described the class of systems for which the solution can be represented in matrix form. We have suggested schemes for taking into account the constraints (3) and (4), which are state constraints for an ellipsoid with the corresponding configuration matrix. Examples of the numerical computation of solutions by our algorithms have been presented.
ACKNOWLEDGMENTS

This work is supported by the Russian Federal Program "Scientific and Pedagogical Staff of Innovative Russia in 2009–2013" (contract no. 16.740.11.0426 of 11/26/2010) and the program "State Support of the Leading Scientific Schools" (no. NS-2239.2012.1).

REFERENCES

1. Chebunin, I.V., Controllability Conditions for the Riccati Equation, Differ. Uravn., 2003, vol. 39, no. 12, pp. 1654–1661.
2. Kurzhanski, A.B. and Varaiya, P., Optimization of Output Feedback Control under Set-Membership Uncertainty, J. Optim. Theory Appl., 2011, vol. 151, pp. 11–32.
3. Kurzhanski, A.B. and Varaiya, P., On Synthesizing Target Controls under Obstacle and Collision Avoidance, J. Franklin Inst., 2010, vol. 347, no. 1, pp. 130–145.
4. Tyrtyshnikov, E.E., Matrichnyi analiz i lineinaya algebra (Matrix Analysis and Linear Algebra), Moscow, 2007.
5. Bellman, R., Introduction to Matrix Analysis, New York: McGraw-Hill, 1960. Translated under the title Vvedenie v teoriyu matrits, Moscow: Nauka, 1969.
6. Fan, Ky, Minimax Theorems, Proc. Nat. Acad. Sci., 1953, vol. 39, no. 1, pp. 42–47.
7. Subbotin, A.I., Generalized Solutions of First-Order PDEs. The Dynamic Optimization Perspective, Boston: SCFA, 1995.
8. Fleming, W.H. and Soner, H.M., Controlled Markov Processes and Viscosity Solutions, New York, 1993.
9. Polovinkin, E.S. and Balashov, M.V., Elementy vypuklogo i sil'no vypuklogo analiza (Elements of Convex and Strongly Convex Analysis), Moscow, 2007.
10. Kurzhanski, A.B., Upravlenie i nablyudenie v usloviyakh neopredelennosti (Control and Observation under Conditions of Uncertainty), Moscow: Nauka, 1977.