Adaptive control of average Markov decision chains under the Lyapunov stability condition

This note concerns discrete-time Markov decision processes with denumerable state space. A control policy is graded by the long-run expected average ...