ISSN 0032-9460, Problems of Information Transmission, 2015, Vol. 51, No. 2, pp. 200–204. © Pleiades Publishing, Inc., 2015.
Original Russian Text © A. Aghajan, S.J. Zahabi, M. Khosravifard, T.A. Gulliver, 2015, published in Problemy Peredachi Informatsii, 2015, Vol. 51, No. 2, pp. 122–127.
SOURCE CODING
On the Source-Channel Separation Theorem for Infinite Source Alphabets

A. Aghajan, S. J. Zahabi, M. Khosravifard, and T. A. Gulliver

Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran
e-mail: [email protected], [email protected], [email protected]
Department of Electrical and Computer Engineering, University of Victoria, Victoria, BC, Canada
e-mail: [email protected]

Received February 8, 2014; in final form, December 27, 2014

Abstract—The single-user source-channel separation theorem has been proved for many classes of sources and channels, including sources with finite or countably infinite alphabets. Typically, the theorem is first proved for sources with a finite alphabet, and the results are then extended to sources with a countably infinite alphabet. This paper considers the direct extension of the source-channel separation theorem from some classes of sources with a finite alphabet to a countably infinite alphabet. Specifically, we provide a solution for memoryless sources and arbitrary channels. We then discuss how this approach may be extended to the case of general sources and arbitrary channels.

DOI: 10.1134/S0032946015020106
1. INTRODUCTION

The two main branches of information theory are source coding and channel coding. When source and channel coding are separated, it is known that a given source can be sent reliably over a channel if the infimum achievable source coding rate of the source is strictly less than the capacity of the channel [1]. Hence, a fundamental question is whether joint source-channel coding can provide additional benefits. In this regard, Shannon proved that for the class of stationary memoryless sources and channels, nothing extra can be achieved by combining source and channel coding, which led to the concept of separation [2]. Since then, a theorem stating that joint source-channel coding provides no extra benefit, in terms of asymptotically vanishing error probabilities, has been referred to as a source-channel separation theorem. The source-channel separation theorem has been extended to other classes of sources and channels [3–5]. However, in [6] a source-channel pair is provided for which the separation theorem does not hold. Thus, examining the validity of the source-channel separation theorem for different classes of sources and channels is of theoretical interest. With this preamble, the motivation for this paper is given below.

1.1. Motivation

In the literature, for a specific class of sources, the source-channel separation theorem has been examined considering finite or countably infinite alphabets. However, the proofs for countably infinite alphabets seem to require significant effort and cannot be readily obtained from previous results based on finite alphabets. In other words, the source-channel separation theorem is usually
proved first for sources with a finite alphabet, and then extended with some effort to sources with a countably infinite alphabet. This is evident in the following three examples.

• In 1948, Shannon proved the source-channel separation theorem for the class of stationary memoryless sources with a finite alphabet [2]. Then, Dobrushin in 1963 [3] and Pinsker in 1964 [4] independently proved the source-channel separation theorem for information stable¹ sources with countably infinite alphabets.

• The source-channel separation theorem is commonly proved using Fano's inequality for stationary ergodic sources with a finite alphabet and memoryless channels [8]. Note that Fano's inequality in its traditional form enables such a proof only for sources with a finite alphabet. However, a generalized form of Fano's inequality was recently introduced in [9], with which the separation theorem can be extended to countably infinite alphabets.

• In 1995, the separation theorem was examined for a class of general sources and channels, considering a finite alphabet for the sources [6]. This was subsequently generalized by Han [7] to countably infinite alphabets.
These results motivate the search for a means of avoiding the extra effort required to prove the separation theorem for sources with a countably infinite alphabet. In other words, we wish to answer the following question: Is it possible to directly prove the separation theorem for some classes of sources and channels with countably infinite source alphabets, knowing that the theorem holds for this class of sources with finite alphabets? In this paper, we provide a positive answer to this question for memoryless sources. We then discuss the possibility of extending this idea to general sources. The problem considered here is now stated formally.

1.2. Problem Statement

Let V = (V_1, V_2, ...) denote a general source, where (V_1, ..., V_n) is a random variable² over the nth Cartesian product V^n of an arbitrary source alphabet V (which can be countably infinite), with probability distribution given by P_{V_1,...,V_n}(·). Let R(V) denote the infimum achievable fixed-length coding rate for the source V.

Definition. For any set of sources V (with possibly a countably infinite alphabet) and any channel W with capacity C(W), we say the source-channel separation theorem holds for the pair (V, W) if the following statement is true: for any V ∈ V, it is not possible to transmit V over W with an arbitrarily low probability of error if R(V) > C(W).

For a set of sources V, let V_f ⊂ V denote the sources that have a finite alphabet. Further, let V_∞ = V \ V_f. The problem considered in this paper can now be stated as follows: given a set of sources V and an arbitrary channel W, prove that the source-channel separation theorem holds for (V_∞, W), assuming that it holds for (V_f, W).

As we will see in the following sections, the results obtained in this paper are independent of the channel considered. Therefore, throughout this paper the expression "arbitrary channel W" means an arbitrary general channel as described in [7], unless stated otherwise.
The remainder of this paper is organized as follows. In Section 2, the problem is considered for memoryless sources. Then in Section 3, the problem is discussed for general sources. Finally, Section 4 concludes the paper.

¹ For the definition and details on information stability, we refer the reader to [7].
² The vector (V_1, ..., V_n) is usually abbreviated by V^n; however, we use the superscript (·)^m to represent something else, so we avoid using V^n to represent (V_1, ..., V_n).
2. MEMORYLESS SOURCES

Consider the set of all memoryless sources, denoted by M = M_f ∪ M_∞. Let P = (p_1, p_2, ..., p_n, ...) be the probability mass function (PMF) of a memoryless source V ∈ M_∞, where p_i ≥ p_{i+1} for all i. Since V is memoryless, we have R(V) = H(V), where

    H(V) = -\sum_i p_i \log p_i.

Further, let V^m ∈ M_f be a memoryless source with PMF given by P_m = (p_1, p_2, ..., p_{m-1}, \bar{p}_m), where \bar{p}_m = \sum_{i \ge m} p_i. Then it is not hard to show that

    \lim_{m \to \infty} H(V^m) = H(V).    (1)
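The limit in (1) is easy to observe numerically. The following sketch (ours, not part of the paper's analysis) uses a geometric PMF p_i = 2^{-i}, chosen purely as a hypothetical example for which H(V) = 2 bits, and computes H(V^m) for the truncated PMF P_m:

```python
import math

def entropy(pmf):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

def truncate(pmf, m):
    """P_m = (p_1, ..., p_{m-1}, bar{p}_m), where bar{p}_m = sum_{i >= m} p_i."""
    return pmf[:m - 1] + [sum(pmf[m - 1:])]

# Hypothetical example source: geometric PMF p_i = 2^{-i}, so H(V) = 2 bits.
N = 200  # numerical horizon standing in for the countably infinite alphabet
P = [2.0 ** -i for i in range(1, N + 1)]

for m in (2, 5, 10, 20):
    print(m, entropy(truncate(P, m)))  # H(V^m) increases toward H(V) = 2
```

For this example H(V^m) = 2 − 2^{−(m−2)}, so the convergence in (1), and the monotonicity of H(V^m) in m, are both visible directly.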
Theorem. Given an arbitrary channel W, the source-channel separation theorem holds for (M_∞, W) if it holds for (M_f, W).

Proof. Without loss of generality, we suppose that the alphabet of the source V is N = {1, 2, 3, ...} and the probability of symbol j is p_j, i.e., P = (p_1, p_2, ..., p_n, ...). The proof is by contradiction. Assume that the theorem is false, i.e., the separation theorem holds for (M_f, W), but for a source V ∈ M_∞ with H(V) > C(W) there exists a joint source-channel code with probability of error

    \Pr\{(\hat{V}_1, ..., \hat{V}_n) \ne (V_1, ..., V_n)\} \to 0    (2)

as n → ∞, where \hat{V}_i is the estimate of V_i at the receiver and \Pr\{·\} denotes probability. We refer to this code as C_2. From (1), there exists an M such that H(V^M) > C(W). As stated before, the PMF of V^M is P_M. If we can show, by employing C_2, that there exists a joint source-channel code with probability of error

    \Pr\{(\hat{V}_1^M, ..., \hat{V}_n^M) \ne (V_1^M, ..., V_n^M)\} \to 0    (3)

as n → ∞, then the source V^M can be sent over the channel with an arbitrarily low probability of error, even though it has entropy greater than C(W). Since V^M ∈ M_f, this contradicts the assumption that the source-channel separation theorem holds for (M_f, W). Thus the initial assumption on the existence of C_2 is false, which completes the proof.

Following the above reasoning, we claim that V^M can be sent through the channel by means of C_2. To prove this, consider the random pre-code C_1 on V^M defined as follows:
    C_1: \{1, 2, ..., M\} \to N,

    U_i = C_1(V_i^M) =
    \begin{cases}
      V_i^M, & V_i^M < M, \\
      Y_i,   & V_i^M = M,
    \end{cases}    (4)

where the Y_i are discrete independent and identically distributed (i.i.d.) random variables with PMF given by

    \Pr\{Y_i = y_i\} = \frac{p_{y_i}}{\sum_{k \ge M} p_k}, \quad y_i = M, M+1, ....    (5)
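As a sanity check on (4) and (5), the following simulation (our own sketch, not from the paper) instantiates C_1 for the hypothetical geometric PMF p_i = 2^{-i} with M = 4; for this choice the renormalized tail in (5) is again geometric with parameter 1/2, which keeps the sampler trivial:

```python
import random

random.seed(1)
M = 4
# Hypothetical example source: p_i = 2^{-i}. Then P_M = (1/2, 1/4, 1/8, 1/8),
# and the renormalized tail p_y / sum_{k>=M} p_k of (5) is geometric(1/2) on {M, M+1, ...}.

def sample_Y():
    """Draw Y_i as in (5): Pr{Y_i = y} = p_y / sum_{k>=M} p_k, y = M, M+1, ..."""
    y = M
    while random.random() < 0.5:  # climb the tail, each step with probability 1/2
        y += 1
    return y

def C1(v):
    """The pre-code (4): symbols below M pass through; the merged symbol M is randomized."""
    return v if v < M else sample_Y()

def C1_inverse(u):
    """The decoder for C1: collapse {M, M+1, ...} back to the merged symbol M."""
    return min(u, M)

n = 200_000
VM = random.choices(range(1, M + 1), weights=[0.5, 0.25, 0.125, 0.125], k=n)
U = [C1(v) for v in VM]

print(sum(u == 1 for u in U) / n)  # empirical Pr{U = 1}, close to p_1 = 0.5
print(sum(u == 5 for u in U) / n)  # empirical Pr{U = 5}, close to p_5 = 2^{-5}
print(all(C1_inverse(u) == v for u, v in zip(U, VM)))  # True: the mapping is reversible
```

The empirical distribution of U_i matches P, and collapsing the tail recovers V_i^M exactly, which is precisely the property the proof relies on.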
Thus, the random pre-code C_1 maps the Mth symbol of V_i^M to an index y ∈ {M, M+1, ...} with the appropriate probability and leaves the other M − 1 symbols of V_i^M unchanged, as shown in the figure. The decoder then simply reverses the mapping in (4). Obviously, the PMF of U_i is P. Considering U = (U_1, U_2, ...) as a source, it can be sent using C_2, yielding

    \Pr\{(\hat{U}_1, ..., \hat{U}_n) \ne (U_1, ..., U_n)\} = \Pr\{(\hat{V}_1, ..., \hat{V}_n) \ne (V_1, ..., V_n)\}.    (6)

Because of the structure of C_1, if (\hat{U}_1, ..., \hat{U}_n) = (U_1, ..., U_n), then (\hat{V}_1^M, ..., \hat{V}_n^M) = (V_1^M, ..., V_n^M), and thus

    \Pr\{(\hat{V}_1^M, ..., \hat{V}_n^M) \ne (V_1^M, ..., V_n^M)\} \le \Pr\{(\hat{U}_1, ..., \hat{U}_n) \ne (U_1, ..., U_n)\}.    (7)

Hence, combining (2), (6), and (7) completes the proof.
[Figure] Sending V^M over the channel W by means of C_1 and C_2.
3. DISCUSSION: EXTENSION TO GENERAL SOURCES

Consider the set of all general sources, denoted by G = G_f ∪ G_∞. In order to extend the idea of the proof in the previous section to the case of V ∈ G_∞, one requires a sequence of general finite-alphabet sources V^1, V^2, ... ∈ G_f with the following two properties:

(a) for each V^m in the sequence, there exists a random pre-code that maps V^m to V;
(b) the sequence R(V^1), R(V^2), ... converges to R(V).³

In the following, we first propose a sequence of sources V^1, V^2, ..., and then provide a random pre-coding by which property (a) is satisfied for the proposed sequence. Property (b), however, remains an open problem.

Hereafter, for notational convenience, we denote (x_1, ..., x_n) by x and (y_1, ..., y_n) by y. Consider a general source V = (V_1, V_2, ...) ∈ G_∞ with countably infinite alphabet N = {1, 2, ...} and arbitrary probability distribution P_{V_1,...,V_n}(·). Consider the vector function

    F^m: N^n \to \{1, ..., m\}^n, \quad F^m(y_1, ..., y_n) = (f(y_1), ..., f(y_n)),    (8)

where f(y) = y for y < m and f(y) = m for y ≥ m. Now, consider the sequence of sources \{V^m = (V_1^m, V_2^m, V_3^m, ...)\}_{m=1}^{\infty}, where V^m is a general source with finite alphabet {1, 2, ..., m} and probability distribution given by

    P_{V_1^m,...,V_n^m}(x) = \sum_{y \in N^n:\, F^m(y) = x} P_{V_1,...,V_n}(y).    (9)
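Equations (8) and (9) amount to quantizing each coordinate at level m and pushing the joint PMF forward. A small sketch (ours; the toy joint distribution on {1,...,5}^2 is a hypothetical stand-in for P_{V_1,...,V_n}, chosen only so the sums are finite):

```python
from itertools import product

m, n = 3, 2

def f(y):
    """f(y) = y for y < m, and m for y >= m."""
    return y if y < m else m

def Fm(y):
    """Eq. (8): apply f to each coordinate of y in N^n."""
    return tuple(f(yi) for yi in y)

# Toy joint PMF on {1,...,5}^2 standing in for P_{V_1,...,V_n} (any PMF would do).
support = list(product(range(1, 6), repeat=n))
weights = [1.0 / (a + b) for a, b in support]
Z = sum(weights)
PV = {y: w / Z for y, w in zip(support, weights)}

# Eq. (9): PMF of (V_1^m, ..., V_n^m) as the pushforward of PV under F^m.
PVm = {}
for y, p in PV.items():
    x = Fm(y)
    PVm[x] = PVm.get(x, 0.0) + p

print(len(PVm))                      # 9: the alphabet shrinks to {1, 2, 3}^2
print(round(sum(PVm.values()), 9))   # 1.0: F^m only regroups probability mass
```

Since F^m merges symbols without discarding any, the pushforward is automatically a valid PMF on the finite alphabet.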
Regarding the random pre-coding, the idea here is similar to that for memoryless sources, with some slight differences. Following the same approach of proof by contradiction, here we have the joint probability distribution of V_1, V_2, ..., V_n, i.e., P_{V_1,...,V_n}(y), at the channel input. Hence, for any sequence of length n, it suffices to provide a joint random pre-code for V_1^m, V_2^m, ..., V_n^m whose output is U_1, U_2, ..., U_n such that P_{U_1,...,U_n}(y) = P_{V_1,...,V_n}(y). Now, consider the random

³ For general sources we have R(V) = \overline{H}(V), where \overline{H}(V) is the spectral sup-entropy rate of V as defined in [10].
pre-code C_1^n given by

    C_1^n: \{1, 2, ..., m\}^n \to N^n,

    (U_1, ..., U_n) = C_1^n(V_1^m, ..., V_n^m) =
    \begin{cases}
      (V_1^m, ..., V_n^m), & \text{if } V_i^m < m \text{ for all } i, \\
      (Y_1, ..., Y_n),     & \text{otherwise},
    \end{cases}    (10)

where (Y_1, ..., Y_n) is a discrete random vector with joint PMF given by

    \Pr\{(Y_1, ..., Y_n) = (y_1, ..., y_n)\} = \frac{P_{V_1,...,V_n}(y)}{P_{V_1^m,...,V_n^m}(F^m(y))}.    (11)
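To see that the joint pre-code (10)–(11) indeed reproduces the target distribution P_{V_1,...,V_n}, one can simulate it on a toy example (our sketch; the joint PMF on {1,...,4}^2 is hypothetical, and Y is drawn from the conditional distribution P_V(y)/P_{V^m}(x) restricted to the y with F^m(y) = x, which is how we read (11)):

```python
import random
from itertools import product

random.seed(2)
m, n = 3, 2

# Toy joint PMF on {1,...,4}^n, a hypothetical stand-in for P_{V_1,...,V_n}.
support = list(product(range(1, 5), repeat=n))
w = [1.0 / (sum(y) ** 2) for y in support]
Z = sum(w)
PV = {y: wi / Z for y, wi in zip(support, w)}

def Fm(y):
    """Eq. (8): quantize each coordinate at level m."""
    return tuple(yi if yi < m else m for yi in y)

# Eq. (9): PMF of (V_1^m, ..., V_n^m).
PVm = {}
for y, p in PV.items():
    PVm[Fm(y)] = PVm.get(Fm(y), 0.0) + p

def draw(pmf):
    xs = list(pmf)
    return random.choices(xs, weights=[pmf[x] for x in xs], k=1)[0]

def C1n(x):
    """Eqs. (10)-(11): identity when every symbol is below m; otherwise draw Y
    from P_V(y) / P_{V^m}(F^m(y)) over the y that quantize to x."""
    if all(xi < m for xi in x):
        return x
    cond = {y: PV[y] / PVm[x] for y in PV if Fm(y) == x}
    return draw(cond)

N = 100_000
freq = {}
for _ in range(N):
    u = C1n(draw(PVm))
    freq[u] = freq.get(u, 0) + 1

# Empirical distribution of (U_1, ..., U_n) matches P_V, e.g. at y = (4, 4):
y0 = (4, 4)
print(abs(freq.get(y0, 0) / N - PV[y0]) < 0.01)  # True
```

This is only a finite-support illustration of the mechanism; the open question in the text concerns property (b), not the validity of the pre-code itself.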
Regarding the convergence of the proposed sequence V^1, V^2, ..., given its structure it is intuitively expected that the source V^m approaches V as m tends to infinity. Therefore, it seems that R(V^m) tends to R(V) as m → ∞. Proving this proposition, however, is not simple. Nonetheless, it can be shown for the proposed sequence that R(V^m) is increasing in m and upper bounded by R(V), so it does have a limit. Proving that the limit is in fact R(V) remains an open problem for future research. Therefore, we end this section with the following conjecture.

Conjecture. Given an arbitrary channel W, the source-channel separation theorem holds for (G_∞, W) if it holds for (G_f, W).

4. CONCLUSION

A simple proof of the source-channel separation theorem for memoryless sources with a countably infinite alphabet and arbitrary channels was provided. In particular, it was shown that the separation theorem is readily obtained for memoryless sources with a countably infinite alphabet once it is proved for memoryless sources with a finite alphabet.

REFERENCES

1. Han, T.S., Joint Source-Channel Coding Revisited: Information-Spectrum Approach, arXiv:0712.2959v1 [cs.IT], 2007.
2. Shannon, C.E., A Mathematical Theory of Communication, Bell Syst. Tech. J., 1948, vol. 27, no. 3, pp. 379–423; no. 4, pp. 623–656.
3. Dobrushin, R.L., General Formulation of Shannon's Main Theorem in Information Theory, Uspekhi Mat. Nauk, 1959, vol. 14, no. 6, pp. 3–104 [Amer. Math. Soc. Transl., Ser. 2 (Engl. Transl.), 1963, vol. 33, pp. 323–438].
4. Pinsker, M.S., Informatsiya i informatsionnaya ustoichivost' sluchainykh velichin i protsessov, Probl. Peredachi Inf., issue 7, Moscow: Akad. Nauk SSSR, 1960. Translated under the title Information and Information Stability of Random Variables and Processes, San Francisco: Holden-Day, 1964.
5. Hu, G.D., On Shannon Theorem and Its Converse for Sequence of Communication Schemes in the Case of Abstract Random Variables, in Proc. 3rd Prague Conf. on Information Theory, Statistical Decision Functions, Random Processes, Liblice, Czechoslovakia, June 5–13, 1962, Prague: Czechoslovak Acad. Sci., 1964, pp. 285–333.
6. Vembu, S., Verdú, S., and Steinberg, Y., The Source-Channel Separation Theorem Revisited, IEEE Trans. Inform. Theory, 1995, vol. 41, no. 1, pp. 44–54.
7. Han, T.S., Information-Spectrum Methods in Information Theory, New York: Springer, 2003.
8. Cover, T.M. and Thomas, J.A., Elements of Information Theory, New York: Wiley, 2006, 2nd ed.
9. Ho, S.-W. and Verdú, S., On the Interplay between Conditional Entropy and Error Probability, IEEE Trans. Inform. Theory, 2010, vol. 56, no. 12, pp. 5930–5941.
10. Han, T.S. and Verdú, S., Approximation Theory of Output Statistics, IEEE Trans. Inform. Theory, 1993, vol. 39, no. 3, pp. 752–772.