Telecommun Syst (2012) 49:199–205 DOI 10.1007/s11235-010-9368-1
Network covert timing channel with distribution matching Guangjie Liu · Jiangtao Zhai · Yuewei Dai
Published online: 11 August 2010 © Springer Science+Business Media, LLC 2010
Abstract Network covert timing channel is a communication fashion that modifies the timing properties of network traffic to transfer secret information. It is designed to carry out the reliable and undetectable transmission. In this paper, a simple and secure covert timing channel method with distribution matching is proposed. The approach treats the network traffic as the flow with the fixed-length fragment, and calculates the histogram of the packet delays in each fragment. The message bits are modulated into the delays by the binary coding method, and the histogram is kept almost unchanged by assigning the matched distribution. The bit error rates are analyzed and two detection experiments are performed. The results show the proposed method is reliable and undetectable. Keywords Covert timing channel · Network steganography · Regularity test · Corrected conditional entropy
1 Introduction When a person wants to communicate with others securely and covertly, he should hide the data into the normal medium in order to evade the check from the censor or the corresponding devices. In past about twenty years, the technology using the common multimedia information or some electronic files has achieved rapid development. This kind of technology is commonly called digital steganography. While as we all know, the contrary steganlaysis technology G. Liu () · J. Zhai · Y. Dai School of Automation, Nanjing University of Science & Technology, Nanjing, China e-mail:
[email protected]
also develop quickly especially in recent years. In the game of steganography and anti-steganography, the key of hiders is not only to design more secure algorithms, but also to choose more appropriate cover objects. Steganography using static medium such as multimedia files and electronic documents will inevitably leave the determinate object to make the detection more convenient. The dynamic medium maybe the better choice to be taken as covers. It becomes very hard to judge the start and the end of steganography and get the accurate detection results in dynamic stream. From the viewpoint, network traffic is more suitable for steganography. Network steganography is to exploit the redundancy in network protocols or network communication itself to transfer information covertly. There are three types of data hiding mechanisms: hiding data into the payload of packets, hiding data into the header fields, and encoding date into the packets’ transmission time. The first two types are the covert storage channel, the last type is the covert timing channel.The basic concepts about storage and timing channels can be illustrated as: a covert storage channel involves the direct/indirect writing of object values by the sender and the direct/indirect reading of the object values by the receiver, while timing channels involve the sender signaling information by modulating the use of resources over time such that the receiver can observe it and decode the information. From the viewpoint of communication form, network steganography can also be divided into two types: network tunnel and network covert channel. In the network tunnel the messages are usually hidden into the payload, the packets mimic the legitimate allowable communication traffic such as HTTP, FTP, SSH and so on, the covert communication is exactly the overt one. For the network covert channel, the covert traffic is hidden in the overt traffic, and the normal communication realized by the overt traffic persists in the
200
entire process of covert communication. To get more information about the network covert channel, the reader can refer to the survey [1]. In this paper, we focus on the network timing channel. To date, many methods constructing and detecting network timing channels have been proposed. Padlipsky proposed a timing channel by transmitting or not in each time interval [2]. It is a typical binary on/off timing channel with one rate denoting zero and the other rate is chosen arbitrarily. In [3], Yao et al. gave more detailed analysis and categorize the on/off timing channel into two types: deterministic channels and non-deterministic channels, and calculating the maximum transmission rate of a non-deterministic channel based on the packet delay distribution. Girling proposed a covert timing channel and suggested to choose proper inter-packet delay to mitigate the noise distortion [4]. But, these two kinds of methods can not consider the synchronization problem in the communication. In [5], Cabuk et al. used a special start sequence at the begin of each message fragment transmission to solve the synchronization problem from the technical viewpoint, and also point out that the phase-locked loop can be used to make synchronization in the communication process. Berk et al. introduced to use the inter-packet of consecutive packets to construct the packet-timing channel [6]. They analyzed the channels with two delays and several delays, and developed a mechanism to pick the optimal symbol distribution in multi-rate channels given the channel characteristics. There are also some indirect realizations using timing characteristics. Hintz described an indirect timing channel [7]. The sender sends a large number of requests to a server or stays silent in each time interval, equivalent to one bit per time interval. The receiver periodically probes the server and measures the response time to recover the covert information. Shah et al. [8] developed a keyboard device, called JitterBug, to create a loosely-coupled covert channel capable of leaking information typed on a key-board over the network. Because the traffic of normal network communication has their own rules, while the introduction of covet timing channel will destroy it. So from the point, one can make the covert channel detection. Cabuk et al. [5, 12] proposed the detection method by defining two measures to judge whether the traffic is the covert timing channel. One measure is regularity, the other is ε-similarity. Berk et al. [6] proposed methods for detecting inter-packet delay channels using two delays (binary channels) or multiple delays (multi-symbol channels). For binary channels the authors developed a simple detection method based on statistical analysis of the inter-arrival times. For covert channels the inter-arrival time histogram has two distinct spikes, and the mean inter-arrival time is between the spikes and has a very low packet count. For multi-symbol channels Berk et al. argued that a skilled
G. Liu et al.
covert sender would pick a delay distribution (symbol distribution) that maximizes the capacity. So the optimal distribution can be estimated and used to detect the presence of the covert channel. But in [2], Zander thought that for the multi-symbol channel, the detection method of Berk might be impractical. Recently, Gianvecchio [9] proposed an entropy-based detection method, The approach is based on the observation that the creation of a covert timing channel has certain effects on the entropy of the original process, and hence, a change in the entropy of a process provides a critical clue for covert timing channel detection, in Sect. 4, we will give a detailed discussion about it. According the above statement about covert timing channel detection, we can find that how to design more undetectable covet timing channel has become a challenge for designer. Gianvecchio [10] proposed a model-based covert timing channel, which exploits the statistical properties of legitimate network to evade detection in an effective manner. The work can be viewed as the first attempt to construct undetectable covet timing channel. Recently, Sellke [11] also proposed an undetectable covert timing channel. He used the Pareto distribution to describe the statistical characteristics and used a PRNG to mimic the Telnet distribution. In this paper, we proposed a simple and efficient binary covert timing channel based on Gianvecchio’s framework, and make analysis about the error bits ratio and evaluate the undetectability. The remainder of the paper is organized as follows. Section 2 introduces the model-based covert timing channel and the corresponding analysis. Section 3 gives our new algorithm. Section 4 makes the experiments about the detection resistance. Section 5 concludes the whole paper.
2 Review of model-based timing channel Because most of covert timing channels detection methods are based on statistics, the principle of [10] is to use the statistical distribution of normal traffic and perform encoding to fit it. In the section, we will review the model-based timing channel briefly and give the analysis about it. Overall, the framework of Gianvecchio’s method can be described as Fig. 1. The network traffic is filtered and analyzed in the beginning two boxes, then the message bits and the random number is input into the encoder module to generate the covert IP delays, then the cached packets are sent as those delays to produce the covert traffic. Because most of network traffics are all non-stationary, the framework processes the traffic fragment by fragment. The filter is used to pick the proper traffic by the network application and the source and destination IP addresses. For the filtered 100 packets, the analyzer computes the most suitable distribution model from exponential, gamma, Pareto,
Network covert timing channel with distribution matching
201
lognormal Poisson and Weibull by the maximum likelihood estimation. The encoder uses the model to generate the normal-like delays. The inputs of the encoder are the model type, model parameters, message bits, and random number. The outputs are the encoded delay times fitting for the model. Assuming the channel is N -symbol, the message needing to be encoded is s, the continuazation function is defined as (1) s Fcon (s) = (1) + r mod 1 = rs N Here r is the random number shared between the sender and receiver generated by the accessed packet domain such as TCP sequence number. The continuazation function convert the message into the corresponding random number rs belonging to [0, 1]. The corresponding discretization function is: Fdis (rs ) = N · ((rs − r) mod 1) = s˜
(2)
−1 The inverse distribution function Fmod takes a Uniform(0, 1) random number as input and generates a random variable from the selected model as output. The sequence of transformed random numbers rs1 , rs2 , . . . , rsn is input into the inverse distribution function to create random covert interpacket delays ds1 , ds2 , . . . , dsn . The total encoding function can be formulated as: −1 ds = Fenc (s) = Fmod ◦ Fcon (˜s )
(3)
The decoding is the inverse process of the encoding, which can be written as: s = Fdec (ds ) = Fdis ◦ Fmod (ds )
(4)
In [9], the authors gave the detailed analysis about the model-based approach including theoretical and empirical capacity and bit error rates results, and the undetectability Fig. 1 Framework of Model-based covert timing channel
Fig. 2 Covert timing channel encoding process
results under the shape testing and the regularity tests. From these results, the model-based method work soundly. But, we find despite that the author argued that the filter and the analyzer can work both on-line and off-line. On the on-line mode, the model type and parameters are generated based on the real-time datum; otherwise, those are given based on the recoded traffic datum. But, in fact, only on-line mode is suitable for the application to resist the statistical analysis. In on-line mode, for 100 packets, the model type and parameters is as large as about 40 ∼ 72 bits. Different from the embedded message bits, those important parameters bits should be keep not wrong with a single bit, because even one bit decoded error, the whole fragment message bits will become wrong. So we think it is a very key element that will restrict the application of the model-based covert timing channel method. Another element that will effect the influence the practical effect is the high bit error rate, according to the results provide in [10], when one packet delay carries 8 bits, the empirical bit error rate is as high as 40%, which is nearly equal to guess the received bits (50% under randomly guessing). So in this paper, we proposed a more simple and reliable covert timing channel with distribution matching characteristics.
3 The proposed method 3.1 Encoding and decoding The whole encoding process of our proposed method can be described as Fig. 2. Firstly, compress and encrypt the message bits for encoding. Secondly, filter the overt traffic to get the assigned packet stream for encoding. Thirdly, collect N packets from the assigned stream, and record the time delays.
202
G. Liu et al.
Fourthly, make the histogram of the recoded N −1 delays into L bins, obtain the center line by (5), and get the corresponding center value of each bin as c1 , c2 , . . . cα ∗ , . . . , cL . α L α ∗ = arg min H (i) − H (i) α i=1
(5)
i=α+1
Fifthly, encode the message bits into N − 1 time delays according to the below equation.
d=
⎧ ci , ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ cj , ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩
b = 0 and
∗ R · αk=1 H (k)
∈ [ ik=1 H (k), i+1 k=1 H (k)] b = 1 and
R· L k=α ∗ +1 H (k)
j
j +1 ∈ [ k=α ∗ +1 H (k), k=α ∗ +1 H (k)]
(6)
Here, R is a random number obey uniform distribution on [0, 1]. Equation (6) keep that the distribution of the regenerated time delays is the same as that of the overt traffic in the histogram with L bins meanings, which means the distribution matching. It should be noticed that the premise is the number of zeros and ones in message bits must be equal mostly. Because the message bits is compressed and encrypted beforehand, the premise is not hard to get satisfied. Finally, the transmitter sends the N packet by the encoded ds. When the decoder receive those N − 1 delays, he just needs to make the statistics to get the histogram H and center line β ∗ , when the received delay is less than β ∗ , zero is decoded, otherwise, one is decoded. 3.2 Analysis of bit error rates Because the designed covert timing channel is a binary one, so the bit error will only occur when the center lines moves such as Fig. 3 shows. There maybe several reasons to influence the position of the center line. Firstly, as we discussed above, the premise of right decoding is the balance of zeros and ones. So when the numbers of zeros and ones are not equal, it will cause other kind of center line shifting. Assume this kind of shifting is q. Secondly, the encoding equation (6) just can keep the distribution matching in the probability meanings. For the little samples, the empirical bias will also cause another center line shifting. Assume this kind of shifting is p. Thirdly, the center line shifting come from the noise caused by the computer processing, router transshipping and other effect from network and terminal. Assume this kind of shifting is w.
Fig. 3 Histogram of time delays
Fig. 4 Bit error rates under Gaussian noise
According to the above discussion, the total bit error rates can be computed as
max(α ∗ ,α ∗ +q+p+w) BER =
min(α ∗ ,α ∗ +q+p+w)
N −1
H (k) (7)
Supposing the distortion from different aspects can be modeled as a normal white noise with zero mean. Figure 4 gives the experimental result under the assumption. In our experiment, we use the traffic generated by the file uploading on bbs.njust.edu.cn. The terminal and server both locate in the campus network of the Nanjing University of Science and Technology. There are 200,000 HTTP packets are used in our experiments. From the figure, we can see that with the increase of the standard deviation, the bit error rates increase too. And for N = 100, L = 10 or N = 200, L = 40, the level of bit error rates is basically identical. It is to say that the bit error rates are only dependent on the noise strength. When the standard deviation is as high as 100 ms, the BER is only about 25%. It shows that the proposed method is robust to the channel noise. For guaranteeing the correctness of the covert timing channel communication, we suggest that the users choose some kind of error correcting codes (ECC) to encode the message bits.
Network covert timing channel with distribution matching
203
4 Detection resistance 4.1 Detection methods 4.1.1 Regularity measure The regularity detection method proposed in [9] is to determine whether the variance of the inter-packet delays is relatively constant or not. The traffic is separated into nonoverlapping windows with size W packets. For each window i, the standard deviation σi of delays is computed. For each two windows pair i, j (i < j ), the standard deviation of the pairwise differences are calculated as (8). The standard deviation is defined as the regularity of the traffic. |σi − σj | , i < j, ∀i, j (8) regularity = STDEV σi 4.1.2 Corrected conditional entropy measure The entropy detection method is based on the observation that the creation of a covert timing channel has certain effects on the entropy of the legitimate traffic. In fact, the entropy rate can also be looked as a measure of complexity or regularity of the network traffic. Because the exact entropy rate cannot be computed for finite delays, the entropy EN and conditional entropy CE are usually estimated from the empirical probability density functions, i.e. histogram with some chosen binning strategy. Gianvecchio proposed to use the corrected conditional entropy (CCE) to describe the regularity. The corrected conditional entropy is defined as (9). CCE(Xm |Xm−1 ) = CE(Xm |Xm−1 ) + perc(Xm ) · EN(X1 ), (9) where perc(Xm ) is the percentage of unique patterns (bin number) of length m and EN(X1 ) is the entropy with m fixed at 1 or the first-order entropy. The entropy rate is estimate to find the minimum of the corrected conditional entropy over different values of m. In this paper, the binning number is chosen as 5 according to [10]. 4.2 Experimental setup In almost all office network scenario, firewalls and IDSs are necessary to protect the network from malicious attacks or un-allowed communication, and HTTP connect is always allowable in the scenario. Moreover, in some networks, the proxy servers are used to provide HTTP applications. No matter what, HTTP packet is most popular allowable one in networks. So HTTP traffic is very suitable for being taken as cover traffic. In our experiment, we use the traffic generated by the file uploading on bbs.njust.edu.cn. The terminal
Fig. 5 Shape testing
and server both locate in the campus network of the Nanjing University of Science and Technology. For evaluating the detection resistance ability of our method, we collect seven datasets, including: Legitimate HTTP: 200,000 HTTP packets IPCTC: 200,000 HTTP packets DMCTC with N = 100, L = 14: 200,000 HTTP packets DMCTC with N = 100, L = 20: 200,000 HTTP packets DMCTC with N = 100, L = 30: 200,000 HTTP packets DMCTC with N = 200, L = 40: 200,000 HTTP packets DMCTC with N = 200, L = 60: 200,000 HTTP packets Here, IPCTC is abbreviation of IP covert timing channel, which transmits a 1-bit by sending a packet during an interval and transmits a 0-bit by not sending a packet during an interval. In our experiment, the interval is set to a fixed value 50 ms. DMCTC is our proposed method with N being fragment size and L being binning number. 4.3 Experimental results For evaluate the performance of our method, we perform several statistical tests. Figure 5 gives the mean histogram shape testing on 2000 packets. In the figure, no-filled bars are histogram of the legitimate HTTP, forward-diagonalhatched bars are that of covert traffic with N equal to 100, L equal to 20, backward-diagonal-hatched bars are covert traffic with N equal to 200, L equal to 40. We can see the proposed method can achieve better distribution matching. The network covert timing channel is performed on all 200,000 packets, the CCE is computed on each 2000 packets, Fig. 6 is the detection results of mean value of CCE. It can be seen that the entropy value of our proposed algorithm is close to that of the legitimate HTTP traffic. For comparison, the experimental result of IPCTC proposed is also given. It can be seen that CCE of IPCTC is very low comparing with that of the legitimate HTTP. In regularity testing, we compute 100 times for 2000 packets under three window sizes 50, 100, and 200. The
204
G. Liu et al.
should be mentioned, our method is just binary covert channel, the transmission efficiency is low. In the same framework, to extend binary coding to multiple coding and keep the undetectability is our future work. Acknowledgement The work was supported by the China Post Doctor Foundation of China (Grant No. 20070421017), NSF of Jiangsu Province(Grand No. BK2008403) and Post Doctor Foundation of Jiangsu Province(Grant No. 0801044B).
References Fig. 6 Entropy detection results Table 1 Regularity test results Traffic
Means of regularity W = 50
W = 100
W = 200
54.29
45.66
31.50
N = 100, L = 14
57.27
28.56
25.67
N = 100, L = 20
54.53
27.95
25.94
N = 100, L = 30
48.61
30.48
27.39
N = 200, L = 40
51.90
25.32
23.43
N = 200, L = 60
46.36
24.83
22.22
Legitimate HTTP DMCTC
IPCTC
0.14
0.062
0.047
means of regularity are listed in Table 1. If the regularity is small, the traffic is highly regular, indicating a potential covert timing channel. From Table 1, we can see that the covert traffic generated by our proposed algorithm has higher regularity value, while that of IPCTC is very low comparing with the legitimate HTTP traffic.
5 Conclusions How to design an undetectable, reliable and high-capacity covert timing channel is very interesting question. In this paper, we design a reliable and undetectable covert timing channel with distribution matching. The approach processes the traffic as fixed-length fragment and obtains the histogram of the delays, then use the binary coding method to embed the message bits. Experimental results show that our method has lower bit error rates and are undetectable against the distribution shape testing, regularity-based detection method and entropy-based detection method. But, it
1. Zander, S., Armitage, G., & Branch, P. (2007). Covert channels and countermeasures in computer network protocols. IEEE Communications Magazine, 45(12), 136–142. 2. Padlipsky, M. A., Snow, D. W., & Karger, P. A. (1978). Limitations of end-to-end encryption in secure computer networks. Tech. Rep. ESD-TR-78-158, Mitre Corporation. http://stinet.dtic.mil/ cgi-bin/GetTRDoc?AD=A059221&Location=U2&doc= GetTRDoc.pdf. Accessed 1 November 2009. 3. Yao, L. H., Zi, X. C., Pan, L., & Li, J. H. (2009). A study of on/off timing channel based on packet delay distribution. Computer Security. doi:10.1016/j.cose.2009.05.006. 4. Giffin, J., Greenstadt, R., Litwack, P., & Tibbetts, R. (2003). Covert messaging through TCP timestamps. In Lecture notes in computer science: Vol. 2482. Proceedings of privacy enhancing technologies workshop (pp. 194–208). Berlin: Springer. 5. Cabuk, S., Brodley, C. E., & Shields, C. (2004). IP covert timing channels: design and detection. In Proceedings of 11th ACM conf. computer and communications security (pp. 178–87). 6. Berk, V., Giani, A., & Cybenko, G. (2005). Detection of covert channel encoding in network packet delays. Tech. Rep. TR2005536, Department of Computer Science, Dartmouth College. http://www.ists.dartmouth.edu/library/149.pdf. Accessed 1 November 2009. 7. Hintz, A. (2003). Covert channels in TCP and IP headers. http://www.defcon.org/images/defcon-10/dc-10-presentations/ dc10-hintz-covert.ppt. Accessed 1 November 2009. 8. Shah, G., Molina, A., & Blaze, M. (2006). Keyboards and covert channels. In Proceedings of the 15th USENIX security symposium (p. 5). 9. Gianvecchio, S., & Wang, H. (2007). Detecting covert timing channels: an entropy-based approach. In Proceedings of the 14th ACM conference on computer and communications security (pp. 307–316). 10. Gianvecchio, S., Wang, H., Wijesekera, D., & Jajodia, S. (2008). Model-based covert timing channels: automated modeling and evasion. In Proceedings of recent advances in intrusion detection (RAID) symposium (pp. 211–230). 11. Sellke, S. H., Wang, C. C., Bagchi, S., & Shroff, N. (2009). Covert TCP/IP timing channels: theory to implementation. In Proceedings of the 28th conference on computer communications (INFOCOM). http://www.stat.purdue.edu/~ssellke/publications/ covertTC.pdf. Accessed 1 November 2009. 12. Cabuk, S., Erodley, C. E., & Shields, C. (2009). IP covert channel detection. ACM Transactions on Information and System Security, 12(4), 1–29.
Network covert timing channel with distribution matching Guangjie Liu received the B.E. and M.E. from Nanjing University of Science and Technology in 1998, 2002 respectively. He now is a Lecture with Automation School of Nanjing University of Science and Technology. He has contributed more than 40 refereed papers covering topics of network and multimedia steganography.
Jiangtao Zhai received the M.E. from Nanjing University of Science and Technology in 2008. He now is candidate for PHD. with Nanjing University of Science and Technology. His research field is the network security including network steganography, covert channel detection and elimination.
205 Yuewei Dai received the M.E., and Dr. Eng. degrees from Nanjing University of Science and Technology in 1987 and 2002 respectively. He now is a professor with Automation School of Nanjing University of Science and Technology. He has contributed more than 100 refereed papers including automation control, information technology and information security. He is also the director of System Engineering Academy of Jiangsu Province in China.