J Supercomput DOI 10.1007/s11227-014-1255-1
Modern network traffic modeling based on binomial multiplicative cascades Jeferson Wilian de Godoy Stênico · Lee Luan Ling
© Springer Science+Business Media New York 2014
Abstract In this paper we present a new multifractal approach for modern network traffic modeling. The proposed method is based on a novel construction scheme of conservative multiplicative cascades. We show that the proposed model can faithfully capture some main characteristics (scaling function and moment factor) of multifractal processes. For this new network traffic model, we also explicitly derive analytical expressions for the mean and variance of the corresponding network traffic process and show that its autocorrelation function exhibits long-range dependent characteristics. Finally, we evaluate the performance of our model by testing both real wired and wireless traffic traces, comparing the obtained results with those provided by other well-known traffic models reported in the literature. We found that the proposed model is simple and capable of accurately representing network traffic traces with multifractal characteristics. Keywords Multiplicative cascades · Multifractal processes · Network traffic modeling 1 Introduction A robust and efficient traffic model should be able to capture most important features of nowadays network traffic flows, i.e., it is desirable that the model has the ability to accurately characterize the most distinctive behaviors existing in real traffic traces,
J. W. de Godoy Stênico (B) · L. L. Ling School of Electrical and Computer Engineering, State University of Campinas-Unicamp, Av. Albert Einstein 400, Campinas, SP, Brazil e-mail:
[email protected] L. L. Ling e-mail:
[email protected]
123
J. W. de Godoy Stênico, L. L. Ling
being easy to be understood and applied to study the effects of the model parameters in terms of network performance. The finding of self-similar properties of network traffic [1], which cannot be adequately described by classical models such as Markov and Poisson models, was immediately followed by the development of fractal models of traffic. The fractional Brownian motion—fBm and its increment process, called fractional Gaussian noise—fGn, are the stochastic processes that have been successfully and widely used to model the self-similar or monofractal traffic. In spite of the popularity, this self-similar and monofractal model has failed to faithfully describe the traffic flows that carry multiscaling or multifractal characteristics. Some more elaborated traffic models have been proposed, mostly based on the multifractal analysis. Among the existing multifractal models, the most representative ones are the following. The Multifractal Wavelet model (MWM) is a model based on wavelet transform using the Haar wavelet [2]. The Adaptive Wavelet-Based Multifractal Model (AWMM) is one that simplifies the synthesizing process of the MWM model, by incorporating some additional parameter information that describes the corresponding scaling function and moment factor of the multifractal process [3]. The Variable Variance Gaussian Multiplier Model (VVGM) is a modeling version that takes into account the Gaussian nature of the multiplier distributions used to generate binomial multiplicative cascades [4]. The Variable-Scale parameter Cauchy Multiplier (VSCM), a multiplicative cascade model structurally similar to VVGM, replaces Gaussian-based multipliers by those with Cauchy distribution [5]. The multifractional Brownian motion (mBm) [6] is a traffic model that generalizes the definition of fractional Brownian motion with the scaling exponent H no longer constant but time dependent [7]. In this paper, we present a new construction scheme for multiplicative cascades, to achieve new network traffic models capable of capturing most major multifractal properties represented by the scaling function and the moment factor estimated from the corresponding real network traffic traces. For robust and realistic stochastic traffic modeling, multiplicative cascades have to preserve both random and mass conservative structural natures. The paper is organized as follows. In Sect. 2, we present a brief description of the construction procedure of the classic binomial multiplicative cascade well known in the literature, and then the new binomial multiplicative cascade. In Sect. 3 we outline the framework of our multifractal traffic model. Section 4 is dedicated to the derivation of the statistical parameters of the multifractal traffic model based on the proposed conservative, multiplicative binomial cascade modeling. More precisely, for a given traffic process, we derive the analytical expressions for its mean, variance and autocorrelation function. In Sect. 5 we define a global scaling parameter similar to the Hurst parameter of a monofractal process. Section 6 is devoted to the proposed model validation and experimental investigations. Using extracted statistical values from real network traffic traces, we build and evaluate the new traffic modeling approach by comparing its performance with that given by the classic binomial cascade models [4,5]. We also make use of the proposed traffic model to generate synthetic traffic traces for the purpose of model validation by simulation. Finally in Sect. 7, we present our conclusions.
123
Modern network traffic modeling
2 Multiplicative cascades Definition 1 A multiplicative cascade is an iterative process that fragments a given set into smaller and smaller pieces according to some geometric rule and, at the same time, distributes the total mass of given set according to another rule. Multiplicative cascade models are mathematical constructs appropriate to capture the intermittent and highly irregular behavior. The multiplicative cascades were initially proposed by Kolmogorov for turbulence modeling [8]. Currently the multiplicative cascade model has found applications in several areas that need to describe non-linear phenomena which have multiplicative structure, such as traffic modeling [3–5], stock exchange rates [9], geophysical phenomena [10], DNA evolution [11], and others. 2.1 Classic binomial conservative multiplicative cascade The binomial cascade here presented is the simplest method of obtaining a multifractal process, consisting of an iterative procedure in the compact interval [0,1]. It is assumed that the initial measure is preserved at all stages. Without losing the generality, we assume the unit initial measure. Let m 0 = r and m 1 = 1 − r , two multipliers for cascade generation, possibly with random r . At stage n = 0 of the cascade iteration, we have the unit measure denoted by μ0 uniformly distributed on interval I0 = [0, 1]. At stage n = 1, this unit measure is divided into two parts, with measures m 0 and m 1 , and assigned to two subintervals, I00 = [0, 1/2] and I01 = [1/2, 1], respectively. At the next stage (n = 2), this process is repeated on both the set I00 and I01 to generate four even smaller subintervals I000 , I001 , I010 and I011 , each one allocated with, respectively, the following measures: μ[0, 1/4] = m 0 m 0 μ[1/4, 1/2] = m 0 m 1 μ[1/2, 3/4] = m 1 m 0 μ[3/4, 1] = m 1 m 1 .
(1)
Through this iterative process, at stage k, there are in total 2k disjoint dyadic subintervals of type [t, t + 2−k ] with their corresponding measures μ s. Let ϕ0 and ϕ1 denote the relative frequencies of 0 s and 1 s, respectively, in the cascade development. The measure μ in the dyadic interval [t, t + 2−k ] is given by: μ[t, t + 2−k ] = μ[k ] = m 0 0 m 1 1 . kϕ
kϕ
(2)
Notice that this cascade construction procedure preserves the total mass of each dyadic interval, therefore resulting in so-called conservative cascades [12] (Fig. 1). 2.2 The proposed binomial conservative cascade The iterative procedure for the construction of the new binomial conservative multiplicative cascade is based on the multipliers determined by the Newton Binomial expression.
123
J. W. de Godoy Stênico, L. L. Ling
Fig. 1 Classic conservative binomial cascade construction
The cascade construction process is as follows. Let I denote the unit interval [0, 1]. Let x be a real valued random variable with uniform distribution on the interval [0, 1]. At the N th stage the interval I is divided into 2 N subintervals each holding a measure proportional to a numerical quantity determined by the following Newton binomial expression:
2N k
(x)2
N −k
(1 − x)k .
(3)
In other words, at stage N, we apply the following weighting factor for the first subinterval: N
W00 . . . 0 = (x)2 + (1 − x)2
N
(4)
N digits
while for the remained subintervals the weighting factors are: Wb1 b2 ...b N =
2N i
(x)2
N −i
(1 − x)i |i=1,...,2 N −1 .
(5)
where b1 b2 . . . b N is the binary representation of integer number i. For instance, the measure allotted to the first interval [0,1/2] of the stage N = 1 is: W0 = μ1 [0, 1/2] = (x0 )2 + (1 − x0 )2 while the measure for the second interval is: 2 W1 = μ1 [1/2, 1] = (x0 )(1 − x0 ). 1
123
(6)
(7)
Modern network traffic modeling
For N = 2, the second stage of the cascade iteration, we have the following rules. The measure allotted to the first interval is equal to: μ00 [0, 1/4] = μ1 [0, 1/2]W00 2
2
= μ1 [0, 1/2]((x1 )2 + (1 − x1 )2 ).
(8)
The measure for the second, third and fourth intervals are, respectively, μ01 [1/4, 1/2] = μ1 [0, 1/2]W01 2 2 2 (x1 )2 −1 (1 − x1 )1 , = μ1 [0, 1/2] 1 μ10 [1/2, 3/4] = μ1 [1/2, 1]W10 2 2 2 (x1 )2 −2 (1 − x1 )2 = μ1 [1/2, 1] 2
(9)
(10)
and μ10 [3/4, 1] = μ1 [1/2, 1]W11 2 2 22 −3 3 (x1 ) = μ1 [1/2, 1] (1 − x1 ) . 3
(11)
This iterative process is repeated and interesting enough, the total measure at each stage is preserved in expectation. Considering the kth stage of the cascade, the mass measure of each subinterval of the kth stage is derived from weighing the mass measure of the subintervals of the previous (k − 1) stage. In other words, the measure of first interval Ik is equal to: μ[0, 2−k ] = μ[0, 2−k+1 ]W00 . . . 0 k digits
= μ[0, 2
−k+1
k
k
][(xk−1 )2 + (1 − xk−1 )2 ].
(12)
For the other intervals of the kth stage, we have: μ[Ik ] = μ[Ik−1 ]Wb1 b2 ...bk k k 2 (xk−1 )2 −i (1 − xk−1 )i |i=1,...,2k −1 = μ[Ik−1 ] i
(13)
where b1 b2 . . . bk is the binary representation of the integer number i. Notice that random variables x0 x1 , x2 , x 3 , . . . are i.i.d. with uniform distribution on [0, 1]. k bk 2−i , let R(b1 . . . bk ) = Wb1 b2 ...bk , representing For t = 0.b1 b2 . . . bk = i=1 the new incorporated random multiplier member for the subinterval [t, t + tk ] at the kth stage. Therefore the measure on the [t, t + tk ] interval can be expressed as:
123
J. W. de Godoy Stênico, L. L. Ling
Stage 0
2.5
Stage 1
1 2 0.8 1.5 0.6 1
0.4
0.5
0.2 0
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
1
0
Stage 2
3
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
1
Stage 3 4 3.5
2.5
3 2
2.5
1.5
2 1.5
1
1 0.5 0
0.5 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
1
Stage 4
6
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
1
Stage 5 7 6
5
5
4
4 3 3 2
2
1 0
1 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
1
0
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
1
Fig. 2 Stages 0–5 of a conservative cascade construction
μ(tk ) = μ[t, t + tk ] = R(b1 )R(b1 b2 ) . . . R(b1 . . . bk ).
(14)
For illustration purpose, in Fig. 2 we represent a construction pattern sample of a new random conservative binomial cascade with its five initial stages proposed in this work.
123
Modern network traffic modeling
3 A multifractal traffic model Definition 2 A stochastic process X (t) is called multifractal if it has stationary increments and satisfies: E(|X (t)|q ) = c(q)t τ (q)+1 = c(q)t τ0 (q)
(15)
for some positive values q ∈ Q, [0, 1] ⊆ Q, τ (q) (scaling function) and c (q) (moment factor) are functions on domain Q and are independent of t. The function τ (q), also known as the partition function, is concave with τ (0) = −1 [13]. The proposed multifractal model aims to capture easily the scaling function and the moment factor. According to the arguments given in [3] and [14], a multifractal process characterized by these two parameter functions can be achieved by the product of a cascade and i.i.d. positive random variable Y . More specifically, a multifractal traffic process model X (t) can be interpreted as the product of the peak rate of the flow Y and the measure of burstiness μ(t N ) at the modeled time scale t N . The variable Y is chosen to be independent of the cascade measure μ(t N ). The choice is directly of conservative cascade properties [12]. The measure μ(t N ) has a small value since it can be viewed as the product of N multipliers [see Eq. (14)], however for the conservative cascades the measure does not reduce to Eq. (14), but instead takes the form: μ(tk ) = Y (b1 . . . bk )R(b1 )R(b1 b2 ) . . . R(b1 . . . bk ).
(16)
Therefore, Y is the fixed point of the operation of randomly weighted averaging using as weights the random quantities. Successive R and Y are independent, and all the R (or Y ) are identically distributed. Thus, the obtained series, denoted by X (t N ), satisfies: τ (q)
E((X t N )q ) = E(Y q )E(μ(t N )q ) = E(Y q )t N0
(17)
Interpreting Eq. (17) through the definition of multifractal processes given by Eq. (15), it can be shown that R and Y should be related with τ (q) and c (q) as the following:
−log2 (E(R)q ) = τ0 (q)
(18)
E(Y q ) = c(q) and Eq. (17) becomes: E(X (t0 )q ) = E(Y q )2 N [q+log2 E(R
q )]
−log2 E(R q )
t0
(19)
where t0 denotes the unit time interval of the data traffic to be modeled. The variables R and Y must now be chosen so as to satisfy the following system of equations:
123
J. W. de Godoy Stênico, L. L. Ling
−log2 (E(R)q ) = τ0 (q) logE(Y q ) = logc(q) − (q + log2 (E(R q ))N log2)
(20)
The scaling function τ (q) can be accurately modeled by assuming that R is a random variable on [0,1] with a beta distribution, Beta(α, β). Thus, the function τ0 (q) := τ (q) + 1 related to the scaling function τ (q) can be written as: τ0 (q) = log2
(α + β)(α + q) (α)(α + β + q)
(21)
where (.) denotes the Gamma function. In [15] conjectured and in [16] proved that the random variable Y has a lognormal distribution defined by its two parameters, m and v. Therefore, the qth moment of Y 2 2 is given by E(Y q ) = emq+v q /2 . For the equation system specified by (20), we have: mq + v 2 q 2 /2 = logc(q) − (q + log2 (E(R q ))N log2).
(22)
Analytically Eq. (22) allows us to express the moment factor as:
c(q) = e
mq+v 2 q 2 /2 N
2
log2 (α+β)(α+q) (α)(α+β+q)
.
(23)
Analyzing Eqs. (21) and (23), one can notice that the proposed multifractal model is fully characterized by a set of four parameters (α, β, m, v). 4 Statistical properties Statistical properties of multiplicative cascades have been widely studied in a number of papers [17,18]. In this work, we establish some of those statistical properties in function of a set of 2 N synthetic samples under our cascade multifractal modeling type by assuming: X (t0 ) = 2n .Y.R(b1 )R(b1 b2 ) . . . R(b1 . . . bk ) for N 1
(24)
Lemma 1 The mean of process X(t0 ) is given by: E[X (t0 )] = E[Y ] = em+v
2 /2
(25)
where t0 denoting the unit time interval of traces to be modeled. Proof see Appendix A. Lemma 2 The variance of the model is given by: var[X (t 0 )] = E[Y 2 ]22N E[R 2 ] N − E[X(t 0 )]2 (α + β)(α + β + 1) N 2 2 = e2m+2v . 22N . − e2m+v (α + 1)α
123
(26)
Modern network traffic modeling
Proof see Appendix B. Lemma 3 X (t0 ) has a long-range dependent correlation structure. The covariance cov[X (t0 )n , X (t0 )n+k ] can be estimated as follows. cov[X (t0 )n , X (t0 )n+k ]
= E(Y )2 22N E[μ(t N )n .μ (t N )n+k − 1
(27)
where k = 2 p , p = 1, 2, 3, . . .. The two measures μ(t N )n and μ(t N )n+k have the same origin atstage N − p − N 1, denoted by μ(t N − p−1 ), thus μ(t N )n = μ(t N ) N − p−1.r N − p i=N − p+1 ri, j1 N and μ(t N )n = μ(t N ) N − p−1. (1 − r N − p ) i=N − p+1 ri, j2 where ri, j represents the actual multiplier values at stage i. Then E[μ(t N )n .μ(t N )n+k ] = E μ(t N − p−1 )2 ]E[r N − p (1 − r N − p ) ⎤ ⎡ N ri, j1 , ri, j2 ⎦ = E(R 2 ) N − p−1 [1/2 − E(R 2 )](1/2)2 p E⎣ i=N − p+1
Inserting the last expression into Eq. (27) we get: cov = e
2m+v 2
22 α(α + 1) (α + β)(α + β + 1)
N − p
(α + β)(α + β + 1) 2 − 1 − e2m+v 2α(α + 1) (28)
5 Global parameter HEG An alternative way to characterize a multifractal process is through its local Hölder exponent [19]. However, for simplicity, the Hurst exponent has been frequently used in queuing analysis, characterizing the global scaling behavior of traffic models even for multifractal traffic arrivals. In this subsection we derive the so-called Global Parameter (HEG ), a global scaling parameter for the proposed multifractal traffic model, based on the underlying theory for self-similar processes [20]. We intend to use this global scaling parameter in future for effective bandwidth estimation and dynamic bandwidth allocation. Let X (t) be a self-similar process with Hurst parameter H , with zero mean and variance σ 2 so that the following relation is valid: x(t) = a −H X (at). d
(29)
Consider the corresponding increment process of X (t), Y (t) = X (t) − X (t − 1). The long-range dependence (LRD) of Y (t) can be observed through the analysis of its
123
J. W. de Godoy Stênico, L. L. Ling
covariance. Notice that the aggregate Y m = n1 km (k−1)m+1 Y (k) of the process Y (t) is also preserve the same self-similar property, that is: Y m = m H −1 Y (1). d
(30)
The aggregate process Y m has zero mean and its variance can be written in function of the aggregation parameter m. Expressing in logarithm, we have: log{var [Y m ]} = (2H − 2)log{m} + log{σ 2 }
(31)
Using Eq. (31), we define a global scaling parameter (HEG ) for multifractal processes. The derivation of HEG is organized in form of a lemma and a proposition as follows. Lemma 4 Let X (k) be a multifractal process with parameters (α, β, m, v). The variance V ar [X m ] of the X (k) aggregate process for m = 2k can be determined by the following equation: var[X m ] = e2m+2v
2
α(α + 1) (α + β)(α + β + 1)
N −k
2
− e2m+v 22k−2N
(32)
Proof see Appendix C. Proposition 1 Let X (k) be a multifractal process with the parameter vector (α, β, m, v). The global parameter is given by: 1 H E G = 1 − log2 2
α(α + 1) (α + β)(α + β + 1)
(33)
Proof Lemma 4 describes the behavior of the variance Var[X m ] of the X (k) aggregate process in function of aggregation parameter m. When the stage number N in the 2 generation of the proposed cascade becomes large, the last term (e2m+v 22k−2N ) of Eq. (32) tends to zero and can be ignored. Figure 3 shows the behavior of the ratio 2
e2m+2v
e2m+v 22k−2N
N −k
2
α(α+1) (α+β)(α+β+1)
by a plotted curve in function of N , using a real traffic trace, 2
lbl_tcp_3 [21]. Notice that term e2m+v 22k−2N becomes negligible for large N . Thus applying a logarithm operator on both sides of Eq. (32) we have:
α(α + 1) log2 {Var[X ]} = log2 {e } + log2 (α + β)(α + β − 1) −log2 m α(α + 1) + log2 . (α + β)(α + β + 1) m
123
2m+2v 2
N
(34)
Modern network traffic modeling 0
10
-20
10
-40
10
-60
10
-80
10
0
1
10
2
10
10
N 2
Fig. 3
e2m+v 2k−2N for different values of N 2 α(α+1) e2m+2v ( (α+β)(α+β+1) ) N −k
Comparing (34) with (31), we find that there is a correspondence between the following two terms. log{m}(2H − 2) ∼ −log{m}log2
α(α + 1) . (α + β)(α + β + 1)
(35)
Therefore HEG
1 H = 1 − log2 2
α(α + 1) . (α + β)(α + β + 1)
(36)
As a result, we can define a global parameter H E G for multifractal traffic processes, similar to the Hurst parameter H in the case of monofractals. 6 Experimental investigation and analysis In this section, we examine the efficiency of the proposed model in generation of synthetic multifractal traffic traces. The model’s parameters are calculated in terms of the estimated parameters from real traffic traces. For illustration, arbitrarily we chose the wireless traffic trace collected during the 62nd Internet Engineering Task Force (IETF) meeting [22]. Namely, we call this traffic data “Traf_IETF”, and a TCP/IP traffic trace called “lbl_tcp_3” collected in [21]. 6.1 Analysis of synthetic traffic Our first step of model validation consists of comparison of two traffic statistic parameters, the mean and variance values. Table 1 compares the means and variances values
123
J. W. de Godoy Stênico, L. L. Ling Table 1 Mean, variance (Traf_IETF)
Table 2 Mean, variance (lbl_tcp_3)
Traffic trace
Mean
Variance
Wireless Traf_IETF
264.9810
1.5109 × 105
MWM [2]
248.5598
1.3056 × 105
VVGM [4]
244.7312
1.2481 × 105
VSCM [5]
250.5498
1.2872 × 105
The proposed model
256.4823
1.3884 × 105
Traffic trace
Mean
Variance
Wireless Traf_IETF
136.3555
5.7062 × 104
MWM [2]
136.0653
6.2675 × 104
VVGM [4]
136.1041
6.0128 × 104
VSCM [5]
136.1205
6.2983 × 104
The proposed model
136.2365
5.7721 × 104
estimated from five trace segments: the real Traf_IETF traffic trace, synthetically generated traffic traces using the classics multifractal MWM, VVGM and VSCM methods and a synthetic traffic trace based on the proposed method. Each traffic segment has in total 524,288 samples. Analogously, Table 2 compares the mean and variance for the traffic trace lbl_tcp_3 and their respective synthetic traffic trace. Each traffic segment in total 894,997 samples. Evidently the statistical values estimated from the synthetic trace generated by the proposed model in both experiments are closer to the true ones than those provided by the MWM, VVGM and VSCM approaches.
6.2 Scaling function and moment factor As mentioned before, there are two classical approaches to validate the characteristics of multifractal processes, or more specifically, the characteristics of multifractally modeled traffic traces: (a) estimation and validation of the scaling function and the moment factor of the process, and (b) estimation and validation of the corresponding multifractal spectrum. Figures 4 and 5 show, respectively, the plotted scaling functions [Eq. (21)] and the moment factors [Eq. (23)] estimated from the Traf_IETF traffic trace and the synthetic versions based on the proposed modeling approach and the MWM, VVGM and VSCM methods. Table 3 shows the average percentage change of these two multifractal parameters. Similarly, Figs. 6 and 7 present these multifractal parameter curves for the lbl_tcp_3 traffic trace and their synthetic versions, and Table 4 shows the average percent-
123
Modern network traffic modeling
2 1.5
τ (q)
1 Traf_IETF
0.5
Proposed MWM
0
VVGM VSCM
-0.5 -1 0
2
4
6
8
10
12
q
Fig. 4 Scaling function (Traf_IETF)
60 50
log[c(q)]
40 30
Traf_IETF Proposed
20
MWM VVGM
10
VSCM
0 0
2
4
6
8
10
12
q
Fig. 5 Moment factor (Traf_IETF)
Table 3 Paramenter percentage average change (Traf_IETF)
Models
Scaling function τ (q) %
Moment factor c(q) %
Proposed
4.06
1.88
MWM [2]
10.27
13.51
VVGM [4]
13.70
6.77
VSCM [5]
7.88
8.12
age change of these two multifractal parameters referenced to the lbl_tcp_3 traffic trace. As illustrated by Figs. 4 and 6, the multifractal characteristics of the traces are revealed by the non-linearity of the scaling functions τ (q).
123
J. W. de Godoy Stênico, L. L. Ling 3 2.5 2
τ (q)
1.5 lbl_tcp_3
1 Proposed
0.5 MWM
0
VVGM
-0.5 -1 0
VSCM
4
2
6
q
8
12
10
Fig. 6 Scaling function (lbl_tcp_3) 70 60
log[c(q)]
50 40 lbl_tcp_3
30
Proposed
20
VVGM VSCM
10
MWM
0 0
2
4
6
8
10
12
q
Fig. 7 Moment factor (lbl_tcp_3) Table 4 Parameter percentage average change (lbl_tcp_3)
Models
Scaling function τ (q) %
Moment factor c(q) %
Proposed
1.07
1.94
MWM [2]
39.03
8.43
VVGM [4]
20.99
5.14
VSCM [5]
7.52
4.09
6.3 Multifractal spectrum The multifractal spectrum is a one-dimensional concave profiled curve representing the amount of process points for a given singularity level or Hölder exponent value [19]. There are several ways of obtaining the multifractal spectrum of a signal. The
123
Modern network traffic modeling
1
Traf_IETF
Proposed
0.8
MWM
f(ξ )
VVGM
0.6 VSCM
0.4
0.2
0
0
0.2
0.4
0.6
0.8
1
1.2
ξ
Fig. 8 Multifractal Legendre spectra (Traf_IETF)
1
lbl_tcp_3 Proposed
0.8
f(ξ )
MWM VVGM
0.6
VSCM
0.4
0.2
0
0
0.2
0.4
0.6
0.8
1
1.2
ξ
Fig. 9 Multifractal Legendre spectra (lbl_tcp_3)
most popular ones include the large deviations spectrum, the Hausdorff spectrum and the Legendre spectrum [23]. Basically all three provide the information of singularity levels and distribution; however, it is worth mentioning that for some particular signal types the large deviations spectrum can be a little for informative than the Legendre spectrum [24]. Figures 8, 10 and 12 show respectively 5 Legendre, Hausdorff and large deviation spectral curves derived from so far used 5 distinct traffic traces (the real network traffic trace, the synthetic ones by the MWM, VVGM, VSCM and proposed models) based on the Traf_IETF while Figs. 9, 11 and 13 are corresponding spectral curves based on lbl_tcp_3 traffic trace. Observe that the concave profiles of the series in these figures
123
J. W. de Godoy Stênico, L. L. Ling
1 Traf_IETF Proposed
0.8
MWM VVGM
f( ξ )
0.6
VSCM
0.4
0.2
0 0
0.2
0.4
0.6
0.8
1
1.2
ξ
Fig. 10 Multifractal Hausdoff spectra (Traf_IETF)
1
lbl_tcp_3
Proposed
0.8
f( ξ )
MWM
0.6
VVGM
VSCM
0.4
0.2
0 0
0.2
0.4
0.6
0.8
1
1.2
ξ
Fig. 11 Multifractal Hausdoff spectra (lbl_tcp_3)
confirm the presence of multifractal characteristics in those traffic processes. Most important, these figures confirm the superiority of the proposed model in the sense of the spectral curves derived from the proposed approach is much closer to those derived from using the real traffic traces than other modeling methods. Notice that for low Holder exponent values (ξ <0.4) the obtained multifractal spectral curves are closely localized while for large Holder exponent values (ξ >0.4) these curves are much more dispersed. It is particularly interesting to mention that low Holder exponent values (ξ <0.4) are responsible for the burstiness encountered in a process, therefore the degree of traffic burstiness.
123
Modern network traffic modeling
1
Traf_IETF Proposed
0.8 MWM VVGM
f( ξ )
0.6 VSCM
0.4
0.2
0 0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
ξ
Fig. 12 Multifractal large deviation spectra (Traf_IETF)
lbl_tcp_3
1
Proposed
0.8
MWM VVGM
0.6
f( ξ )
VSCM
0.4
0.2
0
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
ξ
Fig. 13 Multifractal large deviation spectra (lbl_tcp_3)
6.4 Autocorrelation function The autocorrelation function of a process can indicate the presence or not of longrange dependence (LRD) in a time process. Moreover, the autocorrelation function reflects the second-order statistic characteristics of time series. Figures 14 and 15 plot the autocorrelation function curves for the two investigated traffic traces. The autocorrelation function curve shown in Figs. 14 and 15 indicates that Eq. (28) demonstrates long-range dependencies in the time series.
123
J. W. de Godoy Stênico, L. L. Ling 1
Autocorrelation Function
Traf_IETF
0.8 Proposed MWM
0.6
VVGM
0.4
VSCM
0.2
0 0
5
10
15
20
25
30
35
40
45
50
k
Fig. 14 Autocorrelation functions (Traf_IETF)
1
Autocorrelation Function
lbl_tcp_3
0.8 Proposed MWM
0.6
VVGM
0.4
VSCM
0.2
0
0
5
10
15
20
25
30
35
40
45
50
k
Fig. 15 Autocorrelation functions (lbl_tcp_3)
For example, the decay rate of the statistical dependencies between samples is slower than an exponential decay, which indicates that the analyzed series have longrange dependencies.
6.5 Multiscale diagram Another important function in multifractal analysis is the partition function based on wavelets. Instead of calculating the exponents of regularity local, statistics of local
123
Modern network traffic modeling Linear Multiscale Diagram: h q =ζ q / q
Multiscale Diagram: (j ,j ) = (1, 8) 1 2
0
-0.12 -0.14
-0.5
-0.16 -0.18
-1
ζq
hq
-0.2 -0.22
-1.5
-0.24 -2
-0.26 -0.28
-2.5
-0.3 0
2
4
6
8
10 12
2
q
4
6
8
10
12
q
Fig. 16 Multiscale diagram (Traf_IETF)
behavior of a process can be obtained by its partition function based on wavelet S j (q). To do this we have the following definition. Definition 3 The partition function based on wavelet is given by [25]: S j (q) = E|W j,k |q .
(37)
For certain multifractal processes the partition function scales asymptotically j → ∞ as: log2 S j (q) ∼ q.cte + jαq
(38)
Thus, the slope of log 2 S j (q) in function j gives us an estimate for αq . For monofractais processes (self-similar) αq varies linearly with respect to q, already multifractal processes in this variation are non-linear. To verify this property we use a tool available at [26], known as Multiscale Diagram that shows whether a process has a tendency to present monofractal or multifractal behavior in an interval [ j1 , j2 ]. Conducting this test, two parameters were defined [27]: ςq = αq − q/2 and h q = ς/q. Multiscale Diagram (MD) on the vertical axis shows the exponents ςq on the horizontal axis and values of q. Linear Multiscale Diagram (LMD) on the vertical axis shows the exponents h q on the horizontal axis and values of q. If a process is monofractal, the diagram ςq results in a straight line and h q a constant. In the case of a multifractal process, the diagram ςq results in a curve and h q is no-constant [28].
123
J. W. de Godoy Stênico, L. L. Ling Linear Multiscale Diagram: hq = ζ q / q
Multiscale Diagram: (j1 ,j 2 ) = (1, 8)
0 0 -0.5 -0.05
ζq
-1
hq
-0.1
-1.5 -0.15
-2
-0.2
0
2
4
6
8
10 12
-0.25
2
4
6
8
10
12
q
q Fig. 17 Multiscale diagram (lbl_tcp_3)
Multiscale diagrams for the wireless network traffic trace (Traf_IETF) and TCP/IP lbl_tcp_3 are shown in Figs. 16 and 17, respectively. We can observe non-linearity in MD and inconstancy in LMD for the trait, indicating one more time the evidence of multifractal in the traffic traces used. 6.6 Global scaling parameter HEG In Table 5 we evaluate the global scaling parameter obtained under the proposed modeling method by comparing it with the Hurst parameter estimated through a Whittle Estimator [29]. Diverse real traffic traces were used for this compassion, namely lbl_tcp_3, dec_pkt_1_10ms, dec_pkt_1_40ms, BC_Aug89_10ms) [22] and wireless traffic trace Traf_IETF [21]. dec_pkt_1_10ms and dec_pkt_1_40ms are two time series compiled from dec_pkt_1 series, under 10 and 40 ms of aggregation, respectively. Similarly BC_Aug89_10ms series is derived from BC_Aug89 on 10-millisecond aggregation scale. It can be seen that the global scaling parameter values computed using the derived expression (33) are very close to the corresponding Hurst Parameter values. As a Table 5 Hurst and global parameter
123
Traffic trace
Hurst parameter (H) Whittle estimator
Global scaling parameter HEG
lbl_tcp_3
0.8420
0.8691
dec_pkt_1_10ms
0.7770
0.7806
dec_pkt-1_40ms
0.8310
0.8315
Bc_Aug89_10ms
0.8520
0.8732
Traf_IETF
0.7360
0.7489
Modern network traffic modeling
result, the global scaling parameter H E G can be viewed as an alternative measure for self-similarity. 7 Conclusion In this work, we presented a novel method of building a multiplicative cascade suitable for multifractal traffic modeling. This new conservative, multiplicative cascade-based traffic model reveals its capability of more reliably capturing some important features of a multifractal process, such as scaling function and the moment factor. We evaluated in detail some statistical properties of the proposed model and revealed the existing long-range dependent properties in the analyzed real wireless traffic. We also obtained a global-scale parameter for our traffic model which is comparable with the Hurst parameter. Most importantly, we have a very simple expression for this parameter estimation, which can be an interesting alternative for Hurst parameter estimation. The results of our experimental investigations on real wireless and TCP\IP traffic traces have demonstrated the efficiency of proposed traffic model, capable of capturing the main features of multifractal traffic traces. Therefore, we strongly believe that our model could be simple, robust and alternative option for modern network traffic modeling work. Appendix Appendix A: Proof of Lemma 1 Let X (t0 ) = 2n .Y.R(b1 )R(b1 b2 ) . . . R(b1 . . . bk ) for N 1. Thus, the mean of process X (t0 ) is given by: E[X (t 0 )] = E[2n .Y.R(b1 )R(b1 b2 ) . . . R(b1 . . . bk )].
(39)
Using the fact that E[X (t N )] = 2−N , we have: E[X (t0 )] = E[Y ].
(40)
Assuming Y have lognormal distribution: E[X (t 0 )] = E[Y ] = em+v
2 /2
.
(41)
Appendix B: Proof of Lemma 2 Using variance definition Var[X (t 0 )] = E[X (t 0 )2 ] − E[X (t 0 )]2 .
(42)
123
J. W. de Godoy Stênico, L. L. Ling
Then Var[X (t 0 )] = E[Y 2 ]22N E[R 2 ] N − E[X (t 0 )]2
(43)
Var[X (t 0 )] = E[Y 2 ]22N E[R 2 ] N − E[Y ]2 .
(44)
or
Therefore Var[X (t 0 )] = e
2m+2v 2
.2
2N
.
(α + β)(α + β + 1) (α + 1)α
N − e2m+v
2
(45)
Appendix C: Proof of Lemma 4 Let m = 2k and x(N −k) = (N − k) can be represented as
km
(k−1)m+1
X (k), the aggregate process X m at stage
X m = 2−k x(N −k)
(46)
The variance of aggregate process X m can be written as Var[X m ] = E[(2−k x(N −k) )2 ] − E 2 [(2−k x(N −k) )]
(47)
Using the fact that the process X can be viewed as the product of a lognormal random variable Y and a multiplicative cascade μ, the following relation is valid: x(N −k) =
km
Y μ(tk ).
(48)
(k−1)m+1
Using Eq. (46) and moments of the aggregate process of the multiplicative cascade given by: m m 1 1 X (i)q = (Yi μ(ti ))q m m i=1
(49)
i=1
we have ⎡⎛ 1 Var[X m ] = E ⎣⎝2−k k 2 ⎡⎛
kn
Y μ(ti )2 ⎠⎦
(k−1)n+1
1 −E 2 ⎣⎝2−k k 2
123
⎞⎤
kn (k−1)n+1
⎞⎤ Y μ(ti )⎠⎦
(50)
Modern network traffic modeling
2(N −k) 1 Var[X ] = E Y 2 E R − E[Y ] 2 N −k α(α + 1) 2 2 − e2m+v 22k−2N . Var[X m ] = e2m+2v (α + β)(α + β + 1) m
2
2N
2
N −k
2
(51) (52)
References 1. Leland W, Taqqu M, Willinger W, Wilson D (1994) On the self-similar nature of ethernet traffic (extended version). IEEE/ACM Trans Netw 2(1):1–15 2. Riedi RH, Crouse MS, Ribeiro VJ, Baraniuk RG (1990) A multifractal wavelet model with application to network traffic. IEEE Trans Inf Theory. Special Issue on Multiscale Signal Analysis and Modeling, 45: 992–1018 3. Vieira FHT, Lee LL (2009) Adaptive wavelet based multifractal model applied to the effective bandwidth estimation of network traffic flows. IET Communications. pp 906–919 4. Krishna PM, Gadre VM, Desai UB (2003) Multifractal based network traffic modeling. Kluwer Academic Publishers, Boston, MA 5. Xu Z, Wang L, Wang K (2011) A new multifractal model based on multiplicative cascade. Inf Technol J 10:452–456 6. Peltier R, Véhel JL (1995) Multifractional brownian motion: definition and preliminary results. Technical Report 2695, INRIA 7. Vieira FHT, Bianchi GR, Lee LL (2010) A network traffic prediction approach based on multifractal modeling. J High Speed Netw 17(2):83–96 8. Kolmogorov AN (1941) The local structure of turbulence in a compressible liquid for very large Reynolds numbers. CR (Dokl) Acad Sci URSS (NS) 30:301–305 9. Aloud M, Tsang E, Dupuis A, Olsen R (2011) Minimal agent-based model for the origin of trading activity in foreign exchange market. In: Computational intelligence for financial engineering and economics (CIFEr), pp 1–8 10. Paschalis A, Molnar P, Burlando P (2012) Temporal dependence structure in weights in a multiplicative cascade model for precipitation. Water Resour Res 48:W01501 11. Stephen DG, Anastas JR, Dixon JA (2012) Scaling in cognitive performance reflects multiplicative multifractal cascade dynamics. In: Frontiers in physiology, fractal physiology 3:102. doi:10.3389/ fphys.2012.00102 12. Feldmann A, Gilbert AC, Willinger W (1997) Data networks as cascades: investigating the multifractal nature of internet WAN traffic. Proc. Of 35th Annual Allerton Conf. on communications, control, and computing, pp 269–280 13. Fisher A, Calvet L, Mandelbrot BB (1997) Multifractality of Deutschmark/US dollar exchanges rates. Yale University 14. Dang TD, Molnár S, Maricza I (2003) Queuing performance estimation for general multifractal traffic. Int J Commun Syst 16(2):117–136 15. Mandelbrot BB (1974) Intermittent turbulence in self-similar cascades: divergence of high moments and dimension of the carrier. J Fluid Mech 62:331–358 16. Guivarc’h Y (1987) Remarques sur les solutions d’ une equation fonctionnelle non linéaire de Benoît Mandelbrot. Comptes Rendus (Paris) 305I(139):1987 17. Stolojescu-Crisan C, Isar A, Moga S, Lenca P (2013) WiMax traffic analysis and base stations classifications in terms of LRD. Expert Systems 30(4):285–293. doi:10.1111/exsy.12026 18. Jizba P, Korbel J (2013) Modeling financial time series: multifractal cascades and Rényi entropy. In: Interdisciplinary symposium on complex systems emergence, complexity and computation. ISCS 2013, vol 8, pp 227–236 19. Seuret S, Lévy-Véhel J (2000) The local holder function of a continuous function. Appl Comput Harmon Anal 13(3):263–276 20. Beran J (2010) Long-range dependence Wiley interdisciplinary reviews. Comput Stat 2(1):26–35 21. http://ita.ee.lbl.gov/html/traces.html. Last accessed Mar 2014
123
J. W. de Godoy Stênico, L. L. Ling 22. Jardosh A, Krishna NR, Kevin CA, Belding E (2014) CRAWDAD Data Set UCSB/IETF-2005 (v. 2005–10-19). out. 2005. http://crawdad.cs.dartmouth.edu/ucsb/ietf2005. Last accessed Mar 2014 23. Falconer K (2003) Fractal geometry: mathematical foundations and applications. Second Edition Wiley; 2 edition 24. Véhel JL (2013) Large deviation multifractal analysis of a class of additive processes with correlated non-stationary increments. IEEE/ACM Trans Netw 21(4):1309–1321 25. Crouse MS, Baraniuk RG, Ribeiro VJ, Riedi RH (2000) Multiscale queueing analysis of long-range dependent traffic. Proc IEEE INFOCOM 2:1026–1035 26. http://www.cubinlab.ee.unimelb.edu.au/darryl/ms_code.html. Last accessed Mar 2014 27. Pavlov AN, Anishchenko VS (2007) Multifractal analysis of complex signals. Phys Uspekhi 50:819– 834 28. Zhang ZL, Ribeiro VJ, Moon S, Diot C (2003) Small-time scaling behaviors of internet backbone traffic: an empirical study”, INFOCOM 2003. Twenty-Second Annual Joint Conference of the IEEE Computer and Communications Societies 3:1826–1836 29. Karagiannis T, Faloutsos M (2014) SELFIS: A tool for self-similarity and long-range dependence analysis. 1st Workshop on Fractals and Self-Similarity in Data Mining: Issues and Approaches (in KDD) Edmonton, Canada, 2002. SELFIS—Downloaded from http://alumni.cs.ucr.edu/~tkarag/Selfis/ Selfis.html. Last accessed Mar 2014
123