Environ Earth Sci (2016)75:685 DOI 10.1007/s12665-016-5435-6
ORIGINAL ARTICLE
Modeling river discharge time series using support vector machine and artificial neural networks
Mohammad Ali Ghorbani1 • Rahman Khatibi2 • Arun Goel3 • Mohammad Hasan FazeliFard1 • Atefeh Azani1
Received: 18 March 2014 / Accepted: 5 February 2016
© Springer-Verlag Berlin Heidelberg 2016
Abstract Discharge time series were investigated using predictive models of support vector machine (SVM) and artificial neural network (ANN), and their performances were compared with two conventional models: rating curve (RC) and multiple linear regression (MLR) techniques. These models are evaluated using stage and discharge data from Big Cypress River, Texas, USA. Daily river stage–discharge data for the period of April 2010 to August 2013 were used for training and testing the above models and their results were compared using appropriate performance criteria. The evaluation of the results includes different performance measures, which indicate that SVM and ANN have an edge over the conventional RC and MLR models. Notably, peak values predicted by SVM and ANN are more reliable than those by RC and MLR, although the performances of these conventional models are acceptable for a range of practical problems. The paper takes a critical view of inter-comparison studies, looking beyond model selection based on the common practice of identifying the absolute best model, or even the best for the stated purpose, towards uncertainty analysis.

Keywords Artificial neural network · Support vector machine · Big Cypress River

Mohammad Ali Ghorbani: [email protected]; [email protected]
Rahman Khatibi: [email protected]
Arun Goel: [email protected]
Mohammad Hasan FazeliFard: [email protected]
Atefeh Azani: [email protected]

1 Department of Water Engineering, Faculty of Agriculture, University of Tabriz, Tabriz, Iran
2 GTEV-ReX Limited, Mathematical Modeling Consultant, Swindon, UK
3 Department of Civil Engineering, National Institute of Technology, Kurukshetra, Haryana 136119, India
Introduction
River stage–discharge relationships play an important role in modeling for the planning and management of river basins, assessment of risk, control of floods and droughts, and development of water resources (Khatibi et al. 2012). However, stage and discharge values play different roles in providing an understanding of the hydraulic behavior of river systems. Stage values in river hydraulics reflect local effects such as roughness, channel geomorphology and interactions between the main channel and floodplain flows, whereas discharge values provide an overview of general river hydraulics in terms of responsiveness and attenuation. Continuous recording of discharge time series poses methodological problems for most practical requirements, whereas continuous recording of stage values is well established. To estimate continuous records of river discharge, stage values are recorded and discharge values are then calculated from the functional relationship between stage and discharge. This relationship is established in advance and is referred to as the rating curve (RC) or stage–discharge relationship. In recent years, artificial intelligence techniques such as artificial neural networks (ANN), support vector machines (SVM), fuzzy logic, genetic programming, and many other methods have been widely used in hydrology and water
resources applications (Kisi and Cobaner 2009). This paper uses models based on the SVM as proposed by Vapnik (1998), which is a powerful tool for nonlinear classification, regression and time series prediction problems (Wang et al. 2008). The SVM belongs to the kernel-based learning approaches: it is a supervised machine learning system that uses a linear hypothesis space in a high dimensional feature space, and as such it has gained wide popularity. Its working principle is that kernel functions implicitly map the data to a higher dimensional space (Bhagwat and Maity 2012). Examples of SVM applications include stage–discharge modeling (Goel and Pal 2012; Sivapragasam and Muttil 2005; Aggarwal et al. 2012), stream flow or stage (Liong and Sivapragasam 2002; Asefa et al. 2004; Yu et al. 2006), runoff and sediment yield (Misra et al. 2009) and rainfall-runoff modeling (Dibike et al. 2001).
An ANN is an adaptable system that, by learning relationships from input and output datasets, is able to predict datasets not observed previously but with characteristics similar to those of the input datasets (Haykin 1999; ASCE Task Committee 2000). ANNs have been applied to stage–discharge modeling, where they typically perform better than conventional models, e.g., Tawfik et al. (1997), Jain and Chalisgaonkar (2000), Deka and Chandramouli (2003), Sudheer and Jain (2003), Bhattacharya and Solomatine (2005), Habib and Meselhe (2006), Baiamonte and Ferro (2007), Clemmens and Wahlin (2006), Ajmera and Goyal (2012) and Hasanpour Kashani et al. (2013).
Fig. 1 Location of the study site
RC, MLR, ANN and SVM models extract information from data in their own particular ways. The tendency is to identify the best performing models using appropriate performance measures such as the correlation coefficient. This paper inevitably uses these performance measures, but suggests that modeling practices are in need of critical thinking. This can mean directing different modeling techniques towards a culture of uncertainty analysis, as well as using modeling techniques to understand the results with respect to the data and the specific features of their regions. This paper models the daily discharge time series of Big Cypress River, Texas, USA, using RC, MLR, ANN and SVM.
Materials and methods
Materials
The data for daily mean river stage and discharge time series of Big Cypress River were used in this study. Figure 1 shows the study area, which has a drainage area of 720 km², in which the gauge is located at the downstream side of the bridge at US Highway 271, Texas (USGS Station No: 07344493, Latitude 33°04′22.55″, Longitude 94°57′54.82″). The stage–discharge data are readily available from the web server of the USGS with known quality checks. The use of these data was not driven by any project work; this research uses them to test the different models.

Fig. 2 Time series plot for the data period (2010–2013), with training and testing periods marked: a stage (cm) against day; b discharge (m³/s) against day

The data are divided into training and testing periods. The daily stage–discharge data from 01 April 2010 to 16 December 2012 (80 % of the whole dataset) were used for training and the data from 17 December 2012 to 21 August 2013 (20 % of the whole dataset) for testing. Figure 2a, b shows the observed daily mean stage and discharge values, respectively, for the training and testing periods. All the data were normalized during the computation stage so that they ranged between 0 and 1. Such data scaling smoothens the solution space and averages out some of the noise (ASCE 2000). Table 1 presents the statistical parameters of the stage–discharge data such as x_mean (mean), Sx (standard deviation), Cv (coefficient of variation), Csx (skewness), xmax (maximum), and xmin (minimum). Table 1 shows that the discharge values range between 0.03 and 21.86 m³/s and the stage values between 0.59 and 3.11 m. The skewness and coefficient of variation of discharge are also greater than those of stage.

Table 1 Daily statistical parameters of river stage and discharge datasets
Dataset   Data type         Number of data  Sx    Cv    Csx   xmax   xmin
Training  Stage (m)         991             0.25  0.29  4.20  3.11   0.59
Training  Discharge (m³/s)  991             1.26  2.82  9.09  21.86  0.03
Testing   Stage (m)         248             0.2   0.25  3.04  2.15   0.63
Testing   Discharge (m³/s)  248             0.74  1.83  5.07  6.63   0.08

Stage–discharge rating curve (RC)
The traditional method of rating curves describes a nonlinear relationship between stage and discharge as follows:
$$Q = c\,(h - h_0)^{m} \tag{1}$$
where Q is discharge, h is stage, and c, m and $h_0$ are calibration parameters. The basic assumption underlying Eq. (1) is the existence of an appropriate relationship between depth h and discharge Q. The constants $h_0$, m and c are estimated using observed data on h and Q. The value of $h_0$ corresponds to the bed level at zero discharge. Traditionally, the best values of $h_0$, m and c in Eq. (1) are obtained by the least squares error method for a given range of stage values (e.g. Sivapragasam and Muttil 2005).

Multiple linear regression (MLR)
Multiple linear regression (MLR) uses more than one predictor variable. The general form of the MLR model is:
$$y = c_0 + c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \tag{2}$$
where y is the expected value represented as a function of n independent variables $x_1, x_2, \ldots, x_n$, in which the values of the coefficients $c_0, c_1, \ldots, c_n$ are unknown. These values represent the local behavior and are estimated by the least squares method or some other regression method (e.g. Kisi and Cobaner 2009).

Artificial neural networks (ANNs)
ANNs are parallel information processing systems consisting of a set of neurons arranged in layers. These neurons apply suitable transfer (activation) functions to their weighted inputs. The type of ANN used in this study is a multi-layer feedforward perceptron (MLP), trained with the back propagation learning algorithm. The three-layered structure of the MLP consists of an input layer, a hidden layer and an output layer. The input layer accepts the data; the hidden layer processes them; and the output layer displays the resultant outputs of the model. The MLP network is trained through the following procedure: (1) input–output data are presented to the ANN as the training data, which make up a large proportion of the available time series data; (2) actual network outputs are calculated for the current inputs after applying the activation functions; (3) a performance measure is selected, e.g. the mean square error (MSE), and its value is calculated; and (4) connection weights and biases are adjusted to minimize the MSE. These steps are repeated for each training epoch until no significant change is detected in the MSE. The final connection weights are then kept unchanged, and the testing data are presented to the network as inputs to produce the corresponding outputs consistent with the internal representation of the input/output mapping (see Haykin 1999, among others, for further information).
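The training procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses plain per-sample gradient descent on the squared error rather than any particular optimizer, and the function names (`init_mlp`, `forward`, `train_epoch`), the learning rate, the network size and the toy data are all hypothetical choices, not taken from the study.

```python
import math
import random


def sigmoid(z):
    """Logistic (log-sigmoid) activation for the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))


def init_mlp(n_in, n_hidden, seed=0):
    """Small random initial weights for an n_in-n_hidden-1 network."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Return hidden activations and the linear output for one input vector."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["W1"], net["b1"])]
    out = sum(w * h for w, h in zip(net["W2"], hidden)) + net["b2"]
    return hidden, out


def mse(net, X, y):
    """Mean square error of the network over a dataset."""
    return sum((forward(net, x)[1] - t) ** 2 for x, t in zip(X, y)) / len(y)


def train_epoch(net, X, y, lr=0.1):
    """One pass over the data, adjusting weights and biases after each sample."""
    for x, t in zip(X, y):
        hidden, out = forward(net, x)
        err = out - t  # gradient of 0.5 * (out - t)^2 with respect to out
        for j, h in enumerate(hidden):
            # Back-propagate through the linear output and the sigmoid,
            # using the output weight before it is updated.
            delta_h = err * net["W2"][j] * h * (1.0 - h)
            net["W2"][j] -= lr * err * h
            for i, xi in enumerate(x):
                net["W1"][j][i] -= lr * delta_h * xi
            net["b1"][j] -= lr * delta_h
        net["b2"] -= lr * err


# Toy data (hypothetical): one normalized input and a nonlinear target.
X = [[i / 20.0] for i in range(21)]
y = [x[0] ** 2 for x in X]

net = init_mlp(n_in=1, n_hidden=3)
mse_before = mse(net, X, y)
for _ in range(200):
    train_epoch(net, X, y)
mse_after = mse(net, X, y)
```

In practice the loop would stop once the change in MSE between epochs becomes negligible, and the trained weights would then be applied unchanged to the testing inputs.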
Support vector machines (SVM)
The SVM method is based on statistical learning theory (Vapnik 1998). The SVM is one type of neural network, which has been receiving increasing attention in pattern classification and nonlinear regression estimation (Cao and Tay Francis 2003) due to its generalization performance. Consider a given training dataset with N samples, represented by $(x_1, y_1), \ldots, (x_N, y_N)$, where x is an input vector and y is its corresponding output value. The SVM estimator f for regression can be represented by:
$$f(x) = w \cdot \phi(x) + b \tag{3}$$
where w is a weight vector, b is the bias, "·" denotes the dot product and φ is a non-linear mapping function. A smaller value of w indicates the flatness of the function in Eq. (3), which can be obtained through minimizing the Euclidean norm defined by $\lVert w \rVert^{2}$. Vapnik (1995) introduced the following convex optimization problem with an ε-insensitive loss function for determining the function in Eq. (3):
$$
\begin{aligned}
\text{minimize}\quad & \frac{1}{2}\lVert w\rVert^{2} + C\sum_{k=1}^{N}\left(\xi_k^{+} + \xi_k^{-}\right)\\
\text{subject to}\quad &
\begin{cases}
y_k - \left(w\cdot\phi(x_k) + b\right) \le \varepsilon + \xi_k^{+}\\
-y_k + \left(w\cdot\phi(x_k) + b\right) \le \varepsilon + \xi_k^{-}\\
\xi_k^{+},\ \xi_k^{-} \ge 0
\end{cases}
\qquad k = 1, 2, \ldots, N
\end{aligned}
\tag{4}
$$
where C is a positive trade-off parameter that determines the degree of the empirical error in the optimization problem, i.e. the trade-off between the flatness of the function and the amount to which deviations larger than ε are tolerated. Also, $\xi_k^{+}$ and $\xi_k^{-}$ are slack variables, which represent upper and lower constraints on the output system over the error tolerance, ε (Misra et al. 2009). Lagrangian multipliers and the Karush–Kuhn–Tucker (KKT) conditions are used to solve the optimization problem given in Eq. (4) in a dual form. The inequality constraints are converted into equations by the KKT method through adding or subtracting slack variables. Support vectors are the input vectors that have nonzero Lagrangian multipliers under the KKT condition (Yoon et al. 2011). In natural processes, the predictor variables (input space) are almost always non-linearly related to the predicted variable. This limitation is solved by mapping the input space onto some higher dimensional space (feature space) using a kernel function. The kernel function makes it possible to work implicitly in a higher dimensional feature space. Subsequently Eq. (3) becomes an explicit function of the Lagrangian multipliers $\alpha_i$ and $\alpha_i^{*}$ as follows:
$$f(x, \alpha_i, \alpha_i^{*}) = \sum_{i=1}^{N}\left(\alpha_i - \alpha_i^{*}\right) K(x, x_i) + b \tag{5}$$
where $\alpha_i$ and $\alpha_i^{*}$ are parameters associated with support vector $x_i$, N is the number of training samples, and $K(x, x_i)$ is the kernel function. Commonly used kernel functions include the linear, polynomial, radial basis, and sigmoid functions. In this study, the radial basis function (RBF) kernel is used, which is expressed as:
$$K(x, x_i) = \exp\left(-\frac{\lVert x - x_i\rVert^{2}}{\sigma^{2}}\right) \tag{6}$$
In SVM modeling, when the RBF is used, the parameters to optimize during the training process are σ, C and ε; a similar set of parameters arises when other kernel functions are used. There are two main steps in developing an SVM model: (a) the selection of the kernel function, and (b) the identification of the specific parameters, i.e. C, ε and the kernel parameter σ. Many studies on the use of SVM in hydrological modeling have demonstrated a better performance of the RBF (Khan and Coulibaly 2006; Lin et al. 2006; Liong and Sivapragasam 2002; Yu et al. 2006).

Performance criteria
Three performance criteria are used in this study to assess the goodness of fit of the models: the determination coefficient (R²), the root mean square error (RMSE) and the Nash–Sutcliffe efficiency coefficient (E). The R² values, ranging from 0 to 1, are a statistical measure of how close the regression line is to the observed data, and a coefficient of 1 indicates that the fit of the regression line to the observed data is 'perfect.' The RMSE provides a balanced evaluation of the goodness of fit of the model as it is more sensitive to the larger relative errors caused by low values. The 'perfect' model will have an RMSE value of zero. The E values range between −∞ and 1.0, with a value of 1 for a 'perfect' fit (as further discussed by Ghorbani et al. 2013). The combined use of R², RMSE and E provides a sufficient assessment of each model's performance and allows a comparison of the accuracy of the four modeling strategies used in this study.
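The text does not spell out the formulas for the three criteria, so the sketch below assumes their common definitions: R² as the squared Pearson correlation between observed and predicted values, RMSE as the root of the mean squared difference, and E as one minus the ratio of the squared errors to the variance of the observations about their mean. Function names and the sample values are illustrative only.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nash_sutcliffe(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum((obs - mean(obs))^2)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)


# Hypothetical discharges (m3/s) and a prediction that is 20 % too high.
obs = [0.5, 1.0, 2.0, 4.0, 8.0]
biased = [1.2 * o for o in obs]

r2 = r_squared(obs, biased)       # correlation is unaffected by the scaling
e = nash_sutcliffe(obs, biased)   # penalized by the systematic overestimate
err = rmse(obs, biased)
```

A prediction that is systematically scaled keeps a perfect R² while E falls below 1 and RMSE rises above zero, which is one reason to read the three measures together rather than singly.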
Results and discussion
Input selection
The RC is an empirical model, which extracts information from recorded stage values. The models shown in Table 2 are an expansion of the principle underlying the RC, in which different model structures were selected by a combination of stage and/or discharge variables including: Ht, Ht-1, Ht-2 and Ht-3, representing stage values at times (t), (t - 1), (t - 2) and (t - 3), and Qt-1, Qt-2 and Qt-3, representing discharge values at times (t - 1), (t - 2) and (t - 3). Models 1–4 extract information from stage values alone; Models 5–7 extract information from discharge values alone, with a structure similar to autocorrelation; and Models 8–10 extract information from both stage and discharge values. While the RC is clearly not the only technique, Table 2 shows the different possible choices of input variables for each modeling technique; these are required to have parsimonious structures to avoid overfitting. The performance measures outlined in "Performance criteria" were used to select the particular model structure, but other techniques were also used to assist the choice, e.g. scatter diagrams or plots of the differences between individual model performances.

Table 2 Input parameters used for the RC, MLR, ANN and SVM models
Model no.  Model definitions (inputs)          Output  Applied to
1          Ht                                  Qt      RC, MLR, ANN, SVM
2          Ht, Ht-1                            Qt      MLR, ANN, SVM
3          Ht, Ht-1, Ht-2                      Qt      MLR, ANN, SVM
4          Ht, Ht-1, Ht-2, Ht-3                Qt      MLR, ANN, SVM
5          Qt-1                                Qt      MLR, ANN, SVM
6          Qt-1, Qt-2                          Qt      MLR, ANN, SVM
7          Qt-1, Qt-2, Qt-3                    Qt      MLR, ANN, SVM
8          Ht, Ht-1, Qt-1                      Qt      MLR, ANN, SVM
9          Ht, Ht-1, Qt-1, Qt-2, Qt-3          Qt      MLR, ANN, SVM
10         Ht, Ht-1, Ht-2, Qt-1, Qt-2, Qt-3    Qt      MLR, ANN, SVM

Rating curve
The RC model was implemented using the least squares method, leading to:
$$Q = 2.121 \times 10^{-5}\,(h - 37.479)^{2.45} \tag{7}$$
The performance of this model for the training and testing periods is summarized in Table 3. It shows that the performance of the RC model is acceptable, with R² = 0.985, RMSE = 0.16 m³/s and E = 0.985 for the training period; and R² = 0.968, RMSE = 0.15 m³/s and E = 0.960 for the testing period. As expected, the quality of the results drops from the training period to the testing period.

Table 3 Results of RC model for the training and testing periods
                   Training                   Testing
Optimum model no.  R²     RMSE (m³/s)  E      R²     RMSE (m³/s)  E
1                  0.985  0.16         0.985  0.968  0.15         0.960

Figure 3 compares the predictions of the RC model with observed data for the testing dataset. The results suggest good agreement between predicted and observed discharges as indicated by the R² value, but the RMSE values signify remaining deviations between predicted and observed values. Hence one performance measure is not enough to assess the performance of a model. Additionally, RMSE does not provide any specific understanding of the deviations at the peak values. Fig. 3 shows that peak values are underestimated, and this should be flagged in the overall performance of this model.

Fig. 3 Comparison of observed and predicted discharge and scatter diagram: RC model, testing period (linear fit of predicted against observed: y = 0.9108x − 0.0089, R² = 0.9676)

MLR
The MLR model was implemented using MATLAB 2013a (The MathWorks Inc 2012) to derive regression coefficients using the training dataset. The results given in Table 4 show the values of the performance criteria (R², RMSE and E). These values show that Model 8 (input variables Ht, Ht-1 and Qt-1, see Table 2) performs relatively better than the others, with R² = 0.87, RMSE = 0.45 m³/s and E = 0.869 for the training data, and R² = 0.943, RMSE = 0.26 m³/s and E = 0.874 for the testing dataset. Notably, the results improved somewhat from the training period to the testing period, but experience shows that this is not a common occurrence.

Table 4 Results of different input combinations using MLR model for training and testing periods
                        Training                   Testing
No. input combination   R²     RMSE (m³/s)  E      R²     RMSE (m³/s)  E
1                       0.797  0.57         0.797  0.860  0.38         0.733
2                       0.807  0.55         0.806  0.891  0.37         0.743
3                       0.817  0.53         0.816  0.899  0.38         0.736
4                       0.818  0.53         0.818  0.905  0.38         0.742
5                       0.442  0.936        0.442  0.261  0.65         0.237
6                       0.462  0.91         0.462  0.272  0.65         0.235
7                       0.465  0.91         0.465  0.274  0.65         0.235
8                       0.870  0.45         0.869  0.943  0.26         0.874
9                       0.869  0.45         0.868  0.944  0.26         0.873
10                      0.869  0.45         0.869  0.943  0.27         0.866
Underline: counterintuitive improvements in the performance for the testing period. Italic: deterioration due to possible ill-conditions in the model structure. Bold: best performing model (Model 8).

Fig. 4 Comparison of observed and predicted discharge and scatter diagram: MLR model, testing period (linear fit of predicted against observed: y = 1.197x − 0.1015, R² = 0.9432)

The regression equation suggested by this linear technique is given by:
$$Q_t = -1.3851 + 0.0529\,H_t - 0.03439\,H_{t-1} + 0.5534\,Q_{t-1} \tag{8}$$
Figure 4 shows a reasonably good agreement between predicted and observed discharge using the MLR model for the testing period, but it should be flagged that peak values are overestimated. Also, Fig. 4 shows that the MLR model predicts some negative discharge values, which stems from the mathematical formulation of the multiple regression method (see e.g. Guven and Aytek 2009).
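The behavior of the two fitted conventional models at the extremes of the data can be checked by evaluating Eqs. (7) and (8) directly. The sketch below rests on assumptions that the text does not state explicitly: stage is taken to be in centimetres (as on the stage axis of Fig. 2), discharge in m³/s, the power of ten in Eq. (7) is taken as negative, and the intercept and lagged-stage coefficient of Eq. (8) are taken as negative. The function names and the clamp below the zero-flow stage are illustrative additions.

```python
H0_CM = 37.479  # zero-flow stage in Eq. (7), assumed to be in cm


def rating_curve(h_cm):
    """Eq. (7): discharge (m3/s) from stage, assuming stage in cm.

    Stages at or below H0_CM are clamped to zero discharge; the clamp is an
    illustrative safeguard, not part of the fitted equation.
    """
    return 2.121e-5 * max(h_cm - H0_CM, 0.0) ** 2.45


def mlr_model8(h_t, h_tm1, q_tm1):
    """Eq. (8): Model 8 regression, with the signs as assumed above."""
    return -1.3851 + 0.0529 * h_t - 0.03439 * h_tm1 + 0.5534 * q_tm1


# Extremes reported in Table 1 for the training data: stage 0.59-3.11 m,
# discharge 0.03-21.86 m3/s.
q_rc_low = rating_curve(59.0)
q_rc_high = rating_curve(311.0)

# Steady low flow: stage at its minimum on two consecutive days.
q_mlr_low = mlr_model8(59.0, 59.0, 0.03)
```

Under these assumptions the rating curve stays positive at the lowest stage and falls somewhat short of the largest observed discharge at the highest stage, while the linear model returns a negative discharge at steady low flow, in line with the negative values noted for Fig. 4.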
ANN model
The ANN modeling was implemented using the MATLAB 2013a software (The MathWorks Inc 2012). A logarithmic sigmoid transfer function was used in the hidden layer and a linear transfer function was employed from the hidden layer to the output layer, a combination which is known to be robust for continuous output variables. The network was trained for 1000 epochs using the Levenberg–Marquardt learning algorithm with a learning rate of 0.001 and a momentum coefficient of 0.9. The optimum number of neurons in the hidden layer was identified using a trial and error procedure by varying the number of hidden neurons from 2 to 20. The effect of changing the number of hidden neurons on R², RMSE and E for each input combination is presented in Table 5, according to which the performance of Model 8 (input variables Ht, Ht-1 and Qt-1, see Table 2), or ANN(3-10-1), is relatively better than that of the other combinations. This model structure comprises 3 inputs, 10 hidden neurons and 1 output node. The performance measures for the training period are R² = 0.997, RMSE = 0.06 m³/s and E = 0.997; and those for the testing period are R² = 0.984, RMSE = 0.09 m³/s and E = 0.983. As expected, the quality of the results drops from the training period to the testing period.

Table 5 Results of different input combinations using ANN models for the training and testing periods
                                        Training                   Testing
No. input combination  Model structure  R²     RMSE (m³/s)  E      R²     RMSE (m³/s)  E
1                      ANN(1-14-1)      0.994  0.10         0.993  0.976  0.12         0.974
2                      ANN(2-11-1)      0.995  0.09         0.994  0.981  0.12         0.979
3                      ANN(3-5-1)       0.993  0.11         0.992  0.979  0.11         0.976
4                      ANN(4-20-1)      0.995  0.09         0.995  0.977  0.12         0.974
5                      ANN(1-11-1)      0.710  0.69         0.699  0.331  0.61         0.329
6                      ANN(2-19-1)      0.506  0.88         0.499  0.270  0.68         0.158
7                      ANN(3-17-1)      0.611  0.77         0.610  0.324  0.66         0.217
8                      ANN(3-10-1)      0.997  0.06         0.997  0.984  0.09         0.983
9                      ANN(5-12-1)      0.997  0.07         0.996  0.981  0.10         0.980
10                     ANN(6-17-1)      0.997  0.07         0.996  0.980  0.11         0.979
Italic: deterioration due to possible ill-conditions in the model structure. Bold: best performing model (Model 8).

Figure 5 presents the observed and predicted discharges and their corresponding scatter diagram for the best-fit ANN(3-10-1) model for the testing period. The ANN method clearly improves on both the performance measures and the quality of the peak discharge values, which closely emulate the observed values (this is discussed further in the next section).

Fig. 5 Comparison of observed and predicted discharge and scatter diagram: ANN model, testing period (linear fit of predicted against observed: y = 0.9967x − 0.0089, R² = 0.9842)

SVM model
The program for the SVM was constructed using MATLAB (The MathWorks Inc 2012). In this study the RBF kernel with parameters (C, ε, σ) was used for stage–discharge modeling, with the accuracy of the SVM model being dependent on the identified parameters. The parameter search scheme employed is the shuffled complex evolution algorithm (SCE-UA; see Lin et al. 2006; Yu et al. 2006). The SCE-UA technique has been used successfully in the area of surface and subsurface hydrology processes (Duan et al. 1994).
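To make the role of the kernel parameter σ concrete, the sketch below evaluates an RBF kernel and the kernel expansion of a trained support vector regression in the standard form, i.e. a negative exponent in the kernel and coefficients equal to the difference of the two Lagrangian multipliers. It does not train anything: solving the dual problem and searching for (C, ε, σ) are not shown, and the support vectors, coefficients and bias used here are hypothetical numbers chosen only to exercise the functions.

```python
import math


def rbf_kernel(x, xi, sigma):
    """RBF kernel: exp(-||x - xi||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq_dist / sigma ** 2)


def svr_predict(x, support_vectors, dual_coefs, b, sigma):
    """Kernel expansion f(x) = sum_i (alpha_i - alpha_i*) K(x, x_i) + b.

    dual_coefs[i] holds the difference (alpha_i - alpha_i*) for support
    vector i.
    """
    return b + sum(c * rbf_kernel(x, sv, sigma)
                   for sv, c in zip(support_vectors, dual_coefs))


# Hypothetical trained model with three normalized inputs (Ht, Ht-1, Qt-1).
support_vectors = [[0.2, 0.2, 0.1], [0.8, 0.7, 0.6]]
dual_coefs = [0.5, -0.3]
bias = 0.1

# With a narrow kernel, a prediction at a support vector is dominated by
# that vector's own coefficient plus the bias.
narrow = svr_predict(support_vectors[0], support_vectors, dual_coefs,
                     bias, sigma=0.08)

# With a wide kernel, every support vector contributes almost fully.
wide = svr_predict(support_vectors[0], support_vectors, dual_coefs,
                   bias, sigma=100.0)
```

The contrast between the two calls shows why σ has to be identified together with C and ε: a narrow kernel makes the model respond only near its support vectors, while a wide one flattens the response across the whole input range.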
Table 6 Results of different input combinations using SVM model for training and testing periods
                       Training                   Testing                    Optimum parameters
No. input combination  R²     RMSE (m³/s)  E      R²     RMSE (m³/s)  E      C × 10^5  [illegible]  ε
1                      0.993  0.11         0.993  0.979  0.11         0.976  43.7      9            0.03
2                      0.936  0.36         0.916  0.968  0.14         0.963  62.3      155          0.16
3                      0.989  0.13         0.989  0.747  0.42         0.675  0.3       5.50         0.20
4                      0.881  0.53         0.819  0.939  0.20         0.925  590.2     240          0.25
5                      0.797  0.56         0.797  0.207  0.69         0.130  0.005     0.30         0.10
6                      0.453  0.94         0.426  0.270  0.63         0.269  108.1     159          0.19
7                      0.331  1.12         0.190  0.192  0.69         0.129  6         14           0.0001
8                      0.995  0.09         0.995  0.985  0.09         0.984  0.006     16           0.08
9                      0.884  0.44         0.873  0.948  0.19         0.932  278.0     208          0.40
10                     0.866  0.50         0.835  0.945  0.18         0.942  42.3      214          0.30
Italic: counterintuitive improvements in the performance for the testing period. Underline: deterioration due to possible ill-conditions in the model structure. Bold: best performing model (Model 8).

Fig. 6 Comparison of observed and predicted discharge and scatter diagram: SVM model with RBF kernel, testing period (linear fit of predicted against observed: y = 0.9882x − 0.0101, R² = 0.985)
To obtain suitable values of the parameters (C, ε, σ), the RMSE was used as the objective: for each combination of inputs, the parameter values were selected by minimizing the RMSE. The results of the RBF-kernel SVM for each model definition (see Table 2) are given in Table 6 in terms of R², RMSE and E. The performance criteria show that Model 8 (input variables Ht, Ht-1 and Qt-1, see Table 2) with kernel parameters (617.7, 15.7 and 0.08) performs better than the other combinations. For the training dataset the values of the performance measures are R² = 0.995, RMSE = 0.09 m³/s and E = 0.995; for the testing period they are R² = 0.985, RMSE = 0.09 m³/s and E = 0.984. Figure 6 displays the observed and predicted discharge values and the scatter diagram for the SVM model for the testing period. The results show further improvements in predicted discharge in terms of reduced scatter. In particular, there is better agreement between the predicted and observed peak values.
Comparisons of results
Table 2 shows a set of 10 model structures for each of the modeling strategies of MLR, ANN and SVM. The results summarized in Table 7 show that Model 8 performs best for these three strategies, in which Qt is treated as a function of Ht, Ht-1 and Qt-1. The overall comparison also indicates that the models for the prediction of discharge are acceptable, other than Models 5, 6 and 7, which should be rejected, as their R² values are very low (about 0.3 or less for the testing period) and the other performance measures are also poor. Based on Table 7, the ANN model performs slightly better than the SVM model for the training period, producing the highest R² (0.997), the lowest RMSE (0.06 m³/s) and the highest E value (0.997). Likewise, the table shows that the ANN and SVM perform better than the RC, which in turn performs better than the MLR. Further comparison of the modeled stage–discharge values with their corresponding observed values is presented in Fig. 7 for the testing period. The figure shows that the SVM and ANN results are closer to the observed values than those of the RC model, and that the RC performs better than the MLR model. The MLR model is linear, which should explain its relatively poor performance.

Table 7 Performance measures for different modelled discharge values
                                                 Training                   Testing
Model  Model structure                           R²     RMSE (m³/s)  E      R²     RMSE (m³/s)  E
RC     –                                         0.985  0.156        0.984  0.968  0.1465       0.960
MLR    Model 8                                   0.870  0.453        0.869  0.943  0.262        0.874
ANN    Model 8; ANN(3-10-1)                      0.997  0.064        0.997  0.984  0.094        0.983
SVM    Model 8 (C, ε, σ) of (617.7, 15.7, 0.08)  0.995  0.088        0.995  0.985  0.092        0.984
Bold: best performing model (Model 8).

Fig. 7 Comparison of RC, MLR, ANN and SVM models with observed values for testing period (one panel per model: discharge (m³/s) against stage (cm))

Further information contained in Tables 3, 4, 5 and 6 (the results for the RC, MLR, ANN and SVM models, respectively) relates to the change of quality in the performance measures from the training to the testing period. One intuitively expects some drop in performance quality, but this is not a rule, and the results in the tables require careful examination, as follows. In the first place, these tables reveal that the drop in quality is striking for Models 5, 6 and 7, which treat Qt in terms of a combination of Qt-1, Qt-2 and Qt-3. Notably, discharge values are not measured directly but are calculated from stage values and therefore they normally contain errors, which are referred to as data errors or initial errors.
When data errors have significant impacts on the solution, the problem is known as ill-conditioning; Models 5, 6 and 7 evidently suffer from ill-conditioning and on this account their use should be ruled out. The change in the quality of performance measures associated with the remaining model structures still offers some food for thought. The ANN modeling strategy shows a small drop in the quality of performance measures from the training to the testing period for the remaining models (involving one or more stage values). However, some of these model structures processed by the SVM and MLR models perform counter-intuitively and show some improvements. Notably, the change in the quality of the results from the training to the testing period is also implicit in other published results, e.g. Khatibi et al. (2012), Ghorbani et al. (2015), Samsudin et al. (2011) and Nayak et al. (2005).

Fig. 8 Residuals from predicted daily discharge and observed values for testing period using RC, MLR, ANN and SVM models (residuals against time in days)

A more discriminating insight emerges in Fig. 8, which displays the difference between predicted and observed discharge values (the residuals) against time for the four modeling strategies. The figure shows that the residuals at local peaks are relatively insignificant for SVM and ANN and tend towards overestimating discharge, whereas the RC and MLR have larger residuals, with the MLR model tending to overestimate discharge values and the RC tending to underestimate them. Arguably, if a model is not producing accurate results, conservative results would be preferable, but the definition of a conservative approach depends on the nature of the problem: in some problems overestimation means safety, e.g. flood risk management, whereas in others underestimation does, e.g. the design discharge for hydroelectric power stations. In practice, the performance measures provide little reason to choose between the ANN and SVM models, as their differences are typically not significant and for these data both perform equally well. However, the implementation of the ANN requires considerably more time to arrive at the final architecture than the SVM model, which has only three parameters to be optimized. The results of this study, and indeed of similar studies by others, provide anecdotal evidence for the overall understanding. They show the limits of generalizing: the expectation that the quality of model performance measures should drop during the testing period cannot be generalized, and the outcomes of comparative performances at local levels cannot be readily anticipated.
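The kind of residual reading applied to Fig. 8 can be sketched as follows. The sign convention (predicted minus observed, so that positive residuals are overestimates) is an assumption, as are the function names and the short discharge series, which are made-up numbers used only to show the two opposite tendencies.

```python
def residuals(obs, pred):
    """Residual = predicted - observed; positive values are overestimates."""
    return [p - o for o, p in zip(obs, pred)]


def mean_bias(obs, pred):
    """Average residual over the whole series."""
    res = residuals(obs, pred)
    return sum(res) / len(res)


def peak_residual(obs, pred):
    """Residual at the time step of the largest observed discharge."""
    k = max(range(len(obs)), key=lambda i: obs[i])
    return pred[k] - obs[k]


# Hypothetical daily discharges (m3/s) with one flood peak.
observed = [0.4, 1.0, 6.6, 2.0, 0.5]
model_low = [0.4, 0.9, 6.0, 1.9, 0.5]    # underestimates the peak
model_high = [0.3, 1.1, 7.5, 2.2, 0.4]   # overestimates the peak

peak_low = peak_residual(observed, model_low)
peak_high = peak_residual(observed, model_high)
bias_low = mean_bias(observed, model_low)
```

Separating the residual at the peak from the average bias makes it possible to state whether a model errs on the conservative side for a given application, which a single aggregate measure such as RMSE cannot show.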
Discussion
The underlying objective in studying the performance of different modeling methods is to provide evidence that a particular model is fit for purpose and can be used as a modeling tool. There are many possible modeling strategies, and their comparative study is a way of obtaining an insight into the underlying issues, but it seems that this has given rise to the misplaced aspiration of identifying superior models. Arguably, this aspiration has not led to the formulation of any principle, as previous inter-comparison studies conducted by others indicate that some models perform better than others, but not always. A direct and systematic comparison of different studies is not quite possible, as their underlying variations are very large with respect to their mathematical formulations, relevant parameters and model structures, regional variations in the data and the nature of the problems being modeled. For instance, consider the research reported by Bhattacharya and Solomatine (2005), Sivapragasam and Muttil (2005), Guven and Aytek (2009), Kisi and Cobaner (2009), Aggarwal et al. (2012), Ajmera and Goyal (2012), Goel and Pal (2012) and Hasanpour Kashani et al. (2013).
To illustrate the problem, consider a set of five investigations reported in the literature by various researchers on the subject of stage–discharge modeling. These are summarized in Table 8. The summary of the five specific studies in Table 8 shows a wide variation in their main features, but there is definitely no pointer towards any single most successful or superior modeling technique. The existence of any superior technique may conform to certain human mindsets but is not natural and does not conform to pluralistic Nature. The philosophical consideration underlying this finding is that models are not expressions of any predetermined truth; as Khatibi (2012) argues, they are just surrogates for another world, such as physical or social systems, and as such model emulations of the behavior in the surrogate worlds merely aid understanding and decision-making to deduce a better sense of the environment.
Inter-comparison studies aiming to identify a best model have also led to some models being discredited. One example is an R&D project commissioned by the Environment Agency for England and Wales, which downplayed the accuracy of rainfall-runoff models using transfer functions (see Bell et al. 2001). This led to further R&D work to review transfer functions and clarify their performance; see Sene and Tilford (2004) and Sene et al. (2004).
Table 8 Inter-comparison of a selected number of modeling studies involving SVM and ANN river depth-flow relationships

| | He et al. (2014) | Ajmera and Goyal (2012) | Asefa et al. (2004) | Kisi and Cobaner (2009) | Rasouli et al. (2012) |
| --- | --- | --- | --- | --- | --- |
| Models used | ANN, ANFIS, SVM | M5P + 3 ANN variations + RC | SVM and TFN | 3 ANN variations + regression Tech | BNN, SVM and GP, MLR |
| Character of the study area | Small river basin in a semiarid mountainous region with complex topography | A major tributary creek of a large river | A major river regulated at several locations, with varied land use and dominant snowmelt flows | Large well-established river system | A rather large catchment with a mixed pluvial–nival system |
| Data resolutions | 6-year daily Q data (2001–2003 and 2009–2011); 2190 datapoints in two sets: training (2001–2003) and validation (2009–2011) | October 2004 to January 2006 with short and long gaps; the dataset: 45,024 pairs of datapoints at 15 min intervals | Hourly and annual with 23 years of recorded values | Daily Q–Y records from 3 stations, using Oct 1998 to Sep 1999 for training and Oct 1999 to Sep 2000 for validation at each station | Daily stream flow (1983–2001), with the 1983–1997 record for training and 1998–2001 for model testing |
| Study objectives | Forecasting river flow | Modeling Q–Y process; comparing M5P with ANN and RC | Predicting short (hourly) and long-term (annual) flow volumes or unregulated streams | Develop non-linear Q–Y rating | Forecasting with 1–7 days lead time |
| Performance criteria | CC, RMSE, MARE, Nash–Sutcliffe E | CC, RMSE, MAE | R2, RMSE | RMSE, MAE and CC | CC, MAE, RMSE and Nash–Sutcliffe E |
| Main findings | Models with three antecedent flows perform best, with performance measures not varying substantially. SVM performed better than ANN and ANFIS | M5P superior to ANN (for both the high and low flows) and RC; M5P outperformed when fewer data events were used; M5P has high consistency between training and testing phases | 6-month-ahead annual flow volume forecasts improved using data from more than 1 station. Satisfactory results for 24-h-ahead hourly forecasts using past hourly flows and snowmelt | ANN (radial basis NN) is found to be slightly better than the other ANN variations and better than the regression Tech | The nonlinear models generally outperformed MLR, and BNN tended to slightly outperform the other nonlinear models |

BNN Bayesian neural network, GP Gaussian process, M5P model tree analogous to piecewise linear functions, RC traditional rating curve, TFN transfer function noise
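For readers unfamiliar with the performance criteria listed in Table 8, the sketch below shows how four of them (CC, RMSE, MAE and the Nash–Sutcliffe efficiency) are commonly computed. It assumes the standard textbook definitions; the individual studies in the table may use slightly different variants, and the function name is hypothetical.

```python
import numpy as np


def performance_measures(observed, predicted):
    """Common goodness-of-fit measures under standard (assumed) definitions."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    err = pred - obs
    return {
        # Pearson correlation coefficient between observed and predicted
        "CC": float(np.corrcoef(obs, pred)[0, 1]),
        # Root mean square error
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        # Mean absolute error
        "MAE": float(np.mean(np.abs(err))),
        # Nash-Sutcliffe efficiency: 1 is a perfect fit, values <= 0 mean the
        # model is no better than the mean of the observations
        "NSE": float(1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)),
    }


# Hypothetical values for illustration only.
print(performance_measures([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]))
```

The example also shows why several criteria are usually reported together: a constant offset leaves CC at 1 while RMSE, MAE and NSE all register the error.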
The outcome of the above review is that the search for the absolute best model is likely to be in vain, and the results of this study show that it is hardly possible to identify a model that performs best throughout. However, this study shows that the way forward is to understand the performance of the individual models and to use as many models as possible to gather more evidence for the selected models. This approach can be framed as model selection towards the stated purpose. The purpose in this study was to make use of any available information, but the purpose can be wide, fitting the modeling culture. Thus, the question is whether there is any modeling guidance for assessing the fitness of models and possibly selecting the fit ones. The clue is in Table 2: the results in Tables 5, 6 and 7 indicate that some of Models 5–7 perform poorly with some of the modeling strategies, but the rest perform equally well. Arguably, all the models that pass a certain threshold of fitness should be selected for a further probabilistic assessment of predicted values. In this practice, the more modeling techniques are selected, the greater their contribution towards the assessment of probabilistic predictions. Thus, for each time step, instead of a single modeled value, there could be an ensemble of simulated values contributing to the assessment of inherent uncertainty. In this way, data-driven time series modeling would be transformed into more scientific approaches of risk-based modeling (e.g. see Khatibi et al. 2003 and Tilford et al. 2007) or uncertainty studies. Arguably, this paper promotes a shift in the modeling culture from simple inter-comparison studies towards using more models for uncertainty analysis. This would complement perturbing parameters in ensemble modeling. It has budgetary implications, but many of the underlying processes can be automated, minimizing the budgetary requirements.
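The selection-then-ensemble idea described above can be sketched as follows. This is only an illustration under stated assumptions: fitness is measured here by the Nash–Sutcliffe efficiency, the threshold of 0.8 is arbitrary, the ensemble is summarized by its mean and range at each time step, and all function names and sample values are hypothetical rather than taken from the study.

```python
import numpy as np


def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency, used here as the ASSUMED fitness measure."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    return float(1.0 - np.sum((pred - obs) ** 2) / np.sum((obs - obs.mean()) ** 2))


def select_fit_models(observed, predictions, threshold=0.8):
    """Names of models whose fitness meets the (arbitrary) threshold."""
    return [name for name, pred in predictions.items()
            if nash_sutcliffe(observed, pred) >= threshold]


def ensemble_summary(predictions, selected):
    """Per-time-step mean, minimum and maximum over the selected models."""
    if not selected:
        raise ValueError("no model passed the fitness threshold")
    stack = np.vstack([np.asarray(predictions[name], dtype=float) for name in selected])
    return {"mean": stack.mean(axis=0), "min": stack.min(axis=0), "max": stack.max(axis=0)}


# Hypothetical discharges and model outputs, NOT results from the paper.
observed = [10.0, 20.0, 40.0, 30.0, 15.0]
predictions = {
    "SVM": [11.0, 19.0, 41.0, 29.0, 15.0],
    "ANN": [9.0, 21.0, 38.0, 31.0, 16.0],
    "MLR": [25.0, 25.0, 25.0, 25.0, 25.0],  # deliberately poor
}

selected = select_fit_models(observed, predictions)
summary = ensemble_summary(predictions, selected)
print("selected models:", selected)
print("ensemble mean:", summary["mean"])
print("ensemble range:", summary["min"], summary["max"])
```

The spread between the minimum and maximum at each time step gives a crude indication of the uncertainty across the selected models; a fuller treatment would fit a predictive distribution rather than report a range.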
Conclusion
This study compares the performance of four modeling techniques to provide evidence on suitable techniques for predicting discharge values. The techniques studied were rating curve (RC), multiple linear regression (MLR), artificial neural networks (ANNs) and support vector machines (SVMs). The study used a combination of stage and discharge time series as the primary input data for predicting discharge values for the Big Cypress River, Pittsburg, Texas, USA, with the data spanning the period 2010–2013. The study shows that the SVM, ANN and RC models display a clear edge over the MLR model in predicting discharge values, which may be explained by their nonlinear mathematical formulations. The RC model performs better than the MLR in terms of performance measures but tends to overestimate peak values, whereas the MLR model has acceptable performance measures but is comparatively poor in that it underestimates peak discharge values. The results show that the ANN model performed better in its training period and dropped in quality more than the SVM model during the testing period; however, this cannot be generalized. The advantage of ANN is its flexibility, even though its implementation is time consuming. SVM and ANN models perform better than RC and MLR in predicting peak discharge values and stage–discharge relationships and are hence more suitable for use in flood simulation studies. The paper shows that conventional MLR and RC models would still be acceptable for most practical problems, but there would be uncertainties associated with their predicted peak values. Improvements by more sophisticated techniques such as ANN or SVM models come at the expense of mathematical complexity, and this may create a barrier to their uptake in practical problems by professional mathematical modelers.
One such barrier is the culture of seeking the best model, but the results presented in this paper show that the best model is unlikely to exist. Specific applications are likely to indicate the applicability of a particular technique to a particular problem. Equally, the paper suggests the need for ensemble modeling and for embracing a culture of uncertainty in time series analysis.
Acknowledgments The data used in this study were downloaded from the web server of the USGS. The authors wish to thank the staff of the USGS who are associated with data observation, processing, and management of USGS Web sites. Thanks are also due to the anonymous reviewers for many useful suggestions.
References
Aggarwal SK, Goel A, Singh VP (2012) Stage and discharge forecasting by SVM and ANN techniques. Water Resour Manag 26:3705–3724
Ajmera TK, Goyal MK (2012) Development of stage–discharge rating curve using model tree and neural networks: an application to Peachtree Creek in Atlanta. Expert Syst Appl 39:5702–5710
ASCE Task Committee on Application of Artificial Neural Networks in Hydrology (2000) Artificial neural networks in hydrology I: preliminary concepts. J Hydrol Eng 5:115–123
Asefa T, Kemblowski MW, Urroz G, McKee M, Khalil A (2004) Support vector based groundwater head observation networks design. Water Resour Res. doi:10.1029/2004WR003304
Baiamonte G, Ferro V (2007) Simple flume for flow measurement in sloping open channel. J Irrig Drain Eng ASCE 133:71–78
Bell VA, Carrington DS, Moore RJ (2001) Comparison of rainfall-runoff models for flood forecasting. Part 2: calibration and evaluation of models. R&D Technical Report W242
Bhagwat PP, Maity R (2012) Multistep-ahead river flow prediction using LS-SVR at daily scale. J Water Resource Prot 4:528–539
Bhattacharya B, Solomatine DP (2005) Neural networks and M5 model trees in modeling water level–discharge relationship. Neurocomputing 63:381–396
Cao LJ, Tay Francis EH (2003) Support vector machine with adaptive parameters in financial time series forecasting. IEEE Trans Neural Netw 14:1506–1518. doi:10.1109/TNN.2003.820556
Clemmens AJ, Wahlin BT (2006) Accuracy of annual volume from current-meter-based stage discharges. J Hydraul Eng-Asce 11:489–501
Deka P, Chandramouli V (2003) A fuzzy neural network model for deriving the river stage discharge relationship. Hydrolog Sci J 48:197–209
Dibike YB, Velickov S, Solomatine D, Abbott MB (2001) Model induction with support vector machines: introduction and applications. J Comput Civil Eng 15:208–216
Duan QY, Sorooshian S, Gupta VK (1994) Optimal use of the SCE-UA global optimization method for calibrating watershed models. J Hydrol 158:265–284
Ghorbani MA, Khatibi R, Hosseini B, Bilgili M (2013) Relative importance of parameters affecting wind speed prediction using artificial neural networks. Theor Appl Climatol 114:107–114
Ghorbani MA, Khatibi R, FazeliFard MH, Naghipour L, Makarynskyy O (2015) Short-term wind speed predictions with machine learning techniques. J Meteorol Atmos Phys. doi:10.1007/s00703-015-0398-9
Goel A, Pal M (2012) Stage–discharge modeling using support vector machines. IJE Trans A Basics. doi:10.5829/idosi.ije.2012.25.01a.01
Guven A, Aytek A (2009) A new approach for stage–discharge relationship: gene-expression programming. J Hydraul Eng ASCE 14:812–820
Habib EH, Meselhe EA (2006) Stage–discharge relations for low-gradient tidal streams using data driven models. J Hydraul Eng ASCE 132:482–492
Hasanpour Kashani M, Daneshfaraz R, Ghorbani MA, Najafi MR, Kisi O (2013) Evaluation of capabilities of different methods for development of a stage–discharge curve of the Kizilirmak River. J Flood Risk Manage. doi:10.1111/jfr3.12064
Haykin S (1999) Neural networks: a comprehensive foundation. Macmillan Publishing, New York
He Z, Wen X, Liu H, Du J (2014) A comparative study of artificial neural network, adaptive neuro fuzzy inference system and support vector machine for forecasting river flow in the semiarid mountain region. J Hydrol 509:379–386
Jain SK, Chalisgaonkar D (2000) Setting up stage–discharge relations using ANN. J Hydraul Eng ASCE 5:428–433
Khan MS, Coulibaly P (2006) Application of support vector machine in lake water level prediction. J Hydrol Eng 11:199–205
Khatibi R (2012) Evolutionary transitions in mathematical modelling complexity by using evolutionary systemic modelling: formulating a vision. In: Lynch JR, Derek T, Williamson DT (eds) Chapter 5: Natural selection: biological processes, theory and role in evolution. https://www.novapublishers.com/catalog/product_info.php?products_id=41527 (this may be accessed in: https://www.researchgate.net/publication/285860237_EVOLUTIONARY_TRANSITIONS_IN_MATHEMATICAL_MODELING_COMPLEXITY_BY_EVOLUTIONARY_SYSTEMICS)
Khatibi R, Gouldby B, Sayers P, McArthur J, Roberts I, Grime A, Akhondi-asl A (2003) Improving coastal flood forecasting services of the Environment Agency. In: McInnes RG (ed) Proc. of the 1st International Conference on Coastal Management, Brighton, UK, pp 70–82
Khatibi R, Sivakumar B, Ghorbani MA, Kisi O, Kocak K, Farsadi Zadeh D (2012) Investigating chaos in river stage and discharge time series. J Hydrol 414–415:108–117
Kisi O, Cobaner M (2009) Modeling river stage–discharge relationships using different neural network computing techniques. Clean Soil Air Water 37:160–169
Lin JY, Cheng CT, Chau KW (2006) Using support vector machines for long-term discharge prediction. Hydrolog Sci J 51:599–612
Liong SY, Sivapragasam C (2002) Flood stage forecasting with support vector machines. J Am Water Resour 38:173–186
Misra D, Oommen T, Agarwal A, Mishra SK, Thompson AM (2009) Application and analysis of support vector machine based simulation for runoff and sediment yield. Biosyst Eng 103:527–535
Nayak PC, Sudheer KP, Rangan DM, Ramasastri KS (2005) Short-term flood forecasting with a neurofuzzy model. Water Resour Res 41:W04004. doi:10.1029/2004WR003562
Rasouli K, Hsieh WW, Cannon AJ (2012) Daily streamflow forecasting by machine learning methods with weather and climate inputs. J Hydrol 414–415:284–293
Samsudin R, Saad P, Shabri A (2011) River flow time series using least squares support vector machines. Hydrol Earth Syst Sci 15:1835–1852. doi:10.5194/hess-15-1835
Sene K, Tilford K (2004) Review of transfer function modelling for fluvial flood forecasting. R&D Technical Report W5C-013/6/TR
Sene KJ, Tilford KA, Khatibi R (2004) Rainfall runoff flood forecasting models for fast response catchments. In: Proc. IMA/flood risk net conference on flood risk assessment, Bath, September 2004
Sivapragasam C, Muttil N (2005) Discharge rating curve extension: a new approach. Water Resour Manag 19:505–520
Sudheer KP, Jain SK (2003) Radial basis function neural network for modeling rating curves. J Hydrol Eng 8:161–164
Tawfik M, Ibrahim A, Fahmy H (1997) Hysteresis sensitive neural network for modeling rating curves. J Comput Civil Eng 11:206–211
Tilford KA, Sene KJ, Khatibi R (2007) Flood forecasting model selection: a new approach. In: Begum S, Hall JW, Stive MJF (eds) Flooding in Europe: challenges and developments in flood risk management, vol 25, pp 401–416. (http://www.springer.com/earth+sciences+and+geography/hydrogeology/book/978-1-4020-4199-0)
The MathWorks Inc. (2012) Matlab the language of technical computing. http://www.mathworks.nl/products/matlab/. Retrieved 4 Sept 2012
Vapnik VN (1995) The nature of statistical learning theory. Springer, New York
Vapnik VN (1998) Statistical learning theory. Wiley, New York
Wang W, Men C, Lu W (2008) Online prediction model based on support vector machine. Neurocomputing 71:550–558
Yoon H, Jun SC, Hyun Y, Bae GO, Lee KK (2011) A comparative study of artificial neural networks and support vector machines for predicting groundwater levels in a coastal aquifer. J Hydrol 396:128–138
Yu PS, Chen ST, Chang IF (2006) Support vector regression for real-time flood stage forecasting. J Hydrol 328:704–716