Neural Comput & Applic DOI 10.1007/s00521-015-2055-0
ORIGINAL ARTICLE
Performance evaluation of hybrid Wavelet-ANN and Wavelet-ANFIS models for estimating evapotranspiration in arid regions of India Amit Prakash Patil1 • Paresh Chandra Deka1
Received: 26 September 2014 / Accepted: 26 August 2015 © The Natural Computing Applications Forum 2015
Abstract This paper evaluates the ability of the wavelet transform to improve the accuracy of artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS) models. The performance of hybrid Wavelet-ANN and Wavelet-ANFIS models for estimating daily evapotranspiration in arid regions was evaluated. Prior to the development of the models, the gamma test was used to identify the best input combinations that could be used under a limited data scenario. Performance of the proposed hybrid models was compared to that of ANN, ANFIS, and the conventionally used Hargreaves equation. The results revealed that using the wavelet transform as a data preprocessing technique enhanced the efficiency of the ANN and ANFIS models. Wavelet-ANN and Wavelet-ANFIS performed better than the other models. Better handling of wavelet-decomposed input variables enabled the Wavelet-ANN models to perform slightly better than the Wavelet-ANFIS models. W-ANN2 (RMSE = 0.632 mm/day and R = 0.96) was found to be the best model for estimating daily evapotranspiration in arid regions. The proposed W-ANN2 model used second-level db3 wavelet-decomposed subseries of temperature and previous day evapotranspiration values as inputs. The study concludes that hybrid Wavelet-ANN and Wavelet-ANFIS models can be effectively used for modeling evapotranspiration.
Keywords Evapotranspiration · Arid region · Limited data · Gamma test · Wavelet transform · ANN · ANFIS
Corresponding author: Amit Prakash Patil, [email protected]
1 Department of Applied Mechanics and Hydraulics, National Institute of Technology Karnataka, Surathkal, Mangalore, India
1 Introduction
Managing irrigation systems in arid and semiarid climates is a difficult task owing to the limited availability of water resources and overexploitation of the existing ones. As evapotranspiration plays a vital role in determining crop water requirement, accurate measurement of evapotranspiration becomes essential. Normally, lysimeters are used for direct measurement of evapotranspiration. However, high operating costs and the need for accuracy in measurements have limited the use of lysimeters [1]. During the twentieth century, researchers developed various physical, empirical, and semiempirical equations that used meteorological variables to estimate reference crop evapotranspiration (ETo). The Food and Agriculture Organization of the United Nations (FAO) has accepted the FAO Penman–Monteith (FAO-56PM) equation as the standard equation to estimate ETo [2]. The large number of climatic variables it requires has limited the use of the FAO-56PM equation in developing countries such as India, where the availability of these records has often been minimal. Additionally, the performance of empirical equations using fewer climatic variables [3–6] is often found to be inconsistent when tested under different climatic conditions [7–14]. In recent years, the use of artificial intelligence (AI) techniques such as ANN and ANFIS for modeling intricate hydrological processes has increased significantly [15–18]. ANN models are found to be good at estimating evapotranspiration under different climatic and data availability conditions [19–24]. Sudheer et al. [25] examined the potential of ANN models in estimating actual crop evapotranspiration from limited climatic data. Zanetti et al. [26] proposed an ANN model that used only maximum and minimum air temperature data to estimate ETo. Rahimikhoob [27] compared the performance of Hargreaves and
ANN methods for estimating ETo in semiarid environments. Tabari and Hosseinzadeh Talaee [28] evaluated the performance of multilayer perceptron models for estimating ETo in semiarid regions. Many studies conclude that, compared to ANN and empirical equations, ANFIS models are better at estimating ETo. For a given set of input–output pairs, an ANFIS model uses a neural network architecture to construct fuzzy "if–then" rules with appropriate membership functions. Shiri et al. [29] developed a global, generalized neuro-fuzzy model using datasets from humid and non-humid regions. Karimaldini et al. [30] investigated the potential of ANFIS in modeling daily evapotranspiration using limited weather data for arid conditions. Cobaner [31] compared the performance of grid partition and subtractive clustering-based fuzzy inference systems. The comparison showed that the subtractive clustering-based fuzzy inference system was more accurate and required fewer computations. Tabari et al. [32] studied the potential of support vector machines, ANFIS, multiple linear regression, and multiple nonlinear regression for estimating ETo. Performance of any AI model largely depends on the user's understanding of the model, along with the quantity and quality of inputs presented to it. Evapotranspiration, like many other hydrological processes, operates over a large range of scales varying from 1 h to several months, leading to a nonlinear and non-stationary dataset. ANN and ANFIS models alone may not be able to cope with these characteristics of the dataset if input and/or output data are not preprocessed. Furthermore, there is also a need to enhance the performance of these models. Use of hybrid models may help in increasing the accuracy of ANN and ANFIS models. Recently, the wavelet transform has been widely used for time series analysis [33].
Wavelet transform decomposes original time series into wavelet subseries of different resolutions in the time domain [34]. Wavelet preprocessed data may increase the efficiency of an AI model by providing useful information about original dataset on various resolution levels [35]. Izadifar [36] used cross wavelet analysis to explore correlation between evapotranspiration and meteorological variables. Cross wavelet analysis was also used for input determination of AI models. Falamarzi et al. [37] proposed wavelet neural network model for forecasting daily ETo values from temperature and wind speed data. The proposed model used neural network with one hidden layer and wavelet function as an activation function. Wang and Luo [38] used wavelet transform to decompose ETo time series into different frequency components. The wavelet-decomposed subseries were then used as inputs to ANN models for forecasting ETo.
This paper evaluates the performance of hybrid Wavelet-ANN and Wavelet-ANFIS models for estimating evapotranspiration under a limited data scenario in arid regions of India. Hybrid Wavelet-ANN and Wavelet-ANFIS models were developed by preprocessing the input variables with the discrete wavelet transform and presenting the decomposed subseries to ANN and ANFIS models. To the best of the authors' knowledge, only a few studies have been undertaken to compare the performance of different wavelet-based hybrid AI models. Additionally, in most of the studies undertaken, the inputs for evapotranspiration models are chosen according to the similarity between the empirical models and the defined AI model. However, different input combinations work well for different sites. In this study, the gamma test was employed to identify the input combination that yields a more efficient model under a limited data scenario. Use of the proposed hybrid models together with the gamma test for identifying input variables would help in increasing the accuracy of estimating evapotranspiration under a limited data scenario.
2 Methodology
This section introduces the gamma test, which was used to identify the inputs for the proposed models. The discrete wavelet transform, the data preprocessing technique, is then described. A short introduction to ANN and ANFIS is also included.
2.1 Gamma test
Selection of proper inputs plays a vital role in improving the efficiency of any AI model. Informative inputs must be selected in order to extract an accurate model from the available database. A nonlinear modeling and analysis technique called the gamma test (GT) [39] can be used to evaluate the efficiency of different input combinations in modeling the output function. GT allows us to measure the extent to which a smooth relationship can be established between input and output without relying on information about a specific machine-learning model [40]. The GT is a nonlinear continuous modeling and analysis tool, which estimates the minimum mean squared error achievable when modeling unseen data and allows examination of the input–output relationship in a numerical dataset. The basic idea of the gamma test is as follows. Suppose a set of data observations $\{(X_i, Y_i),\ 1 \le i \le M\}$ exists, where the input vectors $X_i \in \mathbb{R}^m$ are m-dimensional (with a record length of M) and confined to some closed bounded set $C \subset \mathbb{R}^m$, with corresponding scalar outputs $Y_i \in \mathbb{R}$. The relationship between input X and output Y can be given as:
$$Y = f(X_1, \ldots, X_m) + r \qquad (1)$$
where f is a smooth function and r is a random variable that represents noise. The gamma test estimates how much of the variance of Y is attributable to the smooth function f and how much to the random noise variable r. The gamma test does not measure the fit of the data to a line, so it is not tied to the shape of the function; it distinguishes noise from smooth relationships [15]. Gamma test scores give an indication of the unaccountable variance that exists between input and output datasets. It is a very useful statistic for comparing the performance of different input variables in modeling the desired output. GT has been used in some hydrological studies to select the most efficient input combinations for different AI models [41]. As the objective of this paper was to evaluate the performance of AI models for estimating ETo under a limited data scenario, the gamma test was used to determine the input variables that are capable of modeling the process of evapotranspiration more efficiently.
2.2 Discrete wavelet transform
The wavelet transform is used mainly for the analysis of non-stationary time series. In the wavelet transform, the original time series is broken down into "wavelets", which are scaled and shifted versions of the mother wavelet. The main advantage of the wavelet transform is that it provides both time and frequency information simultaneously. The traditional Fourier transform provides no time localization, and the short-time Fourier transform (STFT) provides it only at a single fixed resolution. The continuous wavelet transform (CWT) was developed to overcome the drawbacks of the STFT by changing the scale of the analysis window and shifting the window in time. The lower scales are used to follow the rapidly changing (high frequency) details of the signal, whereas the higher scales indicate the slowly changing (low frequency) component of the signal. The CWT of a continuous time series signal x(t) is given by Eq. (2).
$$\mathrm{CWT}(s, \tau) = \frac{1}{\sqrt{s}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left(\frac{t - \tau}{s}\right) dt \qquad (2)$$
In Eq. (2), the transformed signal is a function of the scale s and the translation factor $\tau$ of the function $\psi(t)$. $\psi^{*}$ is the complex conjugate of the transforming function $\psi(t)$, which is usually called the mother wavelet [42]. Mathematically, the mother wavelet can be defined as
$$\int_{-\infty}^{\infty} \psi(t)\, dt = 0$$
Calculating wavelet coefficients at every possible scale is a fair amount of work and generates voluminous data, whereas discrete wavelet transform (DWT) is simple and requires less time. Hence, in this study, DWT was used to
decompose the input datasets. DWT uses a digital filtering system where the original time series is passed through a high-pass and a low-pass filter. The original signal is separated into a low frequency approximation (containing trends) and high frequency details (containing fast events). The discrete wavelet representation can be given as
$$\psi_{m,n}(t) = s_0^{-m/2}\, \psi\!\left(\frac{t - n \tau_0 s_0^{m}}{s_0^{m}}\right) \qquad (3)$$
where s is the wavelet scale, t is the time, and $\tau$ is the translation parameter, whereas m and n are integers controlling the scale and the time translation, respectively. The scale $s_0$ is a fixed dilation step greater than one, and $\tau_0$ is the location parameter, which must be greater than zero. For this study, the most obvious and simplest choices, $s_0 = 2$ and $\tau_0 = 1$ (dyadic grid arrangement), were used.
2.3 Artificial neural networks
ANN, an information processing system, simulates the ability of a human brain to sort out patterns and learn from trial and error. It has the ability to extract relationships that exist within the data. Typically, an ANN architecture consists of a series of processing elements called neurons. Neurons are arranged in layers, namely an input layer, an output layer, and one or more hidden layers. Each layer is fully connected to the next layer by interconnection weights. Weights are first randomly assigned to all of the connections in the network. These initial values of the weights are then corrected during a training (learning) process. The weights in the hidden and output layer neurons can be calculated using Eqs. (4) and (5), respectively.
$$w(N + 1) = w(N) - \eta\, \delta\, \varphi \qquad (4)$$
$$w(N + 1) = w(N) + \eta\, x \sum_{q=1}^{r} \delta_q \qquad (5)$$
where w is the weight, N is the iteration number, x is the input value, $\eta$ is the learning rate, and $\varphi$ is the output. In the equations above, $\delta$ is defined as $2\varepsilon_q\, \partial\varphi / \partial I$, where I is the sum of the weighted inputs, q is the neuron index of the output layer, and $\varepsilon_q$ is the error signal. This training method is the standard backpropagation training algorithm, wherein the estimated outputs are first compared to the known outputs, and the errors that occur are then backpropagated to obtain the weight adjustments necessary to minimize the errors.
2.4 Adaptive neuro-fuzzy inference systems
Jang [43] proposed a method that used a neural network learning algorithm for constructing a set of fuzzy if–then rules from stipulated input–output pairs. Fundamentally, ANFIS is a functional equivalent of fuzzy inference
systems, endowed with neural learning capabilities. An ANFIS model combines the transparent and linguistic representation of a fuzzy system with the learning ability of ANN. This allows it to be trained to perform input–output mapping like an ANN model. ANFIS comes with the additional benefit of being able to provide the set of rules on which the model is based. To build up a fuzzy system, linguistic variables are used in place of or in addition to the numerical variables. Then, "if–then" rules are formed to characterize simple relationships between fuzzy variables. In a first-order Sugeno system, a typical rule set with two fuzzy "if–then" rules can be expressed as
Rule 1: If x is A1 and y is B1, then f1 = p1 x + q1 y + r1
Rule 2: If x is A2 and y is B2, then f2 = p2 x + q2 y + r2
where x and y are the input variables and f1 and f2 are the rule outputs. The A and B terms denote linguistic terms of the precondition part, each with its membership function. The if-part of a rule ("x is A1 and y is B1") is called the antecedent or premise, while the then-part ("f1 = p1 x + q1 y + r1") is called the consequent or conclusion. The p, q, and r are the consequent parameters. ANFIS consists of five layers, and the basic functions of the layers are input, fuzzification, rule inference, normalization, and defuzzification. A detailed description of ANFIS can be found in Jang [43].
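The two-rule first-order Sugeno system described above can be evaluated by hand in a few lines. The following is a minimal sketch, not the authors' implementation: the function names, the Gaussian membership parameters, the product used for "and", and the consequent values are all illustrative assumptions.

```python
import math


def gaussian_mf(value, center, sigma):
    """Gaussian membership grade of `value` (assumed membership shape)."""
    return math.exp(-0.5 * ((value - center) / sigma) ** 2)


def sugeno_two_rules(x, y, premise, consequent):
    """Evaluate a two-rule first-order Sugeno system for inputs x and y.

    premise:    list of two dicts, each {"A": (center, sigma), "B": (center, sigma)}
    consequent: list of two tuples (p, q, r) so that f_i = p*x + q*y + r
    All parameter values are hypothetical; in ANFIS they are learned.
    """
    # Fuzzification and rule inference: firing strength of each rule,
    # with the fuzzy "and" taken as a product (an assumption).
    w = [
        gaussian_mf(x, *premise[i]["A"]) * gaussian_mf(y, *premise[i]["B"])
        for i in range(2)
    ]
    # Normalization of the firing strengths.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Rule outputs f_i = p_i x + q_i y + r_i.
    f = [p * x + q * y + r for (p, q, r) in consequent]
    # Defuzzification: weighted sum of the rule outputs.
    return sum(wb * fi for wb, fi in zip(w_bar, f))


# Hypothetical parameters for illustration only.
premise = [
    {"A": (0.0, 1.0), "B": (0.0, 1.0)},
    {"A": (1.0, 1.0), "B": (1.0, 1.0)},
]
consequent = [(1.0, 1.0, 0.0), (2.0, 0.0, 1.0)]
print(sugeno_two_rules(0.5, 0.5, premise, consequent))
```

At the midpoint between the two rule centers both rules fire equally, so the output is the plain average of f1 and f2; elsewhere the nearer rule dominates.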
3 Preparation of dataset
The climatic database was obtained from the Jodhpur weather station (26°18′ N latitude and 73°01′ E longitude) operated and maintained by the India Meteorological Department (IMD), Government of India. The Jodhpur district, located in the Great Indian Desert, is classified as an arid region (BW) according to the Köppen climate classification. Extreme heat in summer and cold winters are characteristic of a desert, and Jodhpur is no exception: the temperature may vary from a maximum of about 49 °C in summer to a minimum of about 1 °C in winter. The Jodhpur district receives an annual average rainfall of about 326 mm, and the rainy days are often limited to 30 days. The soil here is mainly classified as sandy and loamy. The Jodhpur weather station is equipped with standard ground-based instruments such as a sunshine recorder, alcohol and wet-bulb thermometers, a cup anemometer, and mercury thermometers. The weather records are transmitted from the weather stations to the IMD data center in Pune, where data archives are maintained. The datasets are scrutinized and subjected to quality checks prior to supply, but were further screened and checked for integrity as per the procedure described in FAO-56. The daily dataset obtained is composed of maximum temperature (Tmax), minimum temperature (Tmin), maximum relative humidity (RHmax), minimum relative humidity (RHmin), 24-h wind speed at 2 m height (U2), and actual sunshine hours (n). This study used 4 years of data, of which the first 3 years were used for training and the remaining 1 year for testing. Table 1 presents the statistics of the training and testing datasets. In the table, Xmean, Xmax, Xmin, Sx, Cv, and Csx denote the mean, maximum, minimum, standard deviation, coefficient of variation, and skewness, respectively. For the datasets used, relative humidity shows a very high standard deviation. The coefficient of variation is significant for both wind speed and relative humidity. The temperature and relative sunshine datasets show a negatively skewed distribution.
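Table 1 Statistical properties of training and testing dataset

| Dataset | Statistics | Tmax (°C) | Tmin (°C) | n/N | RHmax (%) | RHmin (%) | U2 (m/s) | ETo (mm/day) |
|---|---|---|---|---|---|---|---|---|
| Training | Xmean | 34.04 | 20.50 | 0.77 | 53.09 | 24.98 | 2.22 | 5.83 |
| Training | Xmax | 45.10 | 32.30 | 1.00 | 99.00 | 92.00 | 7.19 | 11.93 |
| Training | Xmin | 18.80 | 6.10 | 0.01 | 8.00 | 2.00 | 0.39 | 1.98 |
| Training | Sx | 5.84 | 6.52 | 0.21 | 20.91 | 16.53 | 1.15 | 2.24 |
| Training | Cv | 0.17 | 0.32 | 0.27 | 0.39 | 0.66 | 0.52 | 0.38 |
| Training | Csx | -0.47 | -0.47 | -1.60 | 0.04 | 1.02 | 0.97 | 0.46 |
| Testing | Xmean | 34.13 | 20.52 | 0.78 | 53.01 | 25.30 | 2.15 | 5.73 |
| Testing | Xmax | 44.20 | 30.30 | 0.98 | 98.00 | 95.00 | 5.67 | 11.79 |
| Testing | Xmin | 18.60 | 4.20 | 0.01 | 15.00 | 1.00 | 0.36 | 2.08 |
| Testing | Sx | 5.54 | 6.48 | 0.20 | 20.43 | 16.83 | 1.06 | 2.20 |
| Testing | Cv | 0.16 | 0.32 | 0.26 | 0.39 | 0.67 | 0.49 | 0.38 |
| Testing | Csx | -0.59 | -0.11 | -1.85 | 0.07 | 0.85 | 0.89 | 0.31 |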
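The summary statistics reported in Table 1 can be reproduced for any daily series with a short routine. This is a minimal sketch under stated assumptions: the function name `describe` is hypothetical, the standard deviation uses the sample (n - 1) form, and the skewness is the simple moment ratio; the paper does not state which estimators were used.

```python
import math


def describe(values):
    """Return the Table 1 style statistics for one variable.

    Assumptions (not stated in the paper): sample standard deviation
    with n - 1 in the denominator, and moment-based skewness m3 / m2**1.5.
    """
    n = len(values)
    mean = sum(values) / n
    sq_dev = sum((v - mean) ** 2 for v in values)
    sd = math.sqrt(sq_dev / (n - 1))
    m2 = sq_dev / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return {
        "Xmean": mean,
        "Xmax": max(values),
        "Xmin": min(values),
        "Sx": sd,
        "Cv": sd / mean,          # coefficient of variation
        "Csx": m3 / m2 ** 1.5,    # skewness
    }


# Made-up values for illustration; not data from the study.
print(describe([31.2, 34.5, 36.1, 29.8, 40.3, 38.7]))
```

A negative Csx, as reported for temperature and relative sunshine, indicates a longer tail toward low values.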
3.1 Selection of inputs by using gamma test
An essential task in developing an AI model is to determine the independent (input) and dependent (output) variables. As this study emphasizes developing AI models for a limited data scenario, the most influential climatic parameters in the process of evapotranspiration (for the proposed site) were used as inputs to the AI models. Choosing influential input variables would provide a better predictive model under a limited data scenario and make data collection and processing easier. A gamma test was employed to identify the input variable that has the least noise variance, that is, the least variance that cannot be modeled by any smooth model. The variables tested were maximum and minimum temperature (T), maximum and minimum relative humidity (RH), wind speed at 2 m elevation (U2), relative sunshine duration (n/N), and 1 day antecedent ETo (ETo t - 1) values. GT scores for each input were calculated separately, and the climatic parameter that provided the lowest GT score was taken as the most competent input for estimating ETo under a limited data scenario. The selected inputs were further used to model ETo using ANN, ANFIS, Wavelet-ANN, and Wavelet-ANFIS models. As lysimeter readings were not available for the study area, ETo values obtained from the FAO-56 PM equation were used as output for all the models. Computation of all the data necessary to calculate ETo was done according to the procedure prescribed in chapter 3 of FAO-56. Solar radiation is an important parameter for estimating ETo [44]. As solar radiation (Rs) measuring instruments such as pyranometers were not available, Rs was estimated using the Angstrom formula [45], which relates solar radiation to extraterrestrial radiation and relative sunshine duration (n/N).
3.2 ANN model development
A feed-forward backpropagation network with the Levenberg–Marquardt (LM) algorithm for weight optimization was used in this study.
The LM-trained feed-forward network has shown good results in earlier studies. Following the recommendations given by many researchers, all the ANN models in this study used only one hidden layer. Different ANN architectures for modeling daily evapotranspiration were tested by varying the number of neurons in the hidden layer. A trial-and-error procedure was adopted to determine the optimum number of neurons in the hidden layer. Performance of the sigmoid activation functions in the hidden layer with a linear activation function at the output node was also evaluated. Data normalization allows the initial weights to be allocated according to the distribution and not the magnitude of the data. Hence, the inputs and outputs were first normalized using min–max normalization. Datasets were normalized from 0 to 1 or from -1 to 1 according to the activation function used.
3.3 ANFIS model development
The ANFIS models used the same datasets as those used by the ANN models. A hybrid approach combining least square error and backpropagation was adopted to develop the ANFIS models. Performance of two, three, and four membership functions (MFs) with triangular, trapezoidal, Gaussian, and spline shapes was evaluated in this study.
3.4 Wavelet-ANN and Wavelet-ANFIS models
For developing hybrid Wavelet-ANN and Wavelet-ANFIS models, input datasets were first decomposed into subseries using DWT. Then, these wavelet-decomposed subseries were used as inputs to improve the efficiency of ANN and ANFIS models. For the wavelet analysis, selection of an appropriate mother wavelet function is crucial. As the dyadic wavelet transform was used in the study, orthogonal mother wavelets were used for wavelet decomposition. The pattern of the datasets matched the Daubechies wavelets (db); hence, the performance of various db wavelets was evaluated. Daubechies wavelets are a family of orthogonal wavelets defining a discrete wavelet transform. They are further subclassified according to the number of vanishing moments they have. For db family wavelets, the smoothness of the scaling and wavelet functions increases with the number of vanishing moments. The Daubechies wavelets db2, db3, db4, and db5 used in the study are represented in Fig. 1. Another important concern in the use of the wavelet transform is the selection of the optimum decomposition level. The effect of various decomposition levels on the performance of the W-ANN model was also investigated. The maximum level of decomposition was selected using the formula L = int[log(N)], with the logarithm taken to base 10, where L and N are the decomposition level and the number of time series data, respectively. In this study, N = 1453, so L = 3.
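To make the decomposition step concrete, the sketch below computes the maximum level from L = int[log10(N)] and performs a multilevel DWT by repeated low-pass and high-pass filtering with downsampling by two. It is an illustration under stated assumptions, not the authors' code: it uses the db2 filter (one of the wavelets evaluated, chosen here because its coefficients have a closed form) rather than the db3 wavelet the study finally selected, it treats the series as periodic at the boundaries, and it returns wavelet coefficients rather than reconstructed subseries. All function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT3 = math.sqrt(3.0)

# db2 low-pass (scaling) filter in closed form.
H = [
    (1 + SQRT3) / (4 * SQRT2),
    (3 + SQRT3) / (4 * SQRT2),
    (3 - SQRT3) / (4 * SQRT2),
    (1 - SQRT3) / (4 * SQRT2),
]
# Matching high-pass (wavelet) filter via the quadrature-mirror relation.
G = [(-1) ** j * H[len(H) - 1 - j] for j in range(len(H))]


def max_level(n):
    """Maximum decomposition level L = int[log10(N)]."""
    return int(math.log10(n))


def dwt_step(x):
    """One DWT level: filter, then keep every second sample (periodic ends)."""
    n = len(x)
    if n % 2:
        raise ValueError("series length must be even")
    approx = [
        sum(H[j] * x[(2 * k + j) % n] for j in range(len(H)))
        for k in range(n // 2)
    ]
    detail = [
        sum(G[j] * x[(2 * k + j) % n] for j in range(len(G)))
        for k in range(n // 2)
    ]
    return approx, detail


def decompose(x, levels):
    """Return (A_levels, [D1, D2, ...]) coefficient lists."""
    approx = list(x)
    details = []
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        details.append(detail)
    return approx, details


print(max_level(1453))
series = [float((i * 7) % 5) + 0.1 * i for i in range(16)]  # made-up data
a2, (d1, d2) = decompose(series, 2)
print(len(a2), len(d1), len(d2))
```

Because the db2 filters are orthogonal, the summed squares of A2, D1, and D2 equal the summed squares of the input, which is a convenient sanity check for any implementation of this step.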
The time series was decomposed into one, two, and three levels using the mother wavelets mentioned earlier. Figure 2 presents a second-level decomposition of maximum temperature (Tmax) dataset using db3 wavelet (approximation at level 2 and details at levels 1 and 2). 3.5 Conventional model Nandagiri and Kovoor [9] compared the performance of various empirical equations over a range of different Indian climates. They found that the Hargreaves equation works best for estimating evapotranspiration in arid regions of India. Hence, the results of the proposed AI models were
Fig. 1 Wavelets from the Daubechies family
compared to the temperature-based Hargreaves equation, which is given as
$$ET_o = 0.0023\, (T_{mean} + 17.8)\, (T_{max} - T_{min})^{0.5}\, R_a \qquad (6)$$
where ETo is the estimated reference crop evapotranspiration, Tmean is the average of maximum temperature (Tmax) and minimum temperature (Tmin), and Ra is the extraterrestrial radiation calculated as per the procedure given in FAO-56. 3.6 Model performances Correlation coefficient (R) and root mean square error (RMSE) were used to evaluate the model accuracies. R shows the possible linear association between two variables, while RMSE measures the difference between values predicted by the model and the values actually observed. Scatter diagrams were also used to evaluate the accuracies of the models.
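The Hargreaves estimate of Eq. (6) and the two accuracy measures of Sect. 3.6 can be sketched as follows. The function names are hypothetical, and Ra is assumed to be supplied already converted to equivalent evaporation in mm/day so that Eq. (6) yields ETo in mm/day; the sample values are made up for illustration and are not data from the study.

```python
import math


def hargreaves_eto(tmax, tmin, ra):
    """Eq. (6): daily ETo from Tmax, Tmin (deg C) and Ra (assumed mm/day)."""
    tmean = (tmax + tmin) / 2.0
    return 0.0023 * (tmean + 17.8) * math.sqrt(tmax - tmin) * ra


def rmse(observed, estimated):
    """Root mean square error between observed and estimated values."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)


def pearson_r(observed, estimated):
    """Correlation coefficient R between observed and estimated values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / math.sqrt(var_o * var_e)


# Made-up daily records: (Tmax, Tmin, Ra) and reference ETo values.
days = [(40.0, 20.0, 16.0), (35.0, 18.0, 15.0), (30.0, 15.0, 13.0)]
reference = [8.1, 6.9, 5.4]
estimates = [hargreaves_eto(*d) for d in days]
print(rmse(reference, estimates), pearson_r(reference, estimates))
```

RMSE carries the units of ETo (mm/day), whereas R is dimensionless and insensitive to a constant bias, which is why the two are reported together.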
4 Results and discussion
This section analyzes the results derived from the gamma test and from the ANN, ANFIS, hybrid Wavelet-ANN, and Wavelet-ANFIS models. Table 2 presents the GT scores of individual input variables for training, testing, and the entire dataset. It was found that in spite of testing different
lengths of datasets, the GT scores for all three datasets were quite similar. It was further observed that previous day evapotranspiration values (GT score = 0.05) and temperature (GT score = 0.056) explained the most variance in the evapotranspiration process of the arid region, whereas relative humidity (GT score = 0.197) explained the least variance in the output function. These results are consistent with those of researchers who found that temperature-based equations are good at estimating ETo in arid regions [9]. After analyzing the gamma test results, it was decided to study two different input combinations for estimating daily ETo. The first combination used maximum temperature (Tmax), minimum temperature (Tmin), and extraterrestrial radiation (Ra). The second input combination used Tmax, Tmin, Ra, and the previous day evapotranspiration value (ETo t - 1) as the fourth input. Ra for each day of the year and for a particular latitude was estimated from the solar constant, the solar declination, and the day of the year. Performance measures of the various AI models for the training and testing periods are presented in Tables 3 and 4. As it is difficult to include the results of all models, only the results of the best models for each modeling paradigm are presented in this paper. From Tables 3 and 4, it is evident that all the AI models estimate daily ETo better than the conventionally used Hargreaves equation. Further, the results indicate the presence of some amount of residual
Fig. 2 Wavelet decomposition of maximum temperature time series decomposed using db3 wavelet (level 2)
Table 2 Results of gamma test for training, testing, and entire dataset

| Input variables | Gamma score (entire set) | Gamma score (training) | Gamma score (testing) |
|---|---|---|---|
| T | 0.056 | 0.056 | 0.050 |
| RH | 0.197 | 0.192 | 0.193 |
| U2 | 0.125 | 0.132 | 0.106 |
| n/N | 0.082 | 0.095 | 0.087 |
| ETo t - 1 | 0.050 | 0.053 | 0.049 |
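Gamma scores such as those in Table 2 are commonly obtained with the near-neighbor procedure sketched below: for each of the first p nearest neighbors, the mean squared input distance (delta) and half the mean squared output difference (gamma) are computed, and the intercept of the regression line of gamma on delta is taken as the estimate of the noise variance. This is a minimal brute-force sketch, not the software used in the study; the function names, the choice p = 10, and the synthetic sine data are assumptions, and the scores in Table 2 may have been scaled or normalized differently.

```python
import math
import random


def gamma_test(X, y, p=10):
    """Return (Gamma, slope) of the least-squares line gamma = slope*delta + Gamma."""
    M = len(X)
    delta = [0.0] * p
    gamma = [0.0] * p
    for i in range(M):
        # Squared distances from point i to every other point, nearest first.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(X[i], X[j])), j)
            for j in range(M)
            if j != i
        )
        for k in range(p):
            d2, j = dists[k]
            delta[k] += d2 / M
            gamma[k] += (y[j] - y[i]) ** 2 / (2.0 * M)
    # Ordinary least-squares fit through the p (delta, gamma) pairs.
    mean_d = sum(delta) / p
    mean_g = sum(gamma) / p
    sxx = sum((d - mean_d) ** 2 for d in delta)
    sxy = sum((d - mean_d) * (g - mean_g) for d, g in zip(delta, gamma))
    slope = sxy / sxx
    return mean_g - slope * mean_d, slope


def demo_data(m=200, noise_sd=0.0, seed=0):
    """Synthetic y = sin(x) + noise on a regular grid (illustration only)."""
    rng = random.Random(seed)
    X = [[2.0 * math.pi * i / (m - 1)] for i in range(m)]
    y = [math.sin(x[0]) + rng.gauss(0.0, noise_sd) for x in X]
    return X, y


X_clean, y_clean = demo_data(noise_sd=0.0)
X_noisy, y_noisy = demo_data(noise_sd=0.2)
print(gamma_test(X_clean, y_clean)[0], gamma_test(X_noisy, y_noisy)[0])
```

For the noise-free series the intercept is close to zero, while for the noisy series it approaches the noise variance (0.04 here), which is the sense in which a lower score marks a more informative input.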
variance that cannot be explained by the limited input variables used to model the process. In the first stage of the study, a number of ANN and ANFIS models were tested for estimating daily ETo. While developing the ANN models, a trial-and-error method was used to decide the optimum number of neurons in the hidden layer. Rules of thumb [46–48] relating the number of inputs to the number of neurons in the hidden layer did not work well, making trial and error a better option for determining the optimum number of neurons in the hidden layer. While developing the ANFIS models, the Gaussian membership function was found to work best for the given input combinations. It was also observed that the ANFIS model with three membership functions performed well for both input combinations. In some instances, it was observed that increasing the number of membership functions deteriorated the ANFIS model performance. The ANN1 and ANFIS1 models used the same inputs as the Hargreaves model. Performance of the ANN1 and ANFIS1 models was better than that of the Hargreaves model (RMSE = 1.283, R = 0.82 during training and RMSE = 1.452, R = 0.83 during testing) for estimating daily ETo. Further, it was observed that the performance of both models improved when the previous day ETo (ETo t - 1) value was added as an input. It is evident from Tables 3 and 4 that ANFIS models are slightly better than ANN models at estimating daily ETo values. ANFIS2 (RMSE = 0.753, R = 0.94 during training and RMSE = 0.821, R = 0.94 during testing) was found to be the best model when raw (not decomposed by wavelet transform) datasets were used to model ETo. The results obtained show that ANFIS is more accurate in estimating evapotranspiration. This could be because ANFIS models use ANN to optimize fuzzy logic models, resulting in a good model.
Table 3 Comparison of Hargreaves, ANN, and W-ANN models for estimating ETo during training and testing period

| Name | Daily inputs | Structure | Epochs | Training RMSE (mm/day) | Training R | Testing RMSE (mm/day) | Testing R |
|---|---|---|---|---|---|---|---|
| Hargreaves | Tmax, Tmin, Ra | – | – | 1.283 | 0.82 | 1.452 | 0.83 |
| ANN1 | Tmax, Tmin, Ra | 3-4-1 | 14 | 0.935 | 0.91 | 0.985 | 0.91 |
| ANN2 | Tmax, Tmin, Ra, ETo t - 1 | 4-5-1 | 17 | 0.762 | 0.94 | 0.861 | 0.93 |
| W-ANN1 | Tmax (A2, D1, D2), Tmin (A2, D1, D2), Ra | 7-8-1 | 24 | 0.848 | 0.93 | 0.981 | 0.91 |
| W-ANN2 | Tmax (A2, D1, D2), Tmin (A2, D1, D2), Ra, ETo t - 1 (A2, D1, D2) | 10-13-1 | 35 | 0.586 | 0.96 | 0.632 | 0.96 |
Table 4 Comparison of ANFIS and W-ANFIS models for training and testing period

| Name | Daily inputs | Training RMSE (mm/day) | Training R | Testing RMSE (mm/day) | Testing R |
|---|---|---|---|---|---|
| ANFIS1 | Tmax, Tmin, Ra | 0.868 | 0.92 | 0.975 | 0.91 |
| ANFIS2 | Tmax, Tmin, Ra, ETo t - 1 | 0.753 | 0.94 | 0.821 | 0.94 |
| W-ANFIS1 | Tmax (A2, D1, D2), Tmin (A2, D1, D2), Ra | 0.787 | 0.94 | 0.988 | 0.91 |
| W-ANFIS2 | Tmax (A2, D1, D2), Tmin (A2, D1, D2), Ra, ETo t - 1 (A2, D1, D2) | 0.725 | 0.94 | 0.642 | 0.96 |
Fig. 3 Time series comparison of observed and W-ANN2 predicted ETo values for testing period
In the second stage of the work, hybrid W-ANN and W-ANFIS models were employed to estimate ETo values. W-ANN and W-ANFIS models are almost similar to the earlier used ANN and ANFIS models from the architectural point of view. The only difference is that the ANN and ANFIS models used raw datasets, while the W-ANN and W-ANFIS models used wavelet subseries as inputs. As stated in the methodology, performance of different mother wavelets at three different levels of decomposition was tested in this study. It was found that db3 mother wavelet (Fig. 1) was the most efficient wavelet for data preprocessing in both ANN and ANFIS models. Further, it was also found that second level for wavelet decomposition yielded best results for all the models. W-ANN and W-ANFIS models were found to be better than the ANN and ANFIS models. The W-ANN1 model
with seven inputs (wavelet-decomposed subseries of the inputs used by the ANN1 model) and 10 neurons in the hidden layer performed better (RMSE = 0.906, R = 0.92 during training and RMSE = 0.980, R = 0.91 during testing) than the ANN1 model. Performance of the W-ANFIS1 model was found to be slightly better than that of the W-ANN1 model in the training phase (RMSE = 0.787 and R = 0.94), but it deteriorated during the testing phase (RMSE = 0.988 and R = 0.91). While developing the W-ANFIS models, use of grid partitioning was not possible. Grid partitioning generates rules by enumerating all possible combinations of the membership functions of all inputs. Even with a moderate number of inputs, grid partitioning may result in a very large number of rules, a problem referred to as the "curse of dimensionality." In the present study, it was decided to use subtractive clustering instead of grid
Fig. 4 Absolute errors of W-ANN2 model during testing period
Fig. 5 Scatter plot for ETo PM and estimated ETo values for testing period
partitioning for the W-ANFIS1 and W-ANFIS2 models. It is clear from Tables 3 and 4 that using wavelet-decomposed datasets as inputs enhanced the efficiency of both ANN and ANFIS models. However, in the case of the W-ANFIS models, the increase in efficiency was not appreciable when compared to the W-ANN models. This could be because of the inability of ANFIS models to deal with a large number of inputs. W-ANN2 was found to be the best among all the models tested. The W-ANN2 network architecture consisted of 10 inputs and 13 neurons in the hidden layer. Figure 3 shows good agreement between the ETo values estimated by the W-ANN2 model and the observed values in the testing phase. It can be seen that the W-ANN2 model underestimated high values of ETo. However, low values of ETo were found to be both overestimated and underestimated at different times. Further, it was also observed that there is a slight shift of the estimated values to the right. Figure 4 presents the absolute error in estimating ETo values using the W-ANN2 model. The error was found to be below 0.2 mm/day for the major part of the testing period. The sudden spikes were probably due to the inability of the model to account for the influence of high wind speeds on those particular days. Figure 5 shows the scatter plots for the ANN2, W-ANN2, ANFIS2, and W-ANFIS2 models for the testing period. The scatter plots show that the estimates of the hybrid Wavelet-ANN and Wavelet-ANFIS models are closer to the corresponding ETo PM values. The superiority of the W-ANN2 model is obvious in the scatter plot.
5 Conclusion In this study, the performance of hybrid Wavelet-ANN and Wavelet-ANFIS models in estimating daily evapotranspiration from limited meteorological variables was compared. The study was carried out in arid regions of India, and a gamma test was employed to identify the input variables for the AI models. The performance of the hybrid models was compared to that of the ANN and ANFIS models and the conventional Hargreaves equation. It was found that the Wavelet-ANN and Wavelet-ANFIS models were more efficient than the ANN and ANFIS models. Wavelet-decomposed data improved the efficiency of the ANN and ANFIS models by providing useful information at various decomposition levels. The results obtained from the study indicate that hybrid Wavelet-ANN and Wavelet-ANFIS models can be successfully used to estimate daily evapotranspiration. It was also observed that the ANFIS models performed better than the ANN models, whereas the hybrid Wavelet-ANN models performed better than the Wavelet-ANFIS models. The inability of ANFIS
models to handle a large number of input variables may be a cause of this difference in performance. Further studies are suggested to explore the use of W-ANN models for estimating evapotranspiration in other climatic regions.
Acknowledgments The authors wish to thank the India Meteorological Department for providing the required data for this research. The authors also wish to thank the reviewers for their comments, which have significantly improved the original manuscript.
Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.