Environ Monit Assess (2009) 151:259–264 DOI 10.1007/s10661-008-0267-9
Evaluation of surface water quality characteristics by using multivariate statistical techniques: A case study of the Euphrates river basin, Turkey Cansu Filik İşçen & Arzu Altın & Birdal Şenoğlu & H. Serhan Yavuz
Received: 22 November 2007 / Accepted: 29 February 2008 / Published online: 20 May 2008 © Springer Science + Business Media B.V. 2008
Abstract
The surface water quality of the Euphrates river basin in Turkey is evaluated using the multivariate statistical techniques known as factor analysis (FA) and multidimensional scaling (MDS) analysis. When FA was applied to the water quality data obtained from 15 surface water quality monitoring stations, two factors were identified, which together accounted for 86.02% of the total variance of the water quality in the Euphrates river basin. The first factor, called the urban land use factor, explained 44.20% of the total variance, and the second factor, called the agricultural use factor, explained 41.81% of the total variance. The MDS technique showed that electrical conductivity (EC), percent sodium (Na%) and total salt are the most important variables causing differences in water quality.
Keywords Euphrates river basin · Factor analysis · Multidimensional scaling · Statistical techniques · Water quality
C. F. İşçen, Department of Elementary Education, Faculty of Education, Eskişehir Osmangazi University, Meşelik 26480, Eskişehir, Turkey; e-mail: [email protected]
A. Altın (*), Department of Statistics, Faculty of Arts and Science, Eskişehir Osmangazi University, Meşelik 26480, Eskişehir, Turkey; e-mail: [email protected]
B. Şenoğlu, Department of Statistics, Faculty of Science, Ankara University, 06100 Tandoğan, Ankara, Turkey; e-mail: [email protected]
H. S. Yavuz, Department of Electric and Electronic Engineering, Eskişehir Osmangazi University, Meşelik 26480, Eskişehir, Turkey; e-mail: [email protected]
Introduction
The Euphrates river, which is around 2,800 km long, is located in the southeastern part of Turkey. It is the longest river in southwestern Asia. It is formed by the Karasu and Murat tributaries. The Euphrates then crosses into Syria, flows southeast, joins the Tigris river in Iraq and finally reaches the Persian Gulf (Tosun et al. 2007). The average flow of the Euphrates river is 32 bcm/year (billion cubic meters per year). Turkey contributes 90% of the average flow of the Euphrates, and the remaining 10% originates from Syria (Anderson 1986; Beaumont 1992). Approximately 1,777,000 ha of land in Turkey, 800,000 ha in Syria and 2,500,000 ha in Iraq are irrigated by the waters of the Euphrates river (Altınbilek 2004).
In this study, we are interested in the Turkish part of the Euphrates river basin, which is the largest of the 26 basins in Turkey. See Fig. 1 for the hydrological basins in Turkey. The following figures illustrate the importance of the Euphrates river basin for Turkish agriculture and the Turkish economy. Of the total 26,712,113 ha of plain land in Turkey, 18.52% is in the Euphrates river basin. Of the 16,222,122 ha of irrigable land in Turkey, 10.95% lies in this area, and 16.27% of the water carried by all rivers in Turkey belongs to the Euphrates river (Sen and Altunkaynak 2002). Approximately 10% of the Turkish population (around 6,910,866 people) lives in this region (Alaton et al. 2004). This increases the importance of water for the people who live in the region and engage in agricultural activities. It should also be noted that 89 of the 730 dams in Turkey are situated in the Euphrates river basin and that about 45.45% of the total annual stored water in Turkey belongs to these dams (Alaton et al. 2004).
It is believed that Turkey will be a water-scarce country in the future because of overpopulation, drought and environmental pollution. Therefore, water quality observations have gained great importance. For this purpose, water quality data are collected regularly from the surface water quality monitoring stations installed in the Euphrates river basin in order to determine the changes in the pollution sources, and therefore in the pollution levels of the rivers, and to identify the factors affecting water quality. The purpose of this study is to identify the main components of the water quality and the most important variables causing differences in water quality for the Turkish part of the Euphrates river basin by using the multivariate statistical techniques known as FA and MDS analysis. The results of these analyses may provide a crude guideline for officials to identify and prevent the pollution sources in the Euphrates river basin. In this way, water of appropriate quality can be obtained for purposes such as drinking water supply and irrigation (Boyacioglu 2006; Boyacioglu and Boyacioglu 2007).
Fig. 1 The hydrological basins in Turkey (Source: Alaton et al. 2004)
Materials and methods
In this study, the data were collected monthly by the General Directorate of Electrical Power Resources Survey and Development Administration (EIE) from the 15 surface water quality monitoring stations situated in the Euphrates river basin over the 34-year period up to 2005. Thirteen water quality parameters were selected for the statistical analysis: pH, electrical conductivity (EC), sodium (Na+), potassium (K+), calcium and magnesium (Ca++ +Mg++), chloride (Cl−), sulphate (SO4), percent sodium (%Na), sodium adsorption ratio (SAR), residual sodium carbonate (RSC), water hardness (Frs), total salt and boron. The data are given in Table 1. Each entry of Table 1 is a station mean; for example, 8.1285 (the pH value for Station 1) is the mean of the 288 observations obtained over 32 years. The descriptive statistics for the water quality data are given in Table 2.
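As a quick check of how the entries of Table 2 relate to Table 1, the pH column can be summarized with the Python standard library. This is a minimal sketch, assuming that Table 2 was computed over the 15 station means and that SD denotes the sample standard deviation (n − 1 in the denominator); neither assumption is stated explicitly in the text, and the variable names are illustrative.

```python
import statistics

# Station means of pH for Stations 1-15, copied from Table 1.
ph_means = [8.1285, 8.0160, 8.1133, 8.1579, 7.9727, 8.1611, 8.2658, 8.2311,
            8.1685, 8.1319, 7.8735, 8.0886, 8.2328, 8.1657, 8.0800]

summary = {
    "n": len(ph_means),
    "mean": statistics.mean(ph_means),
    "sd": statistics.stdev(ph_means),  # sample SD (assumed convention)
    "min": min(ph_means),
    "max": max(ph_means),
}

for name, value in summary.items():
    print(f"{name:>4}: {value:.3f}")
```

Under these assumptions the printed values should agree, up to rounding, with the pH row of Table 2 (mean 8.119, SD 0.104, minimum 7.874, maximum 8.266).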
Table 1 Water quality data

Station  Years  Number of observations  pH      EC        Na+     K+      Ca++ +Mg++  Cl−
1        32     288                     8.1285  454.2396  1.6082  0.0836  3.0069      1.2900
2        33     273                     8.0160  392.5018  0.8822  0.0419  3.1881      0.7422
3        17     192                     8.1133  278.3906  0.1848  0.0224  2.7092      0.2251
4        34     285                     8.1579  518.8737  1.1881  0.0449  4.3398      0.9802
5        31     187                     7.9727  340.4332  0.1973  0.0201  3.3830      0.3361
6        15     183                     8.1611  452.7978  1.0402  0.0392  3.6402      0.9656
7        17     185                     8.2658  451.5135  1.5595  0.1322  3.1432      0.8690
8        13     142                     8.2311  474.8239  1.0759  0.0368  3.9732      0.8325
9        19     219                     8.1685  296.1187  0.7241  0.0825  2.3900      0.3167
10       21     105                     8.1319  341.9879  0.5613  0.0498  3.0843      0.3295
11       10     105                     7.8735  120.0952  0.1819  0.0467  1.0519      0.1630
12       15     182                     8.0886  361.3846  0.7580  0.0510  3.0148      0.5697
13       14     167                     8.2328  541.7485  2.2683  0.1122  3.2486      1.8867
14       21     231                     8.1657  338.2944  0.3318  0.0219  3.3162      0.2476
15       5      70                      8.0800  121.4143  0.1637  0.0409  1.1086      0.1026

Table 1 (continued)

Station  SO4     %Na      SAR     RSC     Hardness (Frs°)  Total salt  Boron
1        0.4239  32.7574  1.2908  0.1034  15.0347          290.7133    0.2697
2        0.7339  21.2437  0.7002  0.0188  15.9405          251.2012    0.1527
3        0.2728  6.2321   0.1578  0.0004  13.5458          178.1700    0.0871
4        0.7095  20.9969  0.7996  0.0084  21.6991          332.0792    0.2610
5        0.5751  5.4341   0.1514  0.0002  16.9152          217.8772    0.2330
6        1.0884  22.0429  0.7751  0.0151  18.2011          289.7906    0.1149
7        0.4849  30.9689  1.2198  0.3477  15.7159          288.9686    0.5692
8        1.1129  21.3205  0.7681  0.0007  19.8662          303.8873    0.1758
9        0.2685  21.9215  0.6506  0.2332  11.9500          189.5160    0.1989
10       0.5438  14.7438  0.4481  0.0106  15.4216          218.8723    0.1435
11       0.0911  12.8651  0.2287  0.0185  5.2595           76.8610     0.1544
12       0.7095  19.8814  0.6198  0.0000  15.0739          231.2862    0.1682
13       0.5013  37.5250  1.7184  0.0695  16.2428          346.7190    0.3451
14       0.5993  8.9068   0.2563  0.0014  16.5812          216.5084    0.1371
15       0.0829  11.9109  0.2109  0.0381  5.5429           77.7051     0.1251
Results and discussion
In this study, we use the well-known multivariate statistical techniques FA and MDS to analyze the water quality data. FA was used to understand the correlation structure among the water quality variables and to combine them into groups, reducing the dimension of the data so that decisions and interpretations become easier. MDS analysis was used to explore the similarities or dissimilarities (distances) in water quality between the surface water quality monitoring stations in the Euphrates river basin. For more details, see Johnson and Wichern (2002) and Kruskal and Wish (1978). All analyses were carried out in SPSS (ver. 13.0) (SPSS 2006).
Factor analysis
A particular problem in water quality monitoring is the complexity associated with analyzing a large number of measured variables (Saffran 2001). Therefore, in this study, FA, a well-known data reduction technique, is used to extract the main components of the water quality data; in other words, FA is used to obtain a smaller number of variables for the evaluation of surface water quality. In the FA results, the first two eigenvalues were greater than 1 and the third eigenvalue was slightly less than 1 (see the scree plot in Fig. 2). Based on Fig. 2 and a subsequent interpretation of the factor loadings, the first two components were extracted and the other components were eliminated. This means that the majority of the total variance of the original data is explained by the first two factors. We then applied a Varimax factor rotation to obtain readily interpretable factor loadings (Johnson and Wichern 2002). Table 3 shows the proportion of total variance explained by the first two factors for both rotated and unrotated factor loadings.
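The retention rule used here (keep the components whose eigenvalue exceeds 1, often called the Kaiser criterion, a name the text itself does not use) and the variance percentages can be illustrated with the eigenvalues reported in Table 3. The sketch below is not the SPSS procedure; it only assumes that the 13 variables were standardized, so that each component's share of variance is its eigenvalue divided by 13. All names are illustrative.

```python
# Eigenvalues of the 13 components as reported in Table 3 (rounded to 3 decimals).
eigenvalues = [7.903, 3.279, 0.943, 0.462, 0.255, 0.109, 0.032,
               0.015, 0.001, 0.001, 0.000, 0.000, 0.000]

n_variables = len(eigenvalues)  # 13 standardized variables (assumption)

# Share of the total variance carried by each component, in percent.
percent = [100.0 * ev / n_variables for ev in eigenvalues]

# Running total of the shares.
cumulative = []
running = 0.0
for p in percent:
    running += p
    cumulative.append(running)

# Retain the components whose eigenvalue exceeds 1.
retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev > 1.0]

print("retained components:", retained)
for i in retained:
    print(f"component {i}: {percent[i - 1]:.2f}% (cumulative {cumulative[i - 1]:.2f}%)")
```

Because the eigenvalues are taken from the table in rounded form, the computed percentages can differ from the reported ones in the last decimal place.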
After rotation, 44.201% and 41.814% of the total variance of the water quality data are explained by the first and the second components, respectively. Together the first two components explain 86.015% of the total variance, while the remaining 11 components explain only 13.985%. The factor loadings of the first two components of the water quality data are given in Table 4.
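The two percentages quoted for the first and second components can be recovered from the loadings themselves: for each factor, the sum of squared loadings divided by the number of variables gives its share of the total variance. The sketch below uses the rounded rotated loadings of Table 4, so small discrepancies from the reported 44.201% and 41.814% are expected; the list and function names are illustrative.

```python
# Rotated (Varimax) loadings from Table 4, in the order
# pH, EC, Na, K, Ca+Mg, Cl, SO4, %Na, SAR, RSC, Hardness, Total salt, Boron.
f1 = [0.541, 0.456, 0.831, 0.986, 0.046, 0.678, -0.130,
      0.888, 0.872, 0.792, 0.046, 0.456, 0.846]
f2 = [0.505, 0.885, 0.502, -0.077, 0.963, 0.592, 0.897,
      0.312, 0.416, -0.258, 0.963, 0.885, 0.089]

n_variables = len(f1)


def variance_share(loadings):
    """Return (sum of squared loadings, percent of total variance)."""
    ss = sum(x * x for x in loadings)
    return ss, 100.0 * ss / n_variables


ss1, pct1 = variance_share(f1)
ss2, pct2 = variance_share(f2)

print(f"F1: sum of squared loadings = {ss1:.3f}, share = {pct1:.2f}%")
print(f"F2: sum of squared loadings = {ss2:.3f}, share = {pct2:.2f}%")
```

The sums of squares correspond to the "Rotation sums of squared loadings" in Table 3 (5.746 and 5.436).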
The first factor (F1) has positive loadings on K+, Na%, SAR, boron, Na+, RSC, Cl− and pH, which originate mainly from urban land use. “Urban land use (Na+, K+, Cl−) may be differentiated from other land uses, such as agricultural (Ca++ +Mg++), through the use of biogeochemical fingerprints” (Lindeman 2004; Boyacioglu 2006). It should also be noted that the presence of boron in F1 indicates that borates are used as detergents in the region and are discharged to the Euphrates river. Therefore, F1 is called the “urban land use” factor (Baltaci 2000). The second factor (F2) has positive loadings on hardness, Ca++ +Mg++, SO4, total salt and EC. Salts that are commonly found in subsurface drainage water include sulphates, chlorides, carbonates and bicarbonates of calcium and magnesium. Tail water may also contain these salts, but generally in much lower concentrations than in drainage water (Jacobsen and Basinal 2004; Boyacioglu 2006). Electrical conductivity (EC) depends on the type and concentration of the ions dissolved in the water: as the dissolved salt concentration increases, EC increases. This is also supported by the strong positive loadings of total salt and EC on F2. F2 has a high loading on Ca++ +Mg++, which is found in agricultural drainage water as mentioned above, and therefore F2 is called the “agricultural use” factor. In summary, urban pollutant sources and agricultural drainage waters were the main factors affecting the water quality of the Euphrates river basin.

Table 2 Descriptive statistics for the water quality data

Variable     Unit      n   Mean     SD       Minimum  Maximum
pH           pH units  15  8.119    0.104    7.874    8.266
EC           μS/cm     15  365.641  126.577  120.095  541.749
Na+          mg/l      15  0.848    0.624    0.164    2.268
K+           mg/l      15  0.055    0.033    0.020    0.132
Ca++ +Mg++   mg/l      15  2.973    0.900    1.052    4.340
Cl−          mg/l      15  0.657    0.495    0.103    1.887
SO4          mg/l      15  0.547    0.305    0.083    1.113
%Na          –         15  19.250   9.497    5.434    37.525
SAR          mg/l      15  0.666    0.462    0.151    1.718
RSC          meq/l     15  0.058    0.101    0.000    0.348
Hardness     mg/l      15  14.866   4.500    5.260    21.699
Total salt   mg/l      15  234.010  81.009   76.861   346.719
Boron        mg/l      15  0.209    0.121    0.087    0.569

Fig. 2 Scree plot of eigenvalues versus components for the water quality data (x-axis: Component Number, 1 to 13; y-axis: Eigenvalue, 0 to 8)

Multidimensional scaling analysis
In this section, the water quality data were analyzed using two-dimensional and three-dimensional MDS analysis to identify similarities and differences between the surface water quality monitoring stations. STRESS (STandardized REsidual Sum of Squares), which is used to evaluate how well a particular configuration reproduces the observed distance matrix, was found to be 0.00008 even for the two-dimensional MDS. Because STRESS values close to zero indicate that the fit is almost perfect and that the results of the MDS analysis are reasonable and reliable, we use the results of the two-dimensional MDS analysis in the rest of the paper. Again, all analyses were done in SPSS version 13.0. See Table 5 for the coordinate values of the 15 surface water quality monitoring stations. Now, we need to determine which two stations have the most differences and which two stations
have the least differences. Let us find the most dissimilar stations among the 15 surface water quality monitoring stations according to Dimension-1 and Dimension-2 of the MDS analysis. It is clear from the coordinate values given in Table 5 that S11 (Station 11) and S13 (Station 13) are the most dissimilar stations according to Dimension-1, since the distance between S11 and S13 is the largest of all pairwise distances. The variables causing the dissimilarity between these two stations can be seen in Fig. 3.
It is clear from Fig. 3 that EC, Na% and total salt are the most important variables causing this dissimilarity. Similarly, S5 and S13 are the least similar stations according to Dimension-2. This results from the same variables that cause the difference between S11 and S13 (i.e., EC, Na% and total salt); see Fig. 4. In summary, EC, Na% and total salt are the most important variables causing differences in water quality among the surface water quality monitoring stations in the Euphrates river basin.

Table 3 Total variance explained before and after Varimax rotation

           Initial eigenvalues                     Extraction sums of squared loadings     Rotation sums of squared loadings
Component  Total  % of Variance  Cumulative %      Total  % of Variance  Cumulative %      Total  % of Variance  Cumulative %
1          7.903  60.794         60.794            7.903  60.794         60.794            5.746  44.201         44.201
2          3.279  25.221         86.015            3.279  25.221         86.015            5.436  41.814         86.015
3          0.943  7.257          93.272
4          0.462  3.551          96.823
5          0.255  1.962          98.785
6          0.109  0.838          99.623
7          0.032  0.245          99.869
8          0.015  0.115          99.984
9          0.001  0.010          99.993
10         0.001  0.005          99.998
11         0.000  0.002          100.000
12         0.000  0.000          100.000
13         0.000  0.000          100.000

Table 4 Factor loadings (Varimax rotation): rotated component matrix

Parameters   Component 1  Component 2
pH           0.541        0.505
EC           0.456        0.885
Na           0.831        0.502
K            0.986        −0.077
Ca++ +Mg++   0.046        0.963
Cl−          0.678        0.592
SO4          −0.130       0.897
Percent Na   0.888        0.312
SAR          0.872        0.416
RSC          0.792        −0.258
Hardness     0.046        0.963
Total salt   0.456        0.885
Boron        0.846        0.089

Extraction method: Principal component analysis. Rotation method: Varimax with Kaiser Normalization. Rotation converged in three iterations.

Table 5 Coordinate values of the surface water quality monitoring stations in MDS analysis

Stations  Dimension 1  Dimension 2
S1        1.0265       −0.0919
S2        0.3106       −0.0053
S3        −1.0110      0.0845
S4        1.7681       0.0612
S5        −0.2956      0.1251
S6        1.0063       0.0166
S7        0.9945       −0.0743
S8        1.2602       0.0364
S9        −0.8007      −0.0595
S10       −0.2743      0.0347
S11       −2.8344      −0.0609
S12       −0.0487      −0.0068
S13       2.0370       −0.0999
S14       −0.3189      0.0910
S15       −2.8195      −0.0507
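The selection of the most dissimilar stations along each MDS dimension can be reproduced directly from the coordinates of Table 5. The sketch below is only an illustration of that comparison, not part of the original SPSS analysis; it measures dissimilarity along one dimension as the absolute difference of the coordinates, and the function name `most_dissimilar` is hypothetical.

```python
from itertools import combinations

# Two-dimensional MDS coordinates of Stations S1-S15 from Table 5.
dim1 = [1.0265, 0.3106, -1.0110, 1.7681, -0.2956, 1.0063, 0.9945, 1.2602,
        -0.8007, -0.2743, -2.8344, -0.0487, 2.0370, -0.3189, -2.8195]
dim2 = [-0.0919, -0.0053, 0.0845, 0.0612, 0.1251, 0.0166, -0.0743, 0.0364,
        -0.0595, 0.0347, -0.0609, -0.0068, -0.0999, 0.0910, -0.0507]
stations = [f"S{i}" for i in range(1, 16)]


def most_dissimilar(coords):
    """Return the station pair with the largest absolute coordinate gap."""
    i, j = max(combinations(range(len(coords)), 2),
               key=lambda pair: abs(coords[pair[0]] - coords[pair[1]]))
    return stations[i], stations[j], abs(coords[i] - coords[j])


pair_dim1 = most_dissimilar(dim1)
pair_dim2 = most_dissimilar(dim2)

print("Dimension-1:", pair_dim1)
print("Dimension-2:", pair_dim2)
```

For the coordinates of Table 5 this selects S11 and S13 on Dimension-1 and S5 and S13 on Dimension-2, in agreement with the text.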
Fig. 3 Comparison of S11 and S13 stations with respect to Dimension-1 (x-axis: Variables, pH to Boron; y-axis: Values, 0 to 600; series: S11 and S13)
Fig. 4 Comparison of S5 and S13 stations with respect to Dimension-2 (x-axis: Variables, pH to Boron; y-axis: Values, 0 to 600; series: S5 and S13)
Conclusion
In this study, the surface water quality in the Turkish part of the Euphrates river basin was evaluated using the well-known multivariate statistical methods FA and MDS. Two factors explaining 86.015% of the total variance in the water quality data were identified. They were termed the urban land use factor and the agricultural use factor. The urban land use factor explained 44.201% and the agricultural use factor explained 41.814% of the observed variance in the water quality data. The results of the FA showed that urban wastewater and agricultural drainage waters were the main sources of contamination in the Euphrates river. According to Dimension-1, S11 and S13 were found to be the most dissimilar stations, and the most important variables causing this difference were EC, Na% and total salt. While the EC level and the dissolved salt concentration were low in S11, they were high in S13. Similar statements were also true for S5 and S13. These results may provide a basis for taking preventive action to reduce the pollution caused by urban wastewater and agricultural drainage waters in the Euphrates river basin.
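The comparison of S11 and S13 can be retraced from the station means in Table 1 by ranking the variables by the absolute difference between the two stations. This is a sketch under the assumption that Fig. 3 plots the raw (unstandardized) station means, which the value axis from 0 to 600 suggests but the text does not state; variables measured on larger scales therefore dominate the ranking. The names used are illustrative.

```python
variables = ["pH", "EC", "Na+", "K+", "Ca+Mg", "Cl-", "SO4",
             "%Na", "SAR", "RSC", "Hardness", "Total salt", "Boron"]

# Station means for S11 and S13, copied from Table 1.
s11 = [7.8735, 120.0952, 0.1819, 0.0467, 1.0519, 0.1630, 0.0911,
       12.8651, 0.2287, 0.0185, 5.2595, 76.8610, 0.1544]
s13 = [8.2328, 541.7485, 2.2683, 0.1122, 3.2486, 1.8867, 0.5013,
       37.5250, 1.7184, 0.0695, 16.2428, 346.7190, 0.3451]

# Absolute raw-scale gap per variable, largest first.
gaps = sorted(((abs(a - b), name) for name, a, b in zip(variables, s11, s13)),
              reverse=True)
top_three = [name for _, name in gaps[:3]]

for gap, name in gaps:
    print(f"{name:<10} {gap:9.4f}")
```

On the raw scale the three largest gaps are those of EC, total salt and %Na, the same variables singled out in the text.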
References
Alaton, I. A., Eremektar, G., Torunoglu, P. O., Gurel, M., Ovez, S., Tanık, A., et al. (2004). Situation of urban waste water treatment plants in Turkey – A step towards promoting sustainable wastewater management. Marrakech: IWA, World Water Congress and Exhibition.
Altinbilek, D. (2004). Development and management of the Euphrates–Tigris basin. Water Resources Development, 20(1), 15–33.
Anderson, E. W. (1986). Water geopolitics in the Middle East: Key countries. Conference on U.S. foreign policy on water resources in the Middle East: Instrument for peace and development (pp. 18–19). CSIS: Washington DC.
Baltaci, F. (2000). Su Analiz Metotları. İçmesuyu ve Kanalizasyon Dairesi Başkanlığı. Ankara: DSI Yayınları.
Beaumont, P. (1992). Water: A resource under pressure. In G. Nonneman (Ed.), The Middle East and Europe: An integrated communities approach, Federal Trust for Education and Research, Second Edn (pp. 183–188). London.
Boyacioglu, H. (2006). Surface water quality assessment using factor analysis. Water SA, 32(3), 389–393.
Boyacioglu, H., & Boyacioglu, H. (2007). Surface water quality assessment by environmetric methods. Environmental Monitoring and Assessment, 131, 371–376.
Jacobsen, T., & Basinal, L. (2004). A landowner’s manual. A guide for developing integrated on-farm drainage management systems. California State Water Resources Control Board. Retrieved 25 Sept 2007 from ari.calstate.edu/research/pdf/00-1-004/FinalReport-00-1-004.pdf.
Johnson, R. A., & Wichern, D. W. (2002). Applied multivariate statistical analysis (5th ed.). Upper Saddle River, NJ: Prentice Hall.
Kruskal, J. B., & Wish, M. (1978). Multidimensional scaling. Newbury Park, CA: Sage Publications.
Lindeman, M. A. (2004). Exploring the effects of urban and agricultural land use on surface water quality. 2004 Denver annual meeting. Paper No. 72-9. Geological Society of America Abstracts with Programs, 36, 184.
Saffran, K. (2001). Canadian water quality guidelines for the protection of aquatic life, CCME water quality Index 1.0. User’s manual. Excerpt from publication No. 1299, ISBN 1-896997-34-1.
Sen, Z., & Altunkaynak, A. (2002). Susuz Toplumlar için Su. 22 Mart Dünya Su Günü Kitapçığı. İstanbul: Su Vakfı Yayınları.
SPSS-13 (2006). Statistical Package for the Social Sciences. Chicago, USA: SPSS.
Tosun, H., Zorluer, I., Orhan, A., Seyrek, E., Savas, H., & Turkoz, M. (2007). Seismic hazard and total risk analyses for large dams in Euphrates basin, Turkey. Engineering Geology, 89, 155–170.