Rock Mech Rock Eng (2009) 42:939–946 DOI 10.1007/s00603-008-0012-2 TECHNICAL NOTE
Estimating Rock Cuttability using Regression Trees and Artificial Neural Networks
Bulent Tiryaki
Received: 30 January 2007 / Accepted: 29 May 2008 / Published online: 5 July 2008 © Springer-Verlag 2008
Keywords
Rock cuttability · Regression trees · Artificial neural networks
B. Tiryaki, Department of Mining Engineering, Hacettepe University, Beytepe, 06800 Ankara, Turkey

1 Introduction

The cuttability of rocks is one of the most significant parameters in mechanical excavation operations. It is determined in linear rock cutting tests using standard chisel picks on a rig. Rock cuttability is usually expressed by the cutting specific energy (SE), defined as the energy required to cut a unit volume of rock material. SE is known to be closely related to various intact rock properties and is widely used to predict the performance of mechanical excavators (Fowell and Johnson 1982; Fowell and Pycroft 1980).

Traditionally, multiple linear or nonlinear regression techniques are employed to establish predictive models for SE, for use when a rock cutting rig is not available for the direct determination of SE. In multiple nonlinear regression, the form of the relationship between the dependent (response) and independent (predictor) variables must be known before a predictive model is built. If this relationship is unknown, a nonparametric regression approach is needed. One such approach is the regression tree, which approximates a regression relationship using a decision tree. However, this technique has not been employed in building predictive models for SE.

Artificial neural networks (ANN) have been used for building predictive models in mining and tunneling applications in the last few years (Kahraman et al. 2006; Benardos and Kaliampakos 2004; Singh et al. 2001; Meulenkamp and Alvarez Grima 1999). However, ANN has not yet been applied to building predictive models for SE.

This paper is concerned with establishing prediction models for rock cuttability from intact rock properties. For this purpose, rock cutting data are evaluated using principal components analysis in order to identify outlying data points. The regression trees method and ANN are then used to develop predictive models for SE. The development of the new prediction models is presented, and the outputs of these models are discussed.
2 Data Analysis

The statistical analyses used to build predictive models for SE in this study are based on data obtained from three different projects undertaken by McFeat-Smith and Fowell (1977), Roxborough and Philips (1981) and Tiryaki and Dikmen (2006). All statistical analyses in this study have been carried out using MATLAB R14 software. Results of the descriptive statistical analysis performed on the original data set are given in Table 1.

Table 1 Basic descriptive statistics for the variables in the original data set

                Quartz  Density  Porosity  UCS     BTS    Elasticity  Shore  CI     SE
Minimum values  0       2.24     0         7       1      5.6         19     1.3    4.3
Maximum values  99      2.96     22.9      314     21.3   80.6        57     27.8   52.61
Average         46.04   2.62     8.83      87.13   6.35   31.13       38.82  5.63   19.23
Median          44.05   2.62     8.1       72      6.6    27.1        38     4.2    20.5
SD              37.61   0.11     6.33      62.85   3.47   19.78       8.34   4.53   10.69
Variance        1414.6  0.01     40.06     3950.6  12.03  391.18      69.47  20.49  114.22

Quartz: percentage of quartz (%); Density: dry density (g/cm³); Porosity: effective porosity (%); UCS: uniaxial compressive strength (MPa); BTS: Brazilian tensile strength (MPa); Elasticity: static modulus of elasticity (GPa); Shore: Shore hardness; CI: NCB cone indenter hardness; SE: cutting specific energy (MJ/m³)

2.1 Outlier Detection

Outlying measurements in a data set stand apart from the main cluster of data points and can obscure the relationships among the variables. These influential data points sometimes indicate the existence of natural groups in the data set. Principal components analysis is used to detect the outlying data points and to identify possible natural groups in the data set, which helps generate a more homogeneous data set for predictive model development (MATLAB 2006; Middleton 2000). The method generates a new set of variables, called principal components, each of which is a linear combination of the original variables. A principal components analysis has been applied to the original data set to detect the outlying data points that adversely affect the accuracy of the predictive
models. It can easily be seen from Fig. 1 that all six non-sandstone coal measures rocks in the original data set stand apart from the main cluster of rocks, as do Dolerite and High coal sill. The non-sandstone coal measures rocks taken from McFeat-Smith and Fowell (1977) are considered to behave differently from the rest of the data set mainly because these argillaceous rocks are the finest-grained rocks in the data set of McFeat-Smith and Fowell (1977). Dolerite is the only very high strength rock in the original data according to the Deere and Miller (1966) classification, whereas the rest of the rocks are classified as very low to high strength rocks. However, no physical evidence has been found to explain why High coal sill (DWG/3) stands apart from the main cluster of rocks.

A separate outlier analysis has also been carried out on the original data set by treating observations greater than three times the standard deviation for each variable as outliers. This analysis has shown that rock samples L8B, Dolerite, and Nattrass Gill Hazle2 (WT2/12) each have an outlying data point for at least one variable. Sample L8B has an outlier for Density. Dolerite has outliers for Density, UCS, BTS, and CI. Nattrass Gill Hazle2 (WT2/12) has the highest SE value. The analysis has not identified an outlying data point in any variable for High coal sill.

Fig. 1 Scatterplot of second principal component against the first one

2.2 Bivariate Correlation and Curve Fitting

All six non-sandstone coal measures rocks, Dolerite, and Nattrass Gill Hazle2 (WT2/12) have been removed from the original data set based on the outlier detection studies to form the regression data set on which the further statistical analyses are carried out. Correlation coefficients (r-values) between all variables are given in Table 2. The correlation matrix of the original data set exhibits close correlations between Density, UCS, BTS, Elasticity, Shore, CI, and SE. In the next stage of the statistical analyses, the rock properties found to be significantly correlated with SE have been subjected to curve fitting. According to both the graphical and numerical goodness-of-fit measures obtained in the curve fitting studies, most of the variation in SE values is successfully explained by UCS, BTS, Elasticity, and CI individually. Therefore, these independent variables have been used as predictors in the predictive models of SE in this study.
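The three-sigma screening of Sect. 2.1 and the bivariate correlation step of Sect. 2.2 can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB routines used in the study. The function names are hypothetical, "greater than three times the standard deviation" is assumed here to mean a deviation from the variable's mean, and the sample standard deviation (n - 1 denominator) is assumed.

```python
import math
from statistics import mean, stdev


def three_sigma_outliers(values):
    """Return the indices of observations lying more than three sample
    standard deviations from the mean (assumed reading of the rule)."""
    m = mean(values)
    s = stdev(values)
    return [i for i, v in enumerate(values) if abs(v - m) > 3 * s]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing zero correlation, with n - 2 degrees of
    freedom. Not defined for |r| = 1."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))
```

To decide whether a coefficient is significant at the 0.05 level, as marked in Table 2, the t statistic would be compared with a tabulated critical value for n - 2 degrees of freedom.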
3 Artificial Neural Networks

A two-layered feed-forward back propagation network has been chosen to build the ANN model for predicting SE in this study. The input layer has four neurons corresponding to the four above-mentioned predictors. The hidden layer has three neurons with tangent sigmoid transfer functions. The ANN has been trained on the regression data set using back propagation with the Levenberg–Marquardt algorithm. The correlation coefficient between observed and predicted SE values based on the ANN model is 0.87 (Fig. 2).
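The network structure described above can be illustrated with a single forward pass. The sketch below is written in Python rather than MATLAB and uses hypothetical function names. It assumes one linear output neuron, since the output transfer function is not stated in the text. The weights passed in are arbitrary placeholders, not the trained values, and Levenberg–Marquardt training is not implemented.

```python
import math

N_INPUTS = 4  # UCS, BTS, Elasticity, CI
N_HIDDEN = 3  # tangent sigmoid (tanh) hidden neurons


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Compute the network output for one input vector.

    w_hidden is a 3 x 4 list of lists, b_hidden and w_out have length 3,
    and b_out is a scalar. The output neuron is assumed to be linear.
    """
    if len(x) != N_INPUTS or len(w_hidden) != N_HIDDEN:
        raise ValueError("expected 4 inputs and 3 hidden neurons")
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def n_parameters():
    """Number of adjustable weights and biases in the 4-3-1 layout."""
    return N_HIDDEN * N_INPUTS + N_HIDDEN + N_HIDDEN + 1
```

Because the tangent sigmoid saturates for large arguments, inputs such as UCS in MPa would normally be rescaled before being passed to such a network; whether and how this was done in the study is not stated.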
Table 2 Full correlation matrix for the regression data set

            Quartz  Density  Porosity  UCS    BTS    Elasticity  Shore   CI     SE
Quartz      1       -0.34*   0.63*     -0.18  -0.3*  -0.45*      -0.01   -0.14  -0.04
Density     -0.34*  1        -0.17     0.5*   0.62*  0.53*       0.3     0.47*  0.36*
Porosity    0.63*   -0.17    1         -0.07  -0.23  -0.31*      -0.44*  -0.03  -0.08
UCS         -0.18   0.5*     -0.07     1      0.83*  0.87*       0.31*   0.7*   0.67*
BTS         -0.3*   0.62*    -0.23     0.83*  1      0.77*       0.45*   0.74*  0.57*
Elasticity  -0.45*  0.53*    -0.31*    0.87*  0.77*  1           0.33*   0.6*   0.63*
Shore       -0.01   0.3      -0.44*    0.31*  0.45*  0.33*       1       0.44*  0.48*
CI          -0.14   0.47*    -0.03     0.7*   0.74*  0.6*        0.44*   1      0.72*
SE          -0.04   0.36*    -0.08     0.67*  0.57*  0.63*       0.48*   0.72*  1

* Correlation coefficient is significant at 0.05 level

Fig. 2 Scatterplot of SE values predicted by ANN model versus those observed

4 Regression Trees

In the regression trees technique, the data set is divided into different regions using the values of the predictor variables, so that the response variable is roughly constant in each region. A regression tree is a sequence of questions that can be answered ‘yes’ or ‘no’, together with a set of fitted response values. It can be regarded as a
simplified schematic view of a set of if-then questions that aim to estimate the response value for a given set of predictor values. Each question asks whether a predictor satisfies a given condition. Depending on the answer to a question, one either proceeds to another question or arrives at a fitted response value. If the answer to a particular question is ‘yes’, the left branch is taken (MATLAB 2006; Breiman et al. 1984). The mathematical foundations of this technique can be found in Breiman et al. (1984).

The regression tree shown in Fig. 3 has been calculated for the regression data set by MATLAB using UCS, BTS, Elasticity, and CI as predictors. The relational expressions associated with the triangles in Fig. 3 correspond to the questions that must be answered for the target rock in order to estimate its SE from its rock properties. The regression tree in Fig. 3 starts by questioning the Elasticity value of the target rock. If it is less than 30.4 GPa, the left branch of the regression tree is taken to the question of whether the CI value of the target rock is less than 2.2. If it is, the terminal node indicated by a black circle is reached, which is labeled with the number 6.1144. This means that the SE value of the target rock is 6.1144 MJ/m³. If the CI value is higher than 2.2 (while Elasticity is less than 30.4 GPa), SE is either 12.5693 or 19.398 MJ/m³, depending on the BTS value of the target rock.

Fig. 3 Regression tree chart for predicting SE

A scatterplot of the fitted SE values against the observed SE values for the regression tree model is given in Fig. 4. The correlation coefficient between the observed and predicted SE values based on the regression tree model is 0.97. This value is greater than that for the ANN model of SE. However, the correlation coefficient between observed and predicted values does not necessarily identify the most appropriate model. The sum of squares due to error (SSE), which also corresponds to the error variance of the prediction, is a better criterion for deciding which model fits the same data set better. The SSE value for the regression tree model is 168.5, which is smaller than that for the ANN model (621.12). These results indicate that the regression tree model is a better predictor of SE than the ANN model.

Fig. 4 Scatterplot of SE values predicted by regression tree model versus those observed

In addition, an analysis of variance (ANOVA) test has been carried out on the observed and predicted SE values in order to compare the means of those values and to
determine whether the observed SE values are equal to those predicted by the ANN and regression tree techniques. The ANOVA results are given in Table 3 and Fig. 5. The P value, which is very close to one, indicates that the differences between the means are not statistically significant. This means that all samples are drawn from the same population (or from different populations with the same mean), which shows that both techniques used to predict SE perform well. The boxplots shown in Fig. 5 also confirm this graphically. Considering the differences between the centerline of the observed SE boxplot and those of the other two in Fig. 5, it can be said that the regression tree performed better than the ANN in predicting SE for this data set, since larger differences in the centerlines of the boxplots correspond to larger values of F and correspondingly smaller P values.

Table 3 ANOVA results

Source of var.  Sum of Sq. (SS)  df   SS/df    F-stat.  Prob. F (P value)
Columns         0.08             2    0.0404   0        0.9994
Error           7223.55          108  66.8847
Total           7223.63          110

Fig. 5 Boxplots of observed SE values and those predicted by ANN and the regression tree

5 Conclusions

The following results and conclusions can be drawn from the present study of the application of the regression trees method and ANN to building predictive models of SE in linear cutting of rock by drag tools:

1. Results of the principal components analysis have shown that all the non-sandstone coal measures rocks from McFeat-Smith and Fowell (1977) are clustered in the lower left corner of the plot in Fig. 1, whereas Dolerite stands out in the upper right corner of the same plot, mainly due to their petrographic, mineralogical, and physical properties.
2. Bivariate correlation and curve fitting analyses have revealed that UCS, BTS, Elasticity, and CI can be individually used in predicting SE for the regression data set.
3. Two predictive models of SE have been developed by using the regression trees approach and ANN. UCS, BTS, Elasticity, and CI have been employed as predictors for building the models. SSE values and ANOVA results have revealed that the regression tree model fits the data better than the ANN model, indicating its significance in predicting standard rock cuttability. The model built by the regression trees method can easily be used by mining and civil engineers to obtain field estimates of SE.
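The regression tree technique of Sect. 4 and the SSE criterion used to compare the two models can be illustrated with a short sketch. It is written in Python and is not the MATLAB routine used in the study. It implements a simple least-squares splitting rule in the spirit of Breiman et al. (1984), without pruning. All function names, the min_leaf parameter, and the choice of placing thresholds midway between consecutive distinct predictor values are assumptions made for illustration.

```python
from statistics import mean


def sse(observed, predicted):
    """Sum of squares due to error between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def best_split(rows, targets, min_leaf):
    """Return (cost, feature index, threshold) of the split that minimizes
    the summed SSE of the two child regions, or None if no split is allowed."""
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for r, t in zip(rows, targets) if r[j] < thr]
            right = [t for r, t in zip(rows, targets) if r[j] >= thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = (sse(left, [mean(left)] * len(left))
                    + sse(right, [mean(right)] * len(right)))
            if best is None or cost < best[0]:
                best = (cost, j, thr)
    return best


def fit_tree(rows, targets, min_leaf=2):
    """Grow a tree as nested dicts; each leaf stores the mean response."""
    split = best_split(rows, targets, min_leaf)
    if split is None:
        return {"value": mean(targets)}
    _, j, thr = split
    left = [i for i, r in enumerate(rows) if r[j] < thr]
    right = [i for i, r in enumerate(rows) if r[j] >= thr]
    return {
        "feature": j,
        "threshold": thr,
        # 'yes' to "feature < threshold" takes the left branch
        "left": fit_tree([rows[i] for i in left],
                         [targets[i] for i in left], min_leaf),
        "right": fit_tree([rows[i] for i in right],
                          [targets[i] for i in right], min_leaf),
    }


def predict(tree, row):
    """Answer the yes/no questions until a fitted response value is reached."""
    while "value" not in tree:
        go_left = row[tree["feature"]] < tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["value"]
```

Given fitted values from two competing models on the same data, calling sse on each set of predictions gives the quantity compared in Sect. 4; the model with the smaller value fits that data set more closely.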
References

Benardos AG, Kaliampakos DC (2004) Modelling TBM performance with artificial neural networks. Int J Rock Mech Min Sci 19:597–605
Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth & Brooks/Cole Advanced Books & Software, Monterey
Deere DV, Miller RP (1966) Engineering classification and index properties for intact rock. Report AFWL-TR-65-116 Air Force Base, New Mexico, p 308
Fowell RJ, Johnson ST (1982) Rock classification and assessment for rapid excavation. In: Proceedings of the symposium on strata mechanics, The University of Newcastle Upon Tyne, 5–7 April, pp 241–244
Fowell RJ, Pycroft AS (1980) Rock machinability studies for the assessment of selective tunnelling machine performance. In: Proceedings of the 21st national rock mechanics symposium, USA, pp 149–158
Kahraman S, Altun H, Tezekici BS, Fener M (2006) Sawability prediction of carbonate rocks from shear strength parameters using artificial neural networks. Int J Rock Mech Min Sci 43:157–164
MATLAB (2006) Statistics Toolbox for use with MATLAB, User’s Guide Version 5. The MathWorks, Inc
McFeat-Smith I, Fowell RJ (1977) Correlation of rock properties and the cutting performance of tunnelling machines. In: Proceedings of a conference on rock engineering, Newcastle Upon Tyne, England, pp 581–602
Meulenkamp F, Alvarez Grima M (1999) Application of neural networks for the prediction of the unconfined compressive strength (UCS) from Equotip hardness. Int J Rock Mech Min Sci 36:29–39
Middleton GV (2000) Data analysis in the earth sciences using MATLAB. Prentice Hall
Roxborough FF, Philips HR (1981) Applied rock and coal cutting mechanics. Workshop course no. 156/81. Australian Mineral Foundation, Adelaide, 11–15 May
Singh VK, Singh D, Singh TN (2001) Prediction of strength properties of some schistose rocks from petrographic properties using artificial neural networks. Int J Rock Mech Min Sci 38:269–284
Tiryaki B, Dikmen AC (2006) Effects of rock properties on specific cutting energy in linear cutting of sandstones by picks. Rock Mech Rock Eng 39(2):89–120