APPLICATIONS
Design In--Information Out Features of a good experimental design are illustrated through a tool-life example
by W. J. Beggs and H. Ginsburg
ABSTRACT--In this expository paper, both statistical and engineering considerations preliminary to the choice of a design are discussed, and the roles of the statistician and experimenter in both the design and analysis of experiments are described.
Introduction: Probability and Statistics
The title of this expository paper, "Design In--Information Out", is intended to be the antithesis of the popular phrase "Garbage In--Garbage Out" which computer programmers have made familiar to us all. The purpose of this paper is to indicate the advantages of consciously considering the statistical implications involved in experimentation. Statistics is the science and art of dealing with uncertainty and variability. Since, even under the best of circumstances, a repeated experiment under the same set of conditions will not yield the same data, statistics is involved in all experimentation. It has been stated that a person drawing conclusions from data cannot choose between using statistics and not using statistics. He is using statistical methods whether he realizes it or not, and his real choice is between good statistical procedures and poor ones. Similarly, a person who plans the collection of data to make inferences is engaged in an area of statistics known as the design of experiments. Again, the choice is between relatively efficient and relatively inefficient statistical procedures. Our hope is that this point will be made clear in this paper.
Let us briefly describe the areas of statistics that are most relevant to experimentation. In fact, since most people use the terms "probability" and "statistics" interchangeably, a good place to start is to contrast the emphasis of these two related subjects.
W. J. Beggs and H. Ginsburg are associated with Bettis Atomic Power Laboratory, Westinghouse Electric Corporation, West Mifflin, Pa. 15122. Paper was presented at 1971 SESA Spring Meeting held in Salt Lake City, Utah on May 18-21.
Confusion is understandable since most elementary books on statistics include sections on probability; furthermore, probability is often described as a branch of mathematics used to explain randomness and uncertainty in nature. From that point of view, it's not clear how it differs from statistics. Rather than appeal to definitions, let's examine the nature of the two subjects.
Probability is deductive in nature. It starts with a known population with known distribution (mathematical representation) and deduces various characteristics of random samples drawn from this population. Although the word population usually refers to people, we are using this word in the statistical sense as a collection or any set of items (or numbers). Thus, if a process produces units with a known proportion of defectives, then one can easily deduce from probability theory the chances of obtaining no defectives in a random sample of n units. Similarly, if the two parameters, specifically the mean and the variance, which uniquely specify a normal or Gaussian distribution are known, then one can deduce, for example, the behavior of the smallest observation in samples of size 3. In general, if the distribution of a population and all the parameters of the distribution are known, then the subject of probability addresses itself to the deduction of the characteristics of random samples drawn from the population.
Now statistics (or statistical inference) represents the "other side of the coin". That is, it uses the information contained in a sample (a "part") to make inferences about the entire population (the "whole") from which the sample came. If we want to add adjectives to the word statistics, mathematical statistics addresses itself to problems of extracting the maximum information contained in a sample in some optimal way (and there are many reasonable ways of defining optimal). Experimental statistics begins one step back and deals with the problems of efficient methods of collecting the sample data. In this sense, the construction of an experimental plan or experimental design can be thought of as part of experimental statistics. However, since the "best" method of data analysis is dependent upon the manner in which the data were collected, experimental problems involve both experimental and mathematical statistics.
As a general statement on experimentation, a relatively inefficient method of data analysis from a well-designed experiment is more informative than the most efficient method of data analysis from a poorly designed experiment. For example, an experimenter recently asked for assistance in analyzing and interpreting his data. He stated that he had collected his data for the purpose of determining the effects of two variables, say $x_1$ and $x_2$, on strength. He noted that $x_1$ was varied from A to B and $x_2$ was varied from C to D. A plot of the data appears in Fig. 1. The numbers on the graph are the coded strengths. Although it's clear that strength increased both with $x_1$ and $x_2$, the data cannot reveal how much of the increase was due to each variable. A more efficient and informative pattern of combinations of $x_1$ and $x_2$ will be illustrated later on.
Experimental Mechanics | 289
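The difficulty with data of the Fig. 1 kind can be sketched numerically. The settings and coefficients below are hypothetical (they are not read from Fig. 1), and the sketch assumes the extreme case in which $x_1$ and $x_2$ were raised together on every run. Under that assumption, quite different splits of the strength increase between the two variables reproduce the same fitted strengths, so no analysis can choose among them.

```python
# Hypothetical coded settings: x2 was raised together with x1 on every run
# (an assumed extreme case of the pattern described for Fig. 1).
x1 = [0.0, 0.25, 0.5, 0.75, 1.0]
x2 = list(x1)

def predict(b0, b1, b2):
    """Strengths implied by the plane b0 + b1*x1 + b2*x2 at the settings above."""
    return [b0 + b1 * u + b2 * v for u, v in zip(x1, x2)]

# Three different stories about which variable raises the strength.
all_due_to_x1 = predict(26.0, 18.0, 0.0)
all_due_to_x2 = predict(26.0, 0.0, 18.0)
shared_equally = predict(26.0, 9.0, 9.0)

# The three stories cannot be told apart from these runs.
print(all_due_to_x1 == all_due_to_x2 == shared_equally)
```

A design in which the two variables are changed independently of one another removes this ambiguity, which is the point taken up later in the tool-life example.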
Statistical Inference
Generally speaking, methods of statistical inference encompass estimation and tests of hypotheses. These methods are related in that they are based on similar theory, but differ in their approach.
Estimation
It's convenient to talk about estimation under the categories of point estimation and interval estimation. Point estimation refers to estimating one or more "parameters" of a population, or estimating the parameters in a mathematical model, based upon a random sample of data. For example, estimating the true carbon content of a heat of steel, or the room-temperature tensile properties of a particular alloy, or the variation in certain magnetic properties of coils of steel are all examples of point estimation. That is, one number is generated to represent one parameter. If we're interested in estimating some characteristic of a heat, say, we do this by examining a random sample from this population. The data in the random sample are not an end in themselves, but serve as a vehicle for making inferences about the entire heat. This is obviously true whenever we use destructive testing.
Although the resulting point estimates are of interest to the experimenter, he really wants some measure of the "reliability" of the estimates. There is an old story of a researcher who was working on a cure for a certain disease in chickens. He reported that with his treatment 33-1/3 percent were cured of the disease and 33-1/3 percent did not respond to the drug. When questioned by his colleague about the remaining 33-1/3 percent he replied, "that chicken got away!" In order to indicate the reliability of their estimates, most experimenters attach a "+ or -" figure to their estimates, or state that "the answer is good to within, say, ±1 percent". But what precisely does ±1 percent mean? Interval estimation is the area of statistical inference that deals with this problem in an objective, efficient, and unambiguous way. Statisticians feel that the techniques of confidence intervals, tolerance intervals and prediction intervals are far superior to the mystical "+ or -" figure we often see but find so hard to understand.
Fig. 1--Response y vs. $x_1$ and $x_2$
290 | June 1972
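One way to give the "+ or -" figure a precise meaning is a confidence interval for a mean. The sketch below is only an illustration: the carbon readings are hypothetical, the function name is ours, and it assumes a random sample from an approximately normal population. The critical values are the standard two-sided 95 percent Student-t values for 2 to 5 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% Student-t critical values, keyed by degrees of freedom.
T_975 = {2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571}

def confidence_interval_95(sample):
    """95% confidence interval for the population mean (normality assumed)."""
    n = len(sample)
    centre = mean(sample)
    half_width = T_975[n - 1] * stdev(sample) / sqrt(n)
    return centre - half_width, centre + half_width

# Hypothetical percent-carbon readings on four specimens from one heat.
carbon = [0.42, 0.45, 0.43, 0.46]
low, high = confidence_interval_95(carbon)
print(f"point estimate {mean(carbon):.3f}, 95% interval ({low:.3f}, {high:.3f})")
```

Read this way, the "+ or -" figure is the half-width of an interval that would cover the true mean in about 95 percent of repeated samples, which is a statement the reader can check and compare.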
Hypothesis Testing
The second broad area of statistical inference is hypothesis testing. This is an integral part of what is generally called the scientific method. To take a most elementary and idealized example, suppose that an experimenter is led to believe that a particular modification in the process will improve some characteristic of interest. In this oversimplification, we are ignoring costs and are ignoring the fact that he is interested in several characteristics, such as strength, corrosion resistance, etc. Let us call the present process "A" and the modified process "B" and the characteristic of interest "yield". In other words, he believes that, if he switches to process B, the yield will be increased. Of course, he is well aware of the fact, before he starts the experiment, that if he prepares two groups of test specimens under process A, the observed sets of yields of both groups will be numerically different due to experimental error (inherent material variation, variation in testing environment, measurement error, etc.). Thus, when he obtains the test results from process A and process B, he must decide whether the observed difference is due merely to chance variation or whether the observed difference is indicative of a true difference in the processes. If he decides that the observed difference in yields is indicative of a real difference between the processes when, in truth, the processes are equivalent, he commits an error. On the other hand, if he ascribes the observed difference to random variation when, in fact, the two processes are not equivalent in yield, he commits an error. Hopefully, he usually makes the correct decision.
If the experimenter is fortunate enough to be working in an area of technology where the experimental error is small, where experimental costs are low, and where the true difference in yield between the processes is large, the relative frequency of his reaching the wrong conclusion will indeed be very low. Among our experimenter friends, however, most are not in the enviable position just described. Statistics cannot eliminate these potential decision errors. However, through formal statistical methods, the magnitudes of these two decision errors can be numerically assessed and controlled, at least for the simpler experiments. To help illustrate the logic of hypothesis testing, consider the analogy with our legal system. The potential decision errors are either that of acquitting a truly guilty person or convicting a truly innocent person, and the "data" consist of the evidence collected.
It should now be clear that the objectives of any experimental program, however complex, may be translated into estimating certain parameters or functions of these parameters (the mathematical model) and/or testing certain hypotheses about the model.
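The two decision errors described above can be given numbers in the simplest idealized case. The sketch below assumes a one-sided comparison of two process means with a known, common standard deviation and equal group sizes; the function name and all numerical inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def decision_error_rates(true_diff, sigma, n_per_process, alpha=0.05):
    """Return (alpha, beta) for a one-sided z-test that process B beats process A.

    alpha: chance of declaring B better when the processes are equivalent.
    beta:  chance of blaming chance when B really is better by true_diff.
    Assumes a known common standard deviation sigma.
    """
    std_error = sigma * sqrt(2.0 / n_per_process)
    # Declare B better only if the observed difference exceeds this value.
    critical = NormalDist().inv_cdf(1.0 - alpha) * std_error
    beta = NormalDist(mu=true_diff, sigma=std_error).cdf(critical)
    return alpha, beta

# Hypothetical case: true improvement of 2 yield units, sigma of 2 units.
_, beta_small = decision_error_rates(true_diff=2.0, sigma=2.0, n_per_process=8)
_, beta_large = decision_error_rates(true_diff=2.0, sigma=2.0, n_per_process=32)
print(round(beta_small, 3), round(beta_large, 3))
```

With the first error held at 5 percent, the second error falls sharply as the number of specimens per process grows, which is the sense in which both errors can be assessed and controlled before the experiment is run.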
Preliminary Considerations in the Design of Experiments
As previously stated, the field of statistics dealing with the planning of the collection of data is called the design of experiments. Some people are quite unhappy with this terminology because they feel it is misleading. If statisticians could literally design experiments in metallurgy, chemical engineering, ceramics and other fields, the engineering schools would all close down and training in statistics would expand tremendously! However, this view presumes the subject of experimental design to belong exclusively to the statisticians. The more proper view is that the theory of design of experiments should be one of the tools of an engineer or scientist engaged in experimentation.
The Experimenter's Role
When one thinks of designing an experimental program, one should immediately ask: "What are the objectives of the experiment?" Although it's been said before, a truly explicit statement of the problem is one of the most difficult and yet important phases of the experimental program. Related questions are "How broad are the desired inferences to be?" and "What population are we interested in?" For example, if the average weight of people in a room is of interest, we all know that selecting one person at random and weighing him ten times is not as informative as selecting five people at random and weighing each one once. Yet, in the context of industrial experiments, some people are convinced that ten observations must be better than five observations, regardless of the experimental design.
Another important question in an experimental program is "What controlled or independent variables (factors) should we study in the experiment?" That is, if we're studying the effects of composition and processing variables on a certain alloy, just which composition and which processing variables should we investigate in the experiment? One can't study everything. Another important question is, "Of the selected factors, which levels should we study?" That is, if one of the factors is temperature, what range of temperatures should we investigate and just how many temperature levels should we study in the experiment? And, if we decide on, say, three equally spaced temperatures, should these be equally spaced on a centigrade scale, log-centigrade scale, inverse absolute-temperature scale, or what? Further, "What criteria or dependent variables should be measured, and under what conditions?" That is, what specifically constitutes, say, a "good high-temperature alloy", and is it justifiable to measure its properties at room temperature? Another consideration in experimentation is "With what sources of uncontrolled variation should we be concerned?", such as between locations within ingots, heat-to-heat variation, biases between testing machines, etc. Unless recognized and dealt with, these sources of variation become part of the overall experimental error.
These are only some of the questions which need to be answered before embarking on the actual experiment; and the person knowledgeable in the subject area, i.e., the experimenter himself, is ultimately responsible for the answers.
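The weighing example above can be made quantitative with a two-component variance model. This is a sketch under assumptions of our own: person-to-person variation and repeat-weighing variation are independent, and the two variance values are hypothetical.

```python
def variance_of_mean(n_people, weighings_each, var_between, var_within):
    """Variance of the grand average as an estimate of the room's mean weight.

    var_between: person-to-person variance; var_within: repeat-weighing variance.
    """
    return var_between / n_people + var_within / (n_people * weighings_each)

# Hypothetical variances (lb^2): people differ a lot, the scale repeats well.
VAR_BETWEEN, VAR_WITHIN = 400.0, 1.0

one_person_ten_times = variance_of_mean(1, 10, VAR_BETWEEN, VAR_WITHIN)
five_people_once = variance_of_mean(5, 1, VAR_BETWEEN, VAR_WITHIN)
print(one_person_ten_times, five_people_once)
```

Repeating the weighing shrinks only the scale error, while the person-to-person term is divided only by the number of people sampled. Ten observations are therefore worse than five whenever the population of interest is the room rather than the one person.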
The Objectives and the Model
After the experimenter has gone through the soul-searching procedure in defining the problem, selecting the independent and dependent variables, etc., the statistician attempts to summarize the way an experimenter is looking at the world by means of a mathematical model. A mathematical model is a symbolic representation of our view of the system we are going to investigate. Table 1 contains some illustrations of mathematical models.

TABLE 1--MATHEMATICAL MODELS
General model: $\eta = f(x; \theta)$
Straight line: $\eta = \theta_0 + \theta_1 x$
2nd order in 2 variables: $\eta = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1^2 + \theta_4 x_2^2 + \theta_5 x_1 x_2$
Nonlinear model: $\eta = 1 - \theta_1 \exp(-\theta_2 x)$

The first line in Table 1 is a terse representation of the fact that the true response $\eta$ is a function of a set of independent variables $\{x\} = \{x_1, x_2, \ldots, x_k\}$, and a set of unknown parameters, $\{\theta\} = \{\theta_1, \theta_2, \ldots, \theta_p\}$. Beneath that appears the familiar model for a straight line, where $\theta_0$ represents the y-intercept and $\theta_1$ the slope. When an experimenter postulates this model, he is not implying that he believes that the true relationship between the dependent variable and independent variable is linear, but that, over the region of interest, this is an adequate representation of the relationship. (The high-temperature-applications experimenter is not inviting the cryogenics engineer to extrapolate his fitted line.)
The next model listed involves two independent variables and six unknown parameters. This second-order model represents a curved surface. An experimenter attempting to optimize a process dependent on two major factors often uses this model. Again, he views this as an empirical model which adequately represents the shape of the response surface over the region of interest.
The final model listed in Table 1 is a nonlinear model involving one independent variable and two parameters (as with the straight line). It is termed nonlinear because the unknown parameters, $\theta_1$ and $\theta_2$, do not enter the model linearly. Using this criterion, polynomial models are linear mathematical models. Nonlinear models often, but not always, are viewed as theoretical or mechanistic models, since they may be a result of a theoretical formulation of an accepted mechanism. In this context, the parameters are of interest in themselves (e.g., rate constants in a chemical reaction), in contrast to empirical models for which interest is often in predicting the response. No matter which mathematical model is used, however, the response y is measured with error; i.e., $y = \eta + \epsilon$.
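The distinction between linear and nonlinear models can be checked numerically: a model is linear in its parameters when its derivatives with respect to the parameters do not depend on the parameter values. The sketch below applies that criterion to two of the Table 1 models. It is a spot check at one setting of x and two hypothetical parameter vectors, not a proof, and the function names are ours.

```python
from math import exp, isclose

def second_order(x, theta):
    """Second-order model in two variables from Table 1."""
    x1, x2 = x
    t0, t1, t2, t3, t4, t5 = theta
    return t0 + t1 * x1 + t2 * x2 + t3 * x1**2 + t4 * x2**2 + t5 * x1 * x2

def nonlinear(x, theta):
    """Nonlinear model from Table 1: 1 - theta1 * exp(-theta2 * x)."""
    t1, t2 = theta
    return 1.0 - t1 * exp(-t2 * x)

def gradient(model, x, theta, step=1e-6):
    """Central-difference derivatives of the model with respect to each parameter."""
    grads = []
    for i in range(len(theta)):
        up, down = list(theta), list(theta)
        up[i] += step
        down[i] -= step
        grads.append((model(x, up) - model(x, down)) / (2.0 * step))
    return grads

def looks_linear_in_parameters(model, x, theta_a, theta_b):
    """True if the parameter derivatives agree at two different parameter vectors."""
    pairs = zip(gradient(model, x, theta_a), gradient(model, x, theta_b))
    return all(isclose(a, b, rel_tol=1e-6, abs_tol=1e-6) for a, b in pairs)

poly_is_linear = looks_linear_in_parameters(
    second_order, (0.5, -1.0), [1, 2, 3, 4, 5, 6], [-2, 0.5, 1, 0, 3, -1])
exp_is_linear = looks_linear_in_parameters(
    nonlinear, 1.0, [0.5, 1.0], [0.5, 2.0])
print(poly_is_linear, exp_is_linear)
```

The polynomial passes even though it is curved in $x_1$ and $x_2$, because "linear" here refers to the parameters and not to the independent variables.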
Features of a Good Design: An Example
Let us, now, turn to an example to illustrate the features of good experimental-design techniques. The problem concerns the life of a cutting tool, and was originally reported by S. M. Wu (Refs. 10, 11). The tool engineers were not only interested in determining the particular combination of cutting speed, feed rate and depth of cut which resulted in longest tool life, but also in determining the nature of the response surface in the region of the maximum response. This is a sensible requirement whenever one is concerned with the ability to maintain a specific setting of a variable under manufacturing conditions, the differences in manufacturing costs for different settings, etc.
A preliminary representation of the tool-life model, without specifying a functional form, is given in Table 2. This simply states that tool life is a function of the cutting speed of the lathe in surface feet per minute, the tool feed rate in inches per revolution, and the depth of cut in inches. The parameters in the model are unspecified at this point.

TABLE 2--REPRESENTATION OF TOOL-LIFE MODEL
$T = f(v, f, d; \theta)$
T: Tool life in min
v: Cutting speed of lathe in ft/min
f: Feed rate in in./rev
d: Depth of cut in in.
$\{\theta\}$: Parameters of the model to be estimated

Based upon data previously collected on "similarly" shaped tools of "similar" materials, the tool engineer observed that the plot of the natural logarithm of tool life vs. the natural logarithm of each of the three factors was roughly linear. A typical graph is presented in Fig. 2. Although the three independent variables can conceptually be set at any positive level, the tool engineer's experience and knowledge of the capacity of the lathe and limiting cutting conditions resulted in a region of practical interest for each of the factors. The ranges of interest for each of the three factors are presented in Table 3. Note that for each variable, a center point or middle level was selected so as to provide equal spacing of the settings on a log scale rather than on an arithmetic scale. This is consistent with the information exhibited in Fig. 2. The values of each of the variables were transformed, for convenience, so that the high, center and low would correspond to +1, 0, and -1 in coded units, respectively. The transformations were:

$$x_1 = \frac{2(\log v - \log 700)}{\log 700 - \log 330} + 1$$

$$x_2 = \frac{2(\log f - \log .022)}{\log .022 - \log .010} + 1$$

$$x_3 = \frac{2(\log d - \log .100)}{\log .100 - \log .049} + 1$$
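The coding transformations above are easy to check against the settings the paper quotes for each factor (700, 480, 330 ft/min; .022, .015, .010 in./rev; .100, .070, .049 in.). The sketch below is one way to write them; the function and dictionary names are ours, and the inverse transformation is an added convenience rather than something given in the paper.

```python
from math import exp, log

# (low, high) limits of the region of interest for each factor.
RANGES = {"v": (330.0, 700.0), "f": (0.010, 0.022), "d": (0.049, 0.100)}

def to_coded(value, low, high):
    """Map a natural setting to coded units: high -> +1, low -> -1."""
    return 2.0 * (log(value) - log(high)) / (log(high) - log(low)) + 1.0

def from_coded(x, low, high):
    """Inverse of to_coded: recover the natural setting from a coded level."""
    return exp(log(high) + (x - 1.0) * (log(high) - log(low)) / 2.0)

centers = {"v": 480.0, "f": 0.015, "d": 0.070}
for name, (low, high) in RANGES.items():
    coded = [to_coded(s, low, high) for s in (low, centers[name], high)]
    print(name, [round(c, 3) for c in coded])
```

The chosen center settings code to values close to, but not exactly, zero because they are rounded versions of the geometric mean of the low and high settings; for cutting speed the exact log-scale midpoint is about 480.6 ft/min.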
Before we can make a rational choice for the conditions of each of the independent variables to be controlled for each run, we must be more explicit about the nature of the mathematical model. Since log-log plots of tool life vs. each of the independent variables were approximately linear, the experimenter hoped that a first-order model would suffice, over the region of interest. Consider the empirical model given in Table 4. When the independent variables are expressed in coded units, it is clear that this model represents a hyperplane, and the unknown parameters $\beta_1$, $\beta_2$, $\beta_3$ are the slopes in their respective directions.

Fig. 2--Log life = a + b log X + error (log life plotted against log X)

TABLE 3--RANGES OF INTEREST FOR THREE FACTORS
          Variable settings                      Coded levels
Level     v (ft/min)   f (in./rev)   d (in.)    $x_1$   $x_2$   $x_3$
High      700          .022          .100        1       1       1
Center    480          .015          .070        0       0       0
Low       330          .010          .049       -1      -1      -1

TABLE 4--EMPIRICAL MODEL
$\log T = \alpha_0 + \alpha_1 \log v + \alpha_2 \log f + \alpha_3 \log d$
$\eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$

In considering a design for this problem, it is clear that at least four runs are required to estimate the four unknown parameters in the model. Additional runs are required to obtain an estimate of experimental error, obtain evidence with respect to the inadequacy of the postulated empirical model, and increase the precision of the estimates of the unknown parameters.
For a problem similar to the tool-life experiment, the design in Fig. 3 was recently proposed by an engineer. Presumably, the experimenter randomized the order of the runs in order to minimize the chance of unknown effects, such as time trends, from being confused with the three variables under study. He proceeded, then, to compare the average of the first 3 runs with the average of the second 3 runs in order to observe the effect on the response of moving $x_1$ from a low level to a high level, the average of the first 3 runs with the average of runs 7, 8 and 9 to learn about $x_2$, and the average of the first 3 with the average of the last 3 for $x_3$. Each average was based on three observations and the variability among triplicates was used as a measure of experimental error. This one-at-a-time approach to problems of this type seems to be widely used because of its intuitive appeal and ease of analysis. However, let's examine a different configuration of runs.

Fig. 3--One-at-a-time design

The runs indicated in Fig. 4 are technically called a three-factor factorial experiment with each factor at two levels, or a $2^3$ factorial. Observe that the eight possible combinations of three variables, each run at two levels, are represented. One can think of these as the vertices of a cube. Perhaps, on first sight, these runs do not appear to have the ease of interpretation as those in Fig. 3. Certainly the only difference in response between runs 1 and 2 is due to $x_1$, in addition to experimental error. However, the same is true of runs 3 and 4. Furthermore, runs 5 and 6 supply another estimate of the effect of $x_1$. Finally, runs 7 and 8 have the same feature. Thus, the average of the four odd-numbered runs vs. the average of the four even-numbered runs is a valid comparison of "low $x_1$" with "high $x_1$". Although we have used only 8 runs rather than 12, each average is based on four observations rather than 3. Furthermore, we used all of our data to study $x_1$ as compared to half the data in the previous case. What about the remaining two variables?
In a similar fashion, we observe that run 3 is the same as run 1, except for $x_2$. Similarly, runs 2 vs. 4, 5 vs. 7, and 6 vs. 8 are all estimates of the effect of changing $x_2$ from its low level to its high level. Thus, the average of the responses to runs 1, 2, 5, and 6 vs. the average of the other half of the data is a legitimate and meaningful comparison of $x_2$. It is as if the entire experiment were devoted to $x_2$. But, we used all of the data for $x_1$. By now, it should be clear that we are going to use all the data for $x_3$, in that the first 4 runs pair off with the last four. Looking at the cube, the comparisons of the responses in the left face with those in the right face tell us about $x_1$, comparison of those in the bottom face with those in the top face tell us about $x_2$, and comparison of those in the front face with those in the back face tell us about $x_3$.

Fig. 4--A $2^3$ factorial design

Not only are factorial experiments efficient (information per observation) in the above sense, but each independent variable is investigated over a wide range of conditions of the remaining variables. With this broader base of experimentation, one is on firmer ground in drawing conclusions. There still remain, however, some additional features which will be explored shortly.

Fig. 5--A $2^3$ factorial plus center points
Fig. 6--The "minus" half fraction of a $2^3$ factorial

We note, first, that there were no independently repeated runs in this configuration, so let's increase the number of runs to 12, as in Fig. 3. The collection of variable settings in Fig. 5, properly run in a random order, represents the design actually used in the tool-life experiment and is obtained by augmenting the $2^3$ factorial with four center points. We can still perform the simple analysis for $x_1$, $x_2$, and $x_3$ as previously indicated but, in addition, these center points enrich our information in a variety of ways. First of all, the variability among the four responses at the center cannot be attributed to the independent variables, cutting speed, feed rate and depth of cut, but is a manifestation of experimental error.
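The efficiency argument for the factorial can be sketched in a few lines. The run order below follows the description in the text ($x_1$ alternating fastest, the first four runs at low $x_3$); the variance formula assumes independent observations with a common error variance, here set to 1 as a hypothetical unit.

```python
from itertools import product

# 2^3 factorial in the run order the text describes: x1 alternates fastest,
# then x2, then x3 (runs 1-4 at low x3, runs 5-8 at high x3).
factorial = [(x1, x2, x3) for x3, x2, x1 in product((-1, 1), repeat=3)]

def variance_of_difference(n_high, n_low, sigma2=1.0):
    """Variance of (mean of n_high runs) minus (mean of n_low runs)."""
    return sigma2 / n_high + sigma2 / n_low

# One-at-a-time plan (12 runs): each factor's comparison uses one triplicate
# against the baseline triplicate.
var_one_at_a_time = variance_of_difference(3, 3)

# Factorial (8 runs): every run takes part in every comparison, four against four.
n_high = sum(1 for run in factorial if run[0] == 1)
n_low = sum(1 for run in factorial if run[0] == -1)
var_factorial = variance_of_difference(n_high, n_low)

for number, run in enumerate(factorial, start=1):
    print(number, run)
print(var_one_at_a_time, var_factorial)
```

Under these assumptions the 8-run factorial estimates each effect with a smaller variance than the 12-run one-at-a-time plan, which is the "information per observation" advantage described above.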
That is, the spread among these repeated runs reflects measurement error, inherent materials variation, changes in ambient conditions, etc. Thus, these runs provide us with a yardstick of "noise" that can be used in the testing of hypotheses and interval estimation. Furthermore, since we have postulated a planar response for the log of tool life, the average of the 4 responses at the center should be the same as the average of the 8 responses at the vertices of the cube, except for random error. Thus, we can use the data to determine whether the response function may contain quadratic effects, i.e., are terms $x_i^2$ needed in the model? Finally, the magnitude of the effect of $x_i$ might change with or be dependent upon the particular level of $x_j$. We describe this nonadditivity of effects or interdependence as interaction. Indeed, one extremely important feature of factorial experiments is that they enable these joint effects to be assessed. That is, we can examine whether there is any evidence of the necessity of including terms of the form $x_i x_j$ in the model. This is not true of the one-at-a-time experiment.
Returning to the tool engineer's initial assumption that a planar-response function could adequately represent the log of tool life over the region of interest, one may ask whether all 12 runs are necessary. The number of observations required for a given experiment is obviously dependent upon the magnitude of experimental error and the required precision of the results, as well as the model under consideration. There is much in the statistical literature dealing with this problem, and we will not discuss this important subject here (e.g., see Refs. 2-9). However, we can still discuss alternate configurations of experimental runs. Consider the 6 runs presented in Fig. 6. Note that 4 of the 8 vertices of a cube are represented, along with 2 center points.
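The three uses of the 12-run design described above (a noise yardstick, a check for quadratic effects, and a check for interaction) can be sketched as simple contrasts. All responses below are hypothetical: the vertex values come from an assumed noise-free surface containing an $x_1 x_2$ term, and the four center values are made up for illustration. They are not the tool-life data.

```python
from itertools import product
from statistics import mean, variance

# Vertices of the cube, x1 alternating fastest.
vertices = [(x1, x2, x3) for x3, x2, x1 in product((-1, 1), repeat=3)]

def assumed_response(x1, x2, x3):
    """Hypothetical noise-free surface with an x1*x2 interaction term."""
    return 10.0 + 2.0 * x1 - 1.0 * x2 + 0.5 * x3 + 1.5 * x1 * x2

y_vertices = [assumed_response(*run) for run in vertices]
y_center = [8.8, 9.1, 9.0, 9.1]  # hypothetical repeated center-point responses

def contrast(signs, responses):
    """Average response where the sign is +1 minus the average where it is -1."""
    plus = [y for s, y in zip(signs, responses) if s > 0]
    minus = [y for s, y in zip(signs, responses) if s < 0]
    return mean(plus) - mean(minus)

effect_x1 = contrast([run[0] for run in vertices], y_vertices)
interaction_x1x2 = contrast([run[0] * run[1] for run in vertices], y_vertices)
curvature = mean(y_vertices) - mean(y_center)  # near zero if the surface is planar
pure_error_variance = variance(y_center)       # noise yardstick, 3 degrees of freedom

print(effect_x1, interaction_x1x2, curvature, pure_error_variance)
```

In practice each contrast would be judged against the pure-error variance rather than read off directly. The sketch only shows where each piece of information comes from in the design.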
However, there are $\binom{8}{4} = 70$ ways of choosing one half of the vertices, and they are not equally informative. Note that the particular set of four chosen has complete balance or symmetry built in. That is, half of the runs are at the low level and half at the high level of each of the 3 factors. Or, in a geometric vein, half are in each face of the cube. This type of configuration is called a fractional factorial and is a balanced subset of a complete factorial.
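The count of 70 and the special status of the balanced halves can be verified by enumeration. In the sketch below, the pairwise-orthogonality condition is an extra criterion of ours, added to single out the halves in which no two factors move together, and identifying the "minus" fraction with the runs where $x_1 x_2 x_3 = -1$ is an assumption about the naming used in the figure captions.

```python
from itertools import combinations, product

vertices = list(product((-1, 1), repeat=3))
halves = list(combinations(vertices, 4))

def is_balanced(runs):
    """Each factor appears twice at its low level and twice at its high level."""
    return all(sum(run[k] for run in runs) == 0 for k in range(3))

def is_orthogonal(runs):
    """No pair of factors is correlated across the runs."""
    return all(sum(run[i] * run[j] for run in runs) == 0
               for i, j in ((0, 1), (0, 2), (1, 2)))

balanced = [h for h in halves if is_balanced(h)]
balanced_and_orthogonal = [h for h in balanced if is_orthogonal(h)]

# Assumed naming: the sign refers to the product x1*x2*x3 in every run.
minus_fraction = tuple(r for r in vertices if r[0] * r[1] * r[2] == -1)
plus_fraction = tuple(r for r in vertices if r[0] * r[1] * r[2] == 1)

print(len(halves), len(balanced), len(balanced_and_orthogonal))
print(minus_fraction)
```

Balance in each factor alone is not quite enough: some balanced halves still have two factors changing in lockstep. The two halves defined by the sign of $x_1 x_2 x_3$ avoid this, which is why they are the ones used as fractional factorials.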
[Fig. 7: The "plus" half fraction of a $2^3$ factorial (coded levels of $x_1$, $x_2$ and $x_3$ for runs 1 to 6)]

[Fig. 8: A central composite design]
This collection of experimental runs has many of the same features as the larger set: it covers the region fairly well; we can easily estimate the parameters in a first-order model in 3 variables; we can compare the 2 center points with the 4 perimeter runs to get a clue on nonplanarity; and we can use the variability between the 2 center points to get some feeling for the magnitude of experimental error. The real efficiency of fractional factorials occurs with experimental programs involving a large number of variables, say, in a screening situation.

However, suppose that two batches of material are to be used in this experiment, and we expect considerable batch-to-batch variation. We would run the above 6 tests on the first batch, and the combinations presented in Fig. 7 on the second batch. In the statistical literature each batch of runs is called a block. Blocking prevents known or suspected extraneous sources of variation from distorting the estimated effects of the independent variables under study, and increases the precision of the experiment by removing these influences from the measure of experimental error. Some examples of blocks are material batches, furnace runs, operators, time periods, testing machines, etc.; for example, two experimenters could have performed the tool-life study. If we superimpose Fig. 6 on Fig. 7, we end up with the same set of runs listed in Fig. 5. However, we can analytically remove and assess the magnitude of the block difference.
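The way a block difference can be removed analytically may be sketched numerically. The example below is illustrative only and rests on stated assumptions: the two blocks are taken to be the two halves of the $2^3$ factorial split by the sign of $x_1 x_2 x_3$, each with two center points; the plane coefficients and the batch shift are hypothetical values, not results from the tool-life study; and the response is noiseless.

```python
from itertools import product

import numpy as np

# 2^3 factorial in coded units plus 4 center points: the 12 runs of the full design.
vertices = np.array(list(product([-1.0, 1.0], repeat=3)))
centers = np.zeros((4, 3))
runs = np.vstack([vertices, centers])

# Block assignment (assumption): vertices are split by the sign of x1*x2*x3,
# and two center points go into each block.
block = np.concatenate([np.prod(vertices, axis=1), [-1.0, -1.0, 1.0, 1.0]])

# Hypothetical plane and batch-to-batch shift, chosen only for illustration.
beta = np.array([4.0, -0.6, -0.3, -0.1])  # intercept and three slopes
block_shift = 0.5                          # half the difference between batches
X = np.column_stack([np.ones(len(runs)), runs])
y = X @ beta + block_shift * block         # noiseless response

# The block column is orthogonal to the intercept and to every factor column ...
print(X.T @ block)

# ... so fitting with a block term recovers both the plane and the block effect.
X_blocked = np.column_stack([X, block])
estimate, *_ = np.linalg.lstsq(X_blocked, y, rcond=None)
print(np.round(estimate, 6))
```

Because the block column is orthogonal to the other columns, the batch difference is estimated separately from the slopes and does not inflate the estimate of experimental error.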
Not only is the procedure of subdividing the entire experimental program into blocks a useful technique for "controlling" such things as time trends, but, at each stage of experimentation, one can use the accumulated information to modify his original views and make changes in the original experiment. For example, if early in the experiment it is clear that the low level of $x_1$ gives poor results, one can move to a different region. That is, when set-up costs are not too high and the turn-around time is not too long, a sequence of blocks may be the most efficient procedure.

To illustrate a further feature of good experimental design, suppose that we ran the dozen runs indicated in Fig. 5, and we had evidence that a
first-order model was an inadequate representation of the response function. Since the true function is unknown, most engineers would assume that, over the region of interest, the unknown function could be reasonably approximated by a Taylor's series expansion, truncated after the second-order terms. This series approximation can be expressed as a second-order polynomial. The complete second-degree polynomial in three variables may be represented as:

$$\eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{33} x_3^2 + \beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \beta_{23} x_2 x_3.$$

Note that the first four terms are those of a plane (Table 4), the next three are quadratic terms, and the last three are bilinear or interaction terms. Since there are 10 unknown parameters in this model, we need at least 10 distinct points in the factor space to estimate these $\beta$'s. Fortunately, we don't have to start all over, but can augment the existing design to enable us to evaluate each of the parameters. If we select six additional runs, two along each of the three major axes, we have what is called a central composite design. If we also want to treat the new group of runs as a block, we should also take a pair of runs at the center. The geometrical representation of a central composite design in three variables is presented in Fig. 8. The spacing one uses for the "star points" along the major axes is determined by one's state of knowledge of the system and the goals of the experiment. The interested reader is encouraged to see the papers by George Box (Ref. 1) and his colleagues concerning the subject of "Response Surface Methodology".

The feature just described, then, is that of sequentially building a design to facilitate examination of a higher-order model. This was done by first exploring a first-order model and then augmenting the design used for that purpose by a small set of extra observations.
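Why the original dozen runs cannot fit the full second-order model, while the augmented design can, shows up in the rank of the model matrix. The sketch below is an illustration under assumptions: the star-point distance `alpha = 2.0` is an arbitrary choice (the text leaves the spacing to the experimenter), and `second_order_matrix` is a hypothetical helper name.

```python
from itertools import product

import numpy as np


def second_order_matrix(points):
    """Columns: 1, x1, x2, x3, x1^2, x2^2, x3^2, x1x2, x1x3, x2x3."""
    x1, x2, x3 = points.T
    return np.column_stack([
        np.ones(len(points)), x1, x2, x3,
        x1**2, x2**2, x3**2,
        x1 * x2, x1 * x3, x2 * x3,
    ])


alpha = 2.0  # star-point distance; an arbitrary illustrative choice

vertices = np.array(list(product([-1.0, 1.0], repeat=3)))
first_block = np.vstack([vertices, np.zeros((4, 3))])      # the original 12 runs
star = np.vstack([alpha * np.eye(3), -alpha * np.eye(3)])  # two runs per axis
second_block = np.vstack([star, np.zeros((2, 3))])         # 6 star + 2 center
composite = np.vstack([first_block, second_block])         # 20 runs in all

rank_first = np.linalg.matrix_rank(second_order_matrix(first_block))
rank_composite = np.linalg.matrix_rank(second_order_matrix(composite))
print(rank_first, rank_composite)
```

The script prints `8 10`. In the factorial-plus-center design the three squared columns coincide (each equals 1 at every vertex and 0 at the center), so only the sum $\beta_{11} + \beta_{22} + \beta_{33}$ can be estimated; this is what the comparison of the center average with the vertex average measures. The star points separate the three squared columns, and all 10 parameters become estimable.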
Experimental Mechanics | 295
Summary

Let us now review the desirable design features present in the tool-life problem. First, after a series of preliminary considerations, the objectives of the experiment were determined. They were, first, to find the settings of the three variables (cutting speed, feed rate and depth of cut) which maximized tool life and, second, to describe the response surface in the region of experimentation. Next, the experimental region (Table 3) was determined and a model (Table 4) was postulated as representative of the true response in the experimental region. Finally, a design (Fig. 5) was chosen to carry out the experiments. Runs are made in random order to prevent systematic effects, such as a time trend, from invalidating the results. The outstanding features of the $2^3$ factorial plus center points are:

(1) the total 12 runs may be blocked, i.e., divided into two groups of 6 runs, each half being adequate to fit the hypothesized planar model;
(2) both the complete design in 12 runs and the half fraction in 6 runs are easy to visualize, easy to analyze, and adequately cover the region of experimental interest;
(3) there are enough observations to allow estimation of the error variance;
(4) the replicated center points provide the capability of testing for the inadequacy of the model; and
(5) a second-order model can be investigated sequentially by simply augmenting the first-order factorial design with an additional block of runs (Fig. 8).

Finally, it should be noted that nonlinear models (Table 1) can also be investigated by designs such as the one discussed here. These nonlinear models are often theoretical models based on the "laws of nature", rather than empirical models fitted to the data.
Hence the forms of the model are often difficult to analyze and computation of estimates becomes an iterative procedure. However, in some instances, a theoretical, nonlinear model can be transformed to a linear model. In fact, the original model proposed for the tool-life problem was

$$k = T v^{a} f^{b} d^{e},$$

where $k$, $a$, $b$ and $e$ are constants and $T$, $v$, $f$ and $d$ are the variables defined in Table 2. Note that by taking the natural log on both sides of the equality, we obtain

$$\log k = \log T + a \log v + b \log f + e \log d.$$
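Because the logged equation is linear in $\log v$, $\log f$ and $\log d$, ordinary least squares on the logs recovers the constants. The following is a minimal sketch under stated assumptions: the constants and the low/high settings are hypothetical (none are taken from the paper), the data are noiseless, and the logs are left uncoded rather than scaled to $\pm 1$.

```python
import numpy as np

# Hypothetical constants for k = T * v**a * f**b * d**e (not values from the paper).
k, a, b, e = 5.0e4, 2.0, 0.8, 0.3

# Hypothetical low/high settings of speed v, feed f and depth of cut d,
# arranged as the 8 vertices of a 2^3 factorial.
v = np.array([300.0, 600.0, 300.0, 600.0, 300.0, 600.0, 300.0, 600.0])
f = np.array([0.010, 0.010, 0.020, 0.020, 0.010, 0.010, 0.020, 0.020])
d = np.array([0.05, 0.05, 0.05, 0.05, 0.10, 0.10, 0.10, 0.10])

# Noiseless tool life implied by the power-law model.
T = k / (v**a * f**b * d**e)

# Solving the logged equation for log T gives a model that is linear in the logs:
# log T = log k - a*log v - b*log f - e*log d.
X = np.column_stack([np.ones_like(v), np.log(v), np.log(f), np.log(d)])
coef, *_ = np.linalg.lstsq(X, np.log(T), rcond=None)

print(np.round(coef, 6))                     # fitted intercept and three slopes
print(np.round([np.log(k), -a, -b, -e], 6))  # log k, -a, -b, -e for comparison
```

The two printed rows agree, so the linear fit returns $\log k$ as the intercept and $-a$, $-b$, $-e$ as the slopes. No iterative nonlinear fitting is needed.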
Rearranging terms and letting $\beta_0 = \log k$, $\beta_1 = -a$, $\beta_2 = -b$, and $\beta_3 = -e$, we then obtain the linear model in Table 4. Thus, an experimenter should be alert to the possibility of greatly simplifying his calculations by transformations of the variables in his theoretical, nonlinear model.

296 | June 1972

The reader has, no doubt, noted that the type of design presented here is not necessarily the best
design for all situations. Indeed, the design to be used is definitely a function of the goals of the experiment. It is quite conceivable that one would require a design which concentrates on a small area of the experimental region, rather than symmetrically covering the entire region. As a general rule, the less one knows of his process, the more important it is to cover the entire experimental region well. This is especially important when performing a screening experiment in which the goal is to find the most important variables in a process from a large number of possible variables in a small number of runs. If the goal is to accurately predict a response in a certain region, as opposed to estimating a particular set of parameters precisely, then the factorial-type design presented here is again especially useful and efficient. When an experimenter finally reaches a state of knowledge about his process at which he either knows the best model to describe his response and wishes to obtain the best estimates of the parameters of his model, or wants to discriminate between two or three possible theoretical models, then a more specialized design is often required. However, even here, a factorial design is often used as a building block to be augmented by a collection of specially chosen design points, often performed sequentially.

The discussion above clearly indicates the need for continuous interaction between the experimenter and the statistician. The experimenter must state clearly his goal for each experiment. As his state of knowledge advances, his goals change. As his goals change, his design needs change.
It is the statistician's role to see that these needs are fulfilled in the most efficient way possible. Communication between the experimenter and statistician is the key, then, to producing an experiment for which the best design goes in and the most information comes out.

References
1. Box, G. E. P., "The Exploration and Exploitation of Response Surfaces," Biometrics, 10, 16-60 (1954).
2. Box, G. E. P. and Hunter, J. S., "The $2^{k-p}$ Fractional Factorial Designs," Technometrics, 3, Part I, 311-351; Part II, 449-458 (1961).
3. Cochran, W. G. and Cox, G. M., Experimental Designs, 2nd ed., John Wiley and Sons, New York, 1966.
4. Conner, W. S., Zelen, M. and Deming, L., "Fractional Factorial Experimental Designs for Factors at Two Levels," National Bureau of Standards, Applied Mathematics Series, 1957.
5. Cox, D. R., Planning of Experiments, John Wiley and Sons, New York, 1965.
6. The Design and Analysis of Industrial Experiments, ed. by O. L. Davies, 2nd ed., Hafner Publishing Company, New York, 1956.
7. Hooke, R., Introduction to Scientific Inference, Holden-Day, Inc., San Francisco, 1963.
8. Peng, A., The Design and Analysis of Scientific Experiments, Addison-Wesley, Reading, Mass., 1967.
9. Wilson, Jr., E. B., An Introduction to Scientific Research, McGraw-Hill, New York, 1952.
10. Wu, S. M., "Tool-Life Testing by Response Surface Methodology," Journal of Engineering for Industry, Trans. ASME, Series B, 86, Part I, 105-110; Part II, 111-116 (1964).
11. Wu, S. M. and Meyer, R. N., "Cutting-Tool Temperature-Prediction Equation by Response Surface Methodology," Journal of Engineering for Industry, Trans. ASME, Series B, 86, 150-156 (1964).