Vegetatio Vol. 29, 2: 89-99, 1974
CATENATION: QUANTITATIVE METHODS FOR THE DEFINITION OF COENOCLINES* Imanuel NOY-MEIR** Department of Botany, The Hebrew University of Jerusalem, Jerusalem, Israel. Keywords: Catenation, Coenocline, Continuity analysis, Ordination, Parametric mapping
Introduction
Methods of ordering phytosociological samples are usually divided into two major approaches (e.g. Greig-Smith, 1964): classification, which places samples in a structure of discrete groups, and ordination, which arranges samples in reference to continuous axes. Axes of ordinations have often been interpreted as vegetational 'gradients' or 'coenoclines' (Whittaker, 1960, 1967, 1970; van der Maarel & Leertouwer, 1967) which were expected to correspond to environmental gradients. However, practically all mathematical and graphical ordination methods used by plant ecologists imply a linear relationship between the axes or components extracted and the original species variables. Linearity is inherent both in principal components analysis in its many versions (e.g. Goodall, 1954; Orloci, 1966, 1973) and in the stand-defined ordination techniques (Bray & Curtis, 1957; Orloci, 1966; van der Maarel, 1969; Swan, Dix & Wehrhahn, 1969; Gauch & Whittaker, 1972b). But the response of species to environmental gradients is in general far from linear; if a 'bell-shaped' optimum-curve response is assumed, environmental gradients or coenoclines will be represented as curves in ordination-space (Swan, 1970; Noy-Meir & Austin, 1970; Austin & Noy-Meir, 1971; Gauch & Whittaker, 1972a, 1972b; Groenewoud, 1973; Whittaker & Gauch, 1973). Curved configurations of points indeed appear in many published ordinations of vegetation data (Goodall, 1954; Bray & Curtis, 1957; Ayyad & Dix, 1964; Bannister, 1968; Flenley, 1969; Moore et al., 1970; Norris & Barkham, 1970, and others). Attempts to interpret each axis as an environmental factor are likely to produce anything from slightly distorted to wholly misleading results. Such an interpretation of axes in linear phytosociological ordinations is valid only for narrow ranges of habitat diversity, within which species responses are approximately linear or at least monotonic.

Once this problem is recognized, several approaches are possible (Austin & Noy-Meir, 1971): to limit ordination to sample sets with low 'beta-diversity', with the degree of distortion measured (Gauch & Whittaker, 1972a, 1972b; Gauch, 1973a); to interpret ordination axes as phytosociological entities of a different type (Noy-Meir, 1971); to apply linearizing transformations to the data before ordination (Swan, 1970; Austin & Noy-Meir, 1971; Gauch, 1973b); or to apply methods for detecting coenoclines without assuming linearity. This paper deals with the last possibility, and reports the results of experiments with one mathematical method which seems to be of some promise.

* Nomenclature of taxa mentioned in examples d and e follows Gleason, 1950, the new Britton and Brown illustrated Flora of the N.E. United States and adjacent Canada.
** This was part of the work for a Ph.D. thesis at the Department of Biogeography and Geomorphology, Research School of Pacific Studies of the Australian National University. I am grateful to my supervisors, Donald Walker and Bill Williams, for their advice, to Mike Austin and Mike Dale for useful discussions and to Hugh Gauch and Robert Whittaker for the manuscript of their paper on Gaussian ordination and for the ensuing discussion.
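The non-linearity described above can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes a hypothetical single gradient with 41 evenly spaced sites and Gaussian species curves of equal height and width (the function name and all parameter values are illustrative). It shows that Euclidean distance in species-space grows with gradient separation only over short ranges and then saturates once two sites share no species.

```python
import numpy as np

def bell_shaped_quantities(site_pos, optima, sd=2.0, peak=100.0):
    """Quantity of each species (columns) at each site (rows), assuming
    Gaussian response curves along a single gradient (illustrative model)."""
    z = np.asarray(site_pos, dtype=float)[:, None]
    u = np.asarray(optima, dtype=float)[None, :]
    return peak * np.exp(-((z - u) ** 2) / (2.0 * sd ** 2))

sites = np.arange(0.0, 41.0)          # 41 evenly spaced sites (hypothetical)
optima = np.arange(-10.0, 51.0, 2.0)  # species optima, extending past both ends
Y = bell_shaped_quantities(sites, optima)

def site_distance(i, j):
    """Euclidean distance between sites i and j in species-space."""
    return float(np.linalg.norm(Y[i] - Y[j]))

d_near = site_distance(0, 2)    # sites 2 gradient units apart
d_mid = site_distance(0, 20)    # sites 20 units apart
d_far = site_distance(0, 40)    # sites 40 units apart
print(d_near, d_mid, d_far)
```

In this sketch the distance between sites 20 units apart is practically the same as between sites 40 units apart, which is why a linear ordination tends to bend a long gradient into a curve.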
Definition of catenation
Catenation is suggested here as a collective name for all methods which are designed to order elements (e.g. sites, species) in continuous sequences, catenae, or on multidimensional surfaces defined by several such catenae, in a way that optimally accounts for local similarities. The keywords in this definition, which distinguish catenation from ordination, are 'linear' (which is absent) and 'local'; the latter also implies a stronger requirement for 'continuity' than in ordination. It is mainly the 'chaining' of adjacent or relatively proximate elements which is to be efficiently described, while the relations between distant elements are less important.

Geometrically, catenation involves a search for lines or surfaces of best fit, along which points tend to be concentrated in multidimensional space, without requiring that the lines be straight or the surfaces flat. It may equally be regarded as seeking an optimal curvilinear coordinate system embedded in a Euclidean space; or as attempting to 'unfold' (or 'map') a non-linear configuration into a flat one (Fig. 1). It could thus be classified as non-linear or curvilinear ordination. But it is different enough in properties and interpretation from linear ordination, and has been confused with it often enough in phytosociology, to warrant a special term.

The term 'catena' for a curvilinear dimension of vegetational variation is not meant to imply any relationship with a soil 'catena'. A catena may correspond to an environmental gradient, or it may correspond to a successional sere. A catena obtained from phytosociological data is very close conceptually to a coenocline (Whittaker, 1960, 1970). Catenation methods, applied to such data, are in effect methods for the detection and definition of coenoclines.

That ecologists are interested in finding vegetational catenae (gradients, series, coenoclines) is apparent from the fact that ordinations have often been interpreted as if they were catenations. 'Direct gradient analysis' (Whittaker, 1956, 1967) does define vegetation catenae, but these are based on extrinsic (environmental) data, not directly on phytosociological data. Methods for true intrinsic catenation are rather underdeveloped in ecology; but some methods developed in other disciplines may be useful.

Fig. 1. A curved configuration of points and a 'best' curvilinear dimension (catena) through it; z = projection of point on catena, d = distance from catena.
Survey of methods

a. Leading dominants method
This original 'continuum analysis' procedure of Curtis & McIntosh (1951) aims to order stands in a sequence along which the abundances of at least the major dominants are distributed continuously and unimodally as far as possible. It is obviously a non-linear method (catenation). The geometric ordination method of Bray & Curtis (1957) was not, as claimed, just an objectivization of the original method but a switch from a unimodal to a linear model (though apparently one which accommodates moderate nonlinearities better than other linear models; Austin & Noy-Meir, 1971; Gauch & Whittaker, 1972b). In the leading dominants method the 'best' order is chosen subjectively from among various rearrangements of a matrix of quantities of various dominants in stand groups defined by the leading dominant. Other species, and individual stands, are then positioned along the sequence by more or less objective procedures. The main limitation of the method is that the subjective ordering of dominants becomes difficult when there are more than about five of them, or when a single overriding gradient cannot account for most of the variation (Gimingham et al., 1966; Buell et al., 1966). In any case only one such catena can be extracted.

b. Plexus diagrams
Ecologists have often drawn by hand diagrams which express graphically relationships between stands (or communities), in which the adjacency of two stands (or the boldness of the line connecting them) is more or less proportional to some measure of the compositional similarity between them (e.g. Beals & Cope, 1964; Groenewoud, 1965). Many examples of such 'plexus diagrams' of stands, species or communities are reviewed by Whittaker (1967) and McIntosh (1973). Since the graphical representation of the similarity matrix is partly subjective, it is not constrained to linear relationships and may achieve some 'unfolding'. This subjectivity also means that different people may derive rather different plexus diagrams from the same data, or that (in complex data) important relationships may be overlooked.

c. Polynomial ordination ('non-linear factor analysis')
This approach was developed by McDonald (1962, 1967) as an extension to factor analysis. The postulated model allows the original variables (e.g. species) to depend on the factors or components (z) not only linearly, but also on $z^2$, $z^3$, etc., and on products of two or more factors. McDonald shows that if there are non-linear configurations in the space defined by linear ordination, such a model is appropriate. The procedure is to obtain a normal principal components solution, search for non-linear configurations and identify the type of polynomial required, then iterate to a rotation which maximizes fit to an orthonormal polynomial of this type. As a method of catenation it is constrained to curves and surfaces of certain forms; but many common forms could possibly be approximated by low-order polynomials. The method may be useful in phytosociology.

d. Multidimensional curve-seeking
A method described by Sneath (1966) attempts to pull points in a curved swarm together by a 'gravitational' process in order to chain them by nearest-neighbour linkage. This method allows the detection of branched and closed sequences, but not of curved surfaces. It is computationally time-consuming and apparently rather sensitive to values of several predetermined parameters (M. B. Dale, personal communication).

e. 'Proximity analysis' and non-metric factor analysis
Various methods for finding the 'lowest dimensionality' in a given set of (usually qualitative) data, without assuming linearity, have been developed by psychologists (Guttman, 1955; Shepard, 1962; Kruskal, 1964). Most of them do assume a monotonic relation between the dimensions and the original variables, or between distances in the new and in the original space.
However, a modification by Shepard (Shepard & Carroll, 1966) requires monotonicity only locally (for short distances), and may be appropriate for the detection of coenoclines. Another modification of multidimensional scaling, by Kruskal, may also be suitable for ecological data (Orloci, 1973).

f. Gaussian ordination
This method (Gauch, Chase & Whittaker, 1973) is based on the observation that the distribution of species along environmental gradients often shows a bell-shaped form (Whittaker, 1956, 1967, 1970). It therefore assumes that the distribution of species along coenoclines generally is of such form, and attempts to maximize fit over all species-in-sites quantities, $y_{ik}$, to the bell-shaped Gaussian function:

$$y_{ik} = y_k^0 \, e^{-(z_i - z_k^0)^2 / C_k} \qquad (1)$$

where:
$y_k^0$ = the quantity of species k at its optimum
$z_i$ = the position of site i on the coenocline
$z_k^0$ = the position of the optimum of species k on the coenocline (the 'mean' and 'mode' of the distribution)
$C_k$ = a measure of the width of the curve ($C_k = 2\sigma_k^2$, where $\sigma_k$ is the 'standard deviation').

The problem is to find a set of site positions, $z_i$, and of species distribution parameters $y_k^0$, $z_k^0$, $C_k$ so that the sum of squares of deviations of observed $y_{ik}$-values from those predicted by equation 1 is minimal. Since the equation is rather complex and many unknown parameters are usually involved, the finding of this least-squares solution is computationally difficult. Gauch et al. developed algorithms for an iterative search, starting from initial guesses (see also Gauch & Chase, 1973), which were effective in most cases tested. The method successfully recovered coenoclines from simulated data and from real vegetation data with a predominant gradient.

g. Parametric mapping
This is a rather general and unconstrained catenation method developed by Carroll (Shepard & Carroll, 1966). Since it appeared most promising it was thoroughly tested on vegetation data and is described in detail.
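The least-squares criterion of Gaussian ordination (equation 1) can be written out as a short sketch. This is only an illustration of the objective function, not the iterative search algorithm of Gauch et al.; the function names and the small example values are assumptions made here for demonstration.

```python
import numpy as np

def gaussian_response(z_sites, y_opt, z_opt, c_width):
    """Predicted quantities y_ik = y_k^0 * exp(-(z_i - z_k^0)^2 / C_k),
    returned as a sites-by-species array."""
    z = np.asarray(z_sites, dtype=float)[:, None]
    y0 = np.asarray(y_opt, dtype=float)[None, :]
    z0 = np.asarray(z_opt, dtype=float)[None, :]
    c = np.asarray(c_width, dtype=float)[None, :]
    return y0 * np.exp(-((z - z0) ** 2) / c)

def residual_sum_of_squares(Y_obs, z_sites, y_opt, z_opt, c_width):
    """Sum of squared deviations of observed from predicted quantities."""
    Y_fit = gaussian_response(z_sites, y_opt, z_opt, c_width)
    return float(np.sum((np.asarray(Y_obs, dtype=float) - Y_fit) ** 2))

# Hypothetical example: 5 sites, 3 species.
z_true = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_opt = np.array([50.0, 80.0, 30.0])
z_opt = np.array([0.5, 2.0, 3.5])
sigma = np.array([1.0, 1.5, 0.8])
c_width = 2.0 * sigma ** 2          # C_k = 2 * sigma_k^2
Y_obs = gaussian_response(z_true, y_opt, z_opt, c_width)

ss_true = residual_sum_of_squares(Y_obs, z_true, y_opt, z_opt, c_width)
ss_wrong = residual_sum_of_squares(Y_obs, z_true[::-1], y_opt, z_opt, c_width)
print(ss_true, ss_wrong)
```

With error-free data the true site positions give a residual of zero, while a wrong arrangement of the sites (here simply reversed) gives a large residual; a search procedure would adjust site positions and species parameters together to reduce this quantity.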
Continuity analysis (parametric mapping)
The name 'continuity analysis' seems more appropriate for this method in an ecological context than 'parametric mapping', as suggested by Carroll. The method attempts to find a minimal number (p) of dimensions (catenae) to which the observed variables (e.g. species) are related by functions or response surfaces which are as 'smooth' or continuous as possible, but without any other constraints or assumptions as to the form of these functions or surfaces.

A typical vegetation composition matrix Y (n x m) would consist of the quantities (cover, abundance, importance, or some transformation of these) of each of m species (variables) in each of n sites, stands or records (points in vegetation-space). In some cases the matrix Y may already be a simplification from an original data matrix, e.g. each 'stand' may be the average of a group of sites, or each variable may be an ordination axis or component expressing a generalized phytosociological entity (community-type or nodum; Noy-Meir, 1971). In any case, $y_{ik}$ is the value for the k-th variable (species, component) in the i-th stand (or group of stands).

In continuity analysis the n sites are now to be ordered along a new set of p 'underlying dimensions' or catenae, i.e. a new n x p matrix Z is obtained, where $z_{ig}$ is the value of the i-th site (or the position of the point representing it) on the g-th catena. The method seeks such a set of z-values that the relationship, over all sites-points, between the original y-variables and the new z-dimensions should fit as closely as possible to functions which are as continuous as possible. In other words, the method seeks a coenocline (or coenoclines), the response curves (surfaces) of all species to which are as smooth as possible.

The requirement for 'smoothness' or 'continuity' may be interpreted as: a very small change in each of the z-dimensions (coenocline) should involve only a small change (and not a 'jump') in each y-variable (species). Or, mathematically, the absolute value (regardless of sign) of any $\Delta y$ associated with a $\Delta z$ should be as small as possible for very small values of $\Delta z$, i.e. for adjacent points on the catena. Maximal smoothness for the relation of one variable to one catena can thus be obtained by a set of z-values for which the sum over all pairs of points i, j
$$\sum_i \sum_j \frac{(y_i - y_j)^2}{(z_i - z_j)^2} \, w_{ij} \qquad (2)$$

is minimal. In this equation $w_{ij}$ is a 'weighting' which should be a measure of adjacency of the two points i, j, since smoothness is defined locally. Carroll's first suggestion for the adjacency weighting was:

$$w_{ij} = \frac{1}{(\Delta z)^2} \qquad (3)$$

therefore the quantity to be minimized is:

$$q = \sum_i \sum_j \frac{(\Delta y)^2}{(\Delta z)^4} \qquad (4)$$
In the multidimensional case, the requirement is for smooth response surfaces of all m variables on all p catenae, therefore $\Delta y$ and $\Delta z$ have to be replaced by multivariate distances: $d_{ij}^2$, the squared Euclidean distance between points-sites i and j in the space defined by the original variables-species:

$$d_{ij}^2 = \sum_k^m (y_{ik} - y_{jk})^2 \qquad (5)$$

and $D_{ij}^2$, the squared distance between them in the space defined by the new dimensions-catenae:

$$D_{ij}^2 = \sum_g^p (z_{ig} - z_{jg})^2 \qquad (6)$$

Maximum smoothness for multiple response surfaces is therefore achieved by minimizing

$$Q = \sum_i^n \sum_j^n \frac{d_{ij}^2}{D_{ij}^4} \qquad (7)$$
However, the scale used for the new z-dimensions is arbitrary. Thus Q could be made as small as desired simply by increasing all z-values and thereby increasing $D^2$. To resolve this indeterminacy, Q has to be normalized by the same power of $D^2$, e.g.

$$K = \frac{\sum_i \sum_j d_{ij}^2 / D_{ij}^4}{\left[ \sum_i \sum_j 1 / D_{ij}^2 \right]^2} \qquad (8)$$

This normalized index K is unaffected by arbitrary scale changes in z and D. It was defined as the 'index of continuity' by Carroll. Actually, it is a measure of 'non-continuity', since it is minimized in order to obtain maximum overall continuity. The choice of the second power (of $\Delta z$) in defining the measure of adjacency (equation 3) and of the particular form of normalization (equation 8) is to some extent arbitrary. Carroll suggested also a more general form of the continuity index:

$$K^* = \frac{\sum_i \sum_j (d_{ij}^2)^{\alpha} / (D_{ij}^2)^{\beta}}{\left[ \sum_i \sum_j (D_{ij}^2)^{\gamma} \right]^{-\beta/\gamma}} \qquad (9)$$
In this form the power parameters $\alpha$, $\beta$, $\gamma$ can have different values, provided that $\gamma = \alpha - \beta$. Equation 8 is a special case of eq. 9, with $\alpha = 1$, $\beta = 2$, $\gamma = -1$, which was suggested by Carroll as the initial choice. The most important of these parameters is $\beta$, which determines the steepness of the adjacency weighting. The larger the value of $\beta$, the stronger is the weighting in favour of pairs of points which are very close on the catena and against points which are farther away. Larger values of $\beta$ thus imply a stronger emphasis on local (rather than global) smoothness of responses of species to the catena, or on correct representation by the catena of the Euclidean distances (similarities) between relatively similar sites only (i.e. disregarding widely separated pairs of sites). In view of the fact that Euclidean distances between sites distort coenocline distances most strongly for large distances (Austin & Noy-Meir, 1971; Gauch, 1973a; Groenewoud, 1973) the use of large values of $\beta$ seems justifiable.

When $\alpha$, $\beta$ and p are specified, the solution is thus defined as the set of positions of all sites-points on all dimensions-catenae, $z_{ig}$, for which K (or K*) is minimal. There remains the problem of finding this minimum-K solution. The program PARAMAP searches for this solution iteratively by changes in all z-values, using a method of steepest descent (in K) and starting from an initial guessed set of z-values. To aid in convergence to the minimum point, an additional 'error variable' e was introduced in the definition of K. The actual data from which the solution is computed is the matrix of distances between sites $d_{ij}^2$ (equations 8, 9). Thus the input to the program may either be a matrix of between-site similarities or distances, or the original site by species data matrix Y (in which case the distances are calculated first).

The solution depends on the specified number of dimensions or catenae, p. Catenations in p = 1, 2, 3 ... dimensions are obtained successively, until K is considered to be small enough, or until the allowance of new dimensions no longer appreciably affects the configuration. To check this and to get a clearer general picture, the final configuration (if p > 1) is rotated to its principal components, so that the catenae are ordered by their contribution to the variation between sites. The sum of squares of z-values for each component-catena, $\lambda_g$, then measures the amount of variation accounted for by each dimension, or the 'importance' of the catena.
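The continuity index and its minimization can be sketched in a few lines. This is a hedged illustration only: the index is coded in the form K* = [sum of (d²)^α / (D²)^β] / [sum of (D²)^γ]^(-β/γ) with γ = α - β, which is this sketch's reading of equations 8 and 9; each unordered pair is counted once (this rescales K by a constant); and the descent uses a finite-difference gradient with step halving, which is a simplification and not the PARAMAP implementation (no 'error variable' e is included). All function names and the test data are hypothetical.

```python
import numpy as np

def squared_distances(X):
    """Matrix of squared Euclidean distances between the rows of X."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def continuity_index(d2, Z, alpha=1.0, beta=2.0):
    """Index of (non-)continuity for configuration Z (n x p); requires alpha != beta."""
    D2 = squared_distances(Z)
    pairs = np.triu_indices(d2.shape[0], k=1)   # each unordered pair once
    d2p, D2p = d2[pairs], D2[pairs]
    gamma = alpha - beta
    with np.errstate(divide="ignore", invalid="ignore"):
        numerator = np.sum(d2p ** alpha / D2p ** beta)
        normalizer = np.sum(D2p ** gamma) ** (-beta / gamma)
        return float(numerator / normalizer)

def steepest_descent(d2, Z0, n_iter=30, alpha=1.0, beta=2.0, h=1e-6):
    """Simplified descent in K: numerical gradient, step halved until K decreases."""
    Z = np.array(Z0, dtype=float)
    K = continuity_index(d2, Z, alpha, beta)
    for _ in range(n_iter):
        grad = np.zeros_like(Z)
        for idx in np.ndindex(*Z.shape):
            Zh = Z.copy()
            Zh[idx] += h
            grad[idx] = (continuity_index(d2, Zh, alpha, beta) - K) / h
        step = 1.0
        while step > 1e-12:
            Z_new = Z - step * grad
            K_new = continuity_index(d2, Z_new, alpha, beta)
            if K_new < K:                       # accept only improvements
                Z, K = Z_new, K_new
                break
            step *= 0.5
    return Z, K

# Hypothetical one-gradient data: 20 sites, Gaussian species curves (sd = 2).
n = 20
sites = np.arange(n, dtype=float)
optima = np.arange(-10.0, n + 10.0, 2.0)
Y = 100.0 * np.exp(-((sites[:, None] - optima[None, :]) ** 2) / 8.0)
d2 = squared_distances(Y)

Z_true = sites[:, None]                                        # built-in order
Z_scrambled = ((7 * np.arange(n)) % n).astype(float)[:, None]  # fixed permutation
K_true = continuity_index(d2, Z_true)
K_scrambled = continuity_index(d2, Z_scrambled)

Z_start = Z_true + 0.3 * np.sin(np.arange(n))[:, None]         # perturbed start
K_start = continuity_index(d2, Z_start)
Z_fit, K_fit = steepest_descent(d2, Z_start)
print(K_true, K_scrambled, K_start, K_fit)
```

In this sketch the built-in site order gives a lower K than a scrambled order, and multiplying all z-values by a constant leaves K unchanged, which is the purpose of the normalization.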
Application to vegetation data
The program PARAMAP was kindly provided by Mrs. Jih Jie Chang of Dr. Carroll's laboratory. Experiments with the method were carried out on the following sets of data:

a. One-gradient model: Model IV from Swan (1970), 41 sites by 21 species on a single gradient with bell-shaped response curves of low overlap.
b. Two-gradient model: Model 1A from Austin & Noy-Meir (1971); 30 sites by 30 species on two orthogonal gradients; bell-shaped response surfaces of fairly low overlap.
c. Mt Wilhelm: Tropical montane forest (Wade, 1968); 33 sites by 3 (or 4) varimax components of presence data (treated as 'species'; original number of species 184); a prominent altitudinal gradient was evident from a curved configuration in component-space.
d. Wisconsin A: Upland hardwood forest (Curtis & McIntosh, 1951); average importance values of 4 leading dominants in 4 groups of stands in which these species were dominant (treated as 'sites'); a one-dimensional continuum was obtained by the authors using the leading dominants method.
e. Wisconsin B: Upland conifer-hardwood forest (Brown & Curtis, 1952); similar but 11 species in 9 dominance groups.
f. Semi-arid Australian, sites: Sample of semi-arid vegetation of Southeastern Australia (Noy-Meir, 1970, 1971, 1974); 95 sites by 223 species, presence data reduced to 12 varimax components; the results from ordination suggested a major 'heath-mallee-woodland-shrubland' catena, intricately curved in several dimensions.
g. Semi-arid Australian, noda: Same data, but the 4 (or 8) major 'noda', as defined by non-centered varimax components (Noy-Meir, 1971), were used as points rather than the individual sites; the distances between them were defined as the complement of the 'conjunction coefficient' (normalized form of cross-products of component scores $a_{ik}$ over sites):

$$d_{ij}^2 = 1 - c_{ij} \qquad (10)$$

$$c_{ij} = \frac{\sum_k a_{ik} a_{jk}}{\sqrt{\sum_k a_{ik}^2 \, \sum_k a_{jk}^2}} \qquad (11)$$

The conjunction coefficient (Noy-Meir, 1971) measures relative overlap or similarity between noda; the squared distance measure derived from it is equal to half the 'squared chord distance' or normalized Euclidean distance (Orloci, 1967).

It should be noted that in data sets c, f and g the 'variables' (y) for which a smooth response on catenae was fitted were not the original species-quantities. Rather, they were compound phytosociological variables (floristic noda or components) derived by component analysis from species-presence data.
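The distance measure of equations 10 and 11 can be checked with a short sketch. The function name and the small score matrix below are hypothetical; rows stand for noda and columns for sites. The sketch also verifies the stated identity with half the squared chord distance.

```python
import numpy as np

def conjunction_distance_sq(A):
    """Squared distances d_ij^2 = 1 - c_ij between the rows of A, where
    c_ij is the normalized cross-product (conjunction coefficient) of rows i and j."""
    A = np.asarray(A, dtype=float)
    norms = np.linalg.norm(A, axis=1)
    C = (A @ A.T) / np.outer(norms, norms)
    return 1.0 - C

# Hypothetical component scores: 3 noda (rows) over 3 sites (columns).
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0]])
d2 = conjunction_distance_sq(A)

# Half the squared chord distance: normalize rows to unit length first.
U = A / np.linalg.norm(A, axis=1, keepdims=True)
half_chord_sq = 0.5 * np.sum((U[:, None, :] - U[None, :, :]) ** 2, axis=-1)
print(d2)
```

Noda with no sites in common are at squared distance 1, identical noda at 0, and partially overlapping noda in between.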
Methodological tests
These sets of data provided some criteria for assessing the success of the method. In the one- and two-dimensional models, a catenation should ideally recover the built-in configuration of sites exactly in 1 and 2 dimensions respectively, and the allowance of further dimensions should not change this configuration. Therefore further dimensions should have much lower 'importance' ($\lambda$). In each of the sets of real vegetation data there was at least one major gradient approximately recognizable by other means; a successful catenation should recover this when given one dimension, and retain it when given further dimensions. A low ratio $\lambda_2/\lambda_1$ in the two-dimensional solution was considered as a measure for success in those cases where one major catena was expected; in the two-gradient model the ratio $\lambda_3/\lambda_2$ in the solution with p = 3 was used instead. An additional criterion applicable to any set of data is that the configuration of points in catenation-space should not show any marked residual non-linearity (e.g. semicircular or sigmoid).

The following methodological problems were examined:

a. Standardization: Normalization, i.e. division by the site (or 'group') norms before the analysis (Orloci, 1967) enhanced the success of the method for sets b-e (sets f-g were already normalized; for set a it was not tried). The results presented below for these sets are all from normalized data, unless otherwise stated (see also Austin & Noy-Meir, 1971; Gauch & Whittaker, 1972b; Groenewoud, 1973).

b. Effect of initial configuration: With the smaller data sets (d, e, g) the method converged to the same solution from any of a number of initial configurations tried, including random ones and ones in which the expected configuration was deliberately garbled as much as possible. With the larger sets (a, c) the same solution was obtained from initial configurations which were reasonably similar to the expected one and from some random configurations; but from some strongly garbled and some random configurations there was convergence to an unacceptable solution. In general, it seems advisable to use all prior information to obtain a good initial configuration and, if in doubt, try several such configurations.

c. Parameters $\alpha$, $\beta$: The standard values $\alpha = 1$, $\beta = 2$ were first used. The one-dimensional solution (two-dimensional for b) in most cases did recover the expected major catena fairly well, but when further dimensions were provided, the same catena tended to curve into them, an obviously nonlinear configuration resulted and the ratio $\lambda_2/\lambda_1$ (or $\lambda_3/\lambda_2$) was rather high. It was suspected that these 'spurious dimensions' occurred because the similarities between points near the ends of the catena were still given too much weight and tended to pull the ends together when another dimension was available. To overcome this, higher values of $\beta$ were tried, thus strengthening the weighting by adjacency. The values tried were $\beta$ = 3, 4, 8, 16, 32, mainly on data set g (with 4 and 8 noda) and partly also on b, d, e. For all these sets 'success' (as indicated by low $\lambda_2/\lambda_1$ and lack of residual non-linearity) increased up to $\beta = 8$ and then remained steady or fell with further increase in $\beta$. Several values of $\alpha$ for each value of $\beta$ were also tried; in general this had lesser effects but a value of $\alpha = \beta/2$ seemed to be best. The 'optimal' values $\alpha = 4$, $\beta = 8$ were thus chosen for standard use. However, these values could not be applied to the largest data sets (a, c, f) due to computational problems: exponent overflow occurred for reasons which could not be detected.

Other computational snags occurred occasionally, particularly with the larger data sets (a, b, c, f); the initial choice of other parameters (e, step size, etc.) may sometimes be critical to the success of the iteration. In most cases between 20 and 30 iterations were sufficient to obtain a stable solution. The final value of the 'continuity index' K was not found to be consistently related to the success of the method in different sets of data (see also Kruskal & Carroll, 1969), though it was useful for comparing solutions with different values of p for the same data.
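The 'importance' criterion used in these tests can be sketched as follows. The sketch assumes that the configuration is centered before rotation to principal components (an assumption of this example, implied by a principal components rotation but not stated explicitly), and the function name and the two test configurations are hypothetical. It contrasts a straight catena with a catena bent into a semicircle, the kind of 'spurious dimension' discussed above.

```python
import numpy as np

def rotate_to_principal_axes(Z):
    """Center a configuration Z (n x p), rotate it to its principal axes and
    return (rotated configuration, sums of squares per axis in decreasing order)."""
    Z = np.asarray(Z, dtype=float)
    Zc = Z - Z.mean(axis=0)
    _, s, Vt = np.linalg.svd(Zc, full_matrices=False)
    return Zc @ Vt.T, s ** 2

# A straight catena embedded obliquely in two dimensions.
t = np.arange(10, dtype=float)
line = np.column_stack([t, 2.0 * t])
_, lam_line = rotate_to_principal_axes(line)
ratio_line = lam_line[1] / lam_line[0]

# The same kind of sequence bent into a semicircle.
theta = np.linspace(0.0, np.pi, 21)
arc = np.column_stack([np.cos(theta), np.sin(theta)])
arc_rot, lam_arc = rotate_to_principal_axes(arc)
ratio_arc = lam_arc[1] / lam_arc[0]
print(ratio_line, ratio_arc)
```

For the straight line the second axis carries essentially no variation, whereas the semicircle gives a clearly non-negligible second axis even though only one underlying sequence is present; this is why a high ratio together with visible curvature is treated as a sign of residual non-linearity rather than of a second catena.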
Results

a. One-dimensional model
Given one dimension with $\alpha = 1$, $\beta = 2$, the original order and spacing of the 41 sites on the gradient was exactly recovered. But when another dimension was allowed, this line bent into a curve; $\lambda_2/\lambda_1$ was 0.62, so that by this criterion the inherent one-dimensionality of the data could not have been inferred, had it not been known a priori. This would probably have been improved by a larger $\beta$, if the technical difficulties could be overcome. In any case the method recovered the gradient far better than any principal components analysis.

b. Two-dimensional model
Given p = 2, $\alpha = 1$, $\beta = 2$ with site-normalized data, the built-in 6 x 5 grid of sites was recovered, though with some distortion of spacing near the corners. With $\beta = 8$ the distortion was virtually eliminated. When 3 dimensions were specified, the grid bent into a dome-shaped surface (see Austin & Noy-Meir, 1971). With $\alpha = 1$, $\beta = 2$, $\lambda_3/\lambda_2$ was 0.66, i.e. the 'spurious' dimension accounted for two-thirds as much of distances as did each of the 'true' dimensions (less than had been the case with PCA, but still rather high). This was reduced to one-third by raising $\beta$ to 8 ($\alpha = 4$), but could not be reduced further.

c. Mt Wilhelm
With p = 1, $\alpha = 1$, $\beta = 2$, the altitudinal coenocline which was discernible as a curve in component-space was accurately recovered. When a second dimension was allowed, there was some spreading of points into it ($\lambda_2/\lambda_1$ = 0.08 or 0.52). The two-dimensional configuration showed a slight indication of sigmoid nonlinearity, but upon careful inspection of the original data the second, minor catena seemed to make sense. Larger values of $\beta$ could not be used here.
d. Wisconsin A
The same order of the four communities obtained by Curtis & McIntosh (1951) was always converged onto by the present method on the first dimension, and was not altered by the provision of further dimensions. Moreover, the spacing of the four groups was almost exactly proportional to the spacing of the 'adaptation numbers' of the corresponding four dominants, the distance from Quercus rubra to Acer saccharum being twice the two other distances. In the second dimension there was a further separation between the two latter species, but its contribution was negligible ($\lambda_2/\lambda_1$ = 0.03). Continuity analysis thus confirmed the simple one-dimensionality of this 4 x 4 matrix, suggested by the original authors.

Fig. 2. Position of species on first catena (catena A, on the ordinate) of Wisconsin B data, versus original adaptation numbers on the abscissa (Brown & Curtis 1952). Pb = Pinus banksiana, Qe = Quercus ellipticus, Pt = Populus tremuloides, Pr = Pinus resinosa, Ps = Pinus strobus, Bp = Betula papyrifera, Qr = Quercus rubra, Tc = Tsuga canadiensis, As = Acer saccharum.

e. Wisconsin B
A major catena, roughly corresponding to the continuum of Brown & Curtis (1952), was obtained in all versions of the analysis with p = 1, and only slightly modified when more dimensions were available. The second catena, extracted with p = 2, contributed less but not negligibly to the configuration: in the version with $\alpha = 4$, $\beta = 8$, the ratio $\lambda_2/\lambda_1$ was 0.17. It was not accountable by nonlinearity in the first catena. The positions of some dominance-units of the first catena diverged significantly from the 'adaptation numbers' given to the corresponding species in the original study (Fig. 2). Tsuga canadiensis was assigned the extreme position at the upper end, rather than Acer saccharum (see also Buell et al. 1966). Transposing

Table 1
Average importance values of trees in stands with given species as leading dominant, Wisconsin B (from Brown & Curtis 1952). Species abbreviations: Bl = Betula lutea, Ar = Acer rubrum, others as in Fig. 2.
Dominant  As  Tc  Bl  Ar  Qr  Bp  Ps  Pr  Pt  Qe  Pb
As
140
25
21
7
22
6
1
Tc
40
152
47
11
3
5
4
3
1
-
-
-
-
Qr
27
1
3
29
138
23
10
8
5
3
Bp
48
8
7
27
16
108
19
1
29
1
-
Ps
12
6
2
24
12
12
150
39
9
5
-
I
Pr
3
-
Pt
11
-
Qe
-
-
-
5
7
I
11
9
9
103
56
Pb
-
-
-
3
3
3
13
12
14
36
213
-
12
15
14
56
156
24
10
29
34
14
19
I@0
-
4
-
2
-
95
the positions of rows and columns for these two species in the original table (Table 1) indeed would result in a smoother trend at least for Tsuga and would rectify a trend reversal for Quercus rubra. Otherwise the rank order of species is the same in both catenations, but there are differences in spacing. In particular, the present analysis places Quercus ellipticus much closer to Pinus banksiana than to Populus tremuloides; this again can be seen to accord better with the data matrix. The second catena derived from the continuity analysis (Fig. 3) contrasts Pinus strobus and P. resinosa with Quercus rubra, Betula papyrifera and Populus tremuloides (all five species being in the central part of the first catena). A closer relationship between Populus and Betula, which is not shared by the two Pinus species, is indicated in the table by an apparent bimodality in the curves of the two former species in the one-dimensional continuum. It is also reflected by the fact that the continuity analysis, given only one dimension, places Populus between the two pines and Betula. In this more complex data set the new analysis, while broadly agreeing with the original 'continuum analysis', considerably improved upon it by a more consistent utilization of the information in the matrix, and by allowing for a second dimension.

Fig. 3. Two-dimensional species catenation, Wisconsin B sample set: A = major catena, B = secondary catena.

f. Semi-arid, sites
This was the largest data set which the method was tried on (95 points). Solutions with p = 1 or with $\beta > 2$ could not be obtained due to computational difficulties. The only solution available was with p = 2, $\alpha = 1$, $\beta = 2$. The first of these dimensions roughly corresponded to the intuitive notion of the major catena in this vegetation. However, in the second dimension ($\lambda_2/\lambda_1$ = 0.62) this catena curved into a clearly non-linear (S-shaped) swarm of points, with some lateral spread away from this swarm probably representing a second 'true' catena.

g. Semi-arid, noda
With $\alpha = 1$, $\beta = 2$, the one-dimensional solution was intuitively acceptable as the major catena for both sets of noda used (R4 and R8). However, the main effect of providing a second dimension was that this catena bent into a semicircular or parabolic configuration of points ($\lambda_2/\lambda_1$ was 0.52 for R4, 0.59 for R8). This was rectified using $\alpha = 4$, $\beta = 8$: the four noda of R4 lay on a straight line even when a second dimension was available ($\lambda_2/\lambda_1$ = 0.0001), indicating that the relationships between them could be accounted for by a single coenocline. In the two-dimensional plot of R8, no residual non-linearity was evident, but the second dimension was not negligible ($\lambda_2/\lambda_1$ = 0.10); this suggests that a second, minor, catena superimposed on the first contributed to the differentiation amongst the eight noda. This was later confirmed by application of the method to a larger sample of the same vegetation (Noy-Meir, 1974).
Discussion
It seems that, despite its problems, continuity analysis (parametric mapping) can be a valuable addition to the array of methods available to quantitative phytosociologists, particularly since alternative methods for catenation are so far even more problematic. The method has been applied to several other sets of vegetation data apart from those mentioned here (Noy-Meir, 1970 and 1974), and in most cases it was useful in elucidating coenocline structure in the data. Some of the remaining problems concern the exact specification of the model: which choice of the parameters α and β minimizes residual non-linearity, and what is the meaning of this choice? Of the options tested here, α = 4, β = 8 has been found to be 'optimal' in this sense, but this might be modified by further empirical and theoretical work (see Kruskal & Carroll, 1969). In this study Euclidean distance measures (standardized by site) were used, but other metrics might prove to be preferable. Other problems are technical, in particular the need to improve the iteration procedure so that it is less likely to fail on some data under some conditions. The use of noda, as defined by unipolar components (or of groups of stands defined otherwise), rather than individual sites as the entities in catenation has obvious advantages. It involves an enormous saving in computer time and storage; only by this means can the more effective higher values of β be used at present. It presents a simple
picture directly related to the components of the ordination, which have already been interpreted; individual sites and species can then be placed in catenation-space by reference to this ordination.
The results so far indicate that the dimensions or catenae defined by continuity analysis are closely isomorphic with vegetational gradients or coenoclines as subjectively perceived by ecologists and as obtained by the 'continuum analysis' of Curtis & McIntosh. However, in contrast to the latter method, which assumes unimodal species response curves at least for dominants, catenation by continuity analysis does not impose this constraint. Response curves are only required to be continuous: smooth bimodal or U-shaped curves would be equally acceptable. Therefore continuity analysis can be used to test whether the distributions of species or communities on catenae are in fact unimodal (or 'bell-shaped'). In an attempt at such a test, the values of each of the original variables (species or phytosociological components) in sites were plotted against the site positions on the first catena, for the Mt Wilhelm, Wisconsin and semi-arid Australian data (and later several other data sets). The resulting response curves (e.g. Fig. 4) turned out to be either monotonically increasing or decreasing (for 'extreme' species) or unimodal. In most cases they closely resembled symmetrical 'bell-shaped' curves (or segments of such curves), of the type represented, for instance, by the Gaussian function (equation 1). This result may be interpreted thus: if we define coenoclines so as to have the smoothest possible species responses to them, these responses usually tend also to be unimodal and more or less bell-shaped. At least judging from the vegetation samples examined so far, this result predicts success for methods which attempt to define coenoclines with explicitly bell-shaped functions, in particular the recently developed method of Gaussian ordination (Gauch, Chase & Whittaker, 1973). The advantage of the latter method is that it directly yields the basic species distribution parameters (optimum, height, width).
Gaussian ordination is similar to continuity analysis in its general aim: the detection and definition of coenoclines or 'vegetation gradients' without making unrealistic assumptions about species-to-coenocline relationships. Gaussian ordination is more specific in that it assumes a particular form of relationship, albeit the one which is supported by more ecological evidence than any other. Continuity analysis is more general in the sense that it does not assume a specific form of relationship, only a 'smooth' one. Results like Fig. 4 suggest that the two methods may often give similar pictures, and that the 'bell-shaped' model of vegetation structure is indeed a robust one. There may of course be some cases where it does not hold; then continuity analysis might indicate what other relationship between species and coenoclines should be suspected.
Fig. 4. The distribution of communities along a coenocline, Mt Wilhelm data. Abscissa: position of stands on the first catena. Ordinate: values, for these stands, of 3 varimax components defining the major floristic noda and groups of species: cloud forest (•), lower subalpine forest (▲), upper subalpine forest (■).
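The test of response-curve shape described in the Discussion (plotting each variable against site position on the first catena and asking whether the curve is monotonic, unimodal or something else) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function `classify_response`, the tolerance `tol` used to ignore small fluctuations, and the example values are all hypothetical. The curves in this study were assessed from plots, not by this procedure.

```python
def classify_response(positions, values, tol=0.0):
    """Classify a response curve along a catena by its changes of direction.

    positions: site positions on the catena (any order).
    values:    value of one species or component at each site.
    tol:       differences no larger than this are treated as no change
               (hypothetical smoothing device for noisy data).
    """
    if len(positions) != len(values):
        raise ValueError("positions and values must have the same length")
    # Order the values by site position along the catena.
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    ordered = [values[i] for i in order]
    # Record each run of rising (+1) or falling (-1) values once.
    runs = []
    for previous, current in zip(ordered, ordered[1:]):
        diff = current - previous
        if abs(diff) <= tol:
            continue
        sign = 1 if diff > 0 else -1
        if not runs or runs[-1] != sign:
            runs.append(sign)
    if not runs:
        return "flat"
    if runs == [1]:
        return "increasing"
    if runs == [-1]:
        return "decreasing"
    if runs == [1, -1]:
        return "unimodal"
    return "other"  # e.g. U-shaped or bimodal


# Hypothetical sites listed out of order along the catena.
print(classify_response([3, 1, 2, 0, 4], [2, 3, 5, 1, 0]))
print(classify_response([0, 1, 2, 3, 4], [5, 2, 1, 2, 5]))
```

A classification of "unimodal" here only means a single rise followed by a single fall; it says nothing about symmetry, so resemblance to a bell-shaped curve would still need to be judged from the plot or from a fitted curve.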
Summary
Catenation is defined as the ordering of elements in continuous sequences that best accounts for local similarities without assuming linear relations. It is thus a non-linear relative of ordination. When applied to phytosociological data it is equivalent to the detection and definition of coenoclines. Several mathematical methods of catenation are available and potentially useful in phytosociology. One such method, continuity analysis (parametric mapping), has been tested on a variety of simulated and real vegetation data. Despite some computational problems, it usually succeeded in accurately recovering simulated coenoclines which were strongly curved in Euclidean vegetation space. In data from Wisconsin vegetation it defined a first catena which was similar to the 'continuum' defined by the leading dominants method, but in one case it indicated some modifications and the existence of a second dimension. When applied to other sets of real data, the method detected between one and three catenae or non-linear dimensions, which were usually closely related to environmental gradients. The relationships of species (or communities) to these catenae tended to be of a bell-shaped form, even though such a form is not explicitly assumed in the method.
Literature
Austin, M. P. & I. Noy-Meir 1971. The problem of non-linearity in ordination: experiments with two-gradient models. J. Ecol. 59: 763-773.
Ayyad, M. A. G. & R. L. Dix 1964. An analysis of a vegetation-microenvironmental complex on prairie slopes in Saskatchewan. Ecol. Monogr. 34: 421-442.
Bannister, P. 1968. An evaluation of some procedures used in simple ordinations. J. Ecol. 56: 27-34.
Beals, E. W. & J. B. Cope 1964. Vegetation and soils in Eastern Indiana woods. Ecology 45: 777-799.
Bray, J. R. & J. T. Curtis 1957. An ordination of the upland forest communities of southern Wisconsin. Ecol. Monogr. 27: 325-349.
Brown, R. T. & J. T. Curtis 1952. The upland conifer-hardwood forests of northern Wisconsin. Ecol. Monogr. 22: 217-234.
Buell, M. F., A. N. Langford, D. W. Davidson & L. F. Ohmann 1966. The upland forest continuum in northern New Jersey. Ecology 47: 416-432.
Curtis, J. T. & R. P. McIntosh 1951. An upland forest continuum in the prairie-forest border region of Wisconsin. Ecology 32: 476-496.
Flenley, J. R. 1969. The vegetation of the Wabag region, New Guinea highlands: a numerical study. J. Ecol. 57: 465-490.
Gauch, H. G. Jr. 1973a. The relationship between sample similarity and ecological distance. Ecology 54: 618-622.
Gauch, H. G. Jr. 1973b. A quantitative evaluation of the Bray-Curtis ordination. Ecology 54: 829-836.
Gauch, H. G. Jr. & G. B. Chase 1973. Fitting the Gaussian curve in ecological applications. (manuscript).
Gauch, H. G. Jr., G. B. Chase & R. H. Whittaker 1973. Ordination of vegetation samples by Gaussian species distributions. (manuscript).
Gauch, H. G. Jr. & R. H. Whittaker 1972a. Coenocline simulation. Ecology 53: 446-451.
Gauch, H. G. Jr. & R. H. Whittaker 1972b. Comparison of ordination techniques. Ecology 53: 868-875.
Gimingham, C. H., N. M. Pritchard & R. M. Cormack 1966. Interpretation of a vegetational mosaic on limestone in the island of Gotland. J. Ecol. 54: 481-502.
Goodall, D. W. 1954. Objective methods for the classification of vegetation. III. An essay in the use of factor analysis. Aust. J. Bot. 2: 304-324.
Greig-Smith, P. 1964. Quantitative plant ecology (2nd edition). Butterworths, London. XII + 256 pp.
Groenewoud, H. van 1965. An analysis and classification of white spruce communities in relation to certain habitat factors. Can. J. Bot. 43: 1025-1036.
Groenewoud, H. van 1973. Theoretical considerations on the quantitative covariation of plant species along ecological gradients with regard to the multivariate analysis of vegetation data. (manuscript).
Guttman, L. 1955. A generalized simplex for factor analysis. Psychometrika 20: 173-192.
Kruskal, J. B. 1964. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29: 1-27.
Kruskal, J. B. & J. D. Carroll 1969. Geometrical models and badness-of-fit functions. Multivariate analysis II, ed. P. R. Krishnaiah, 639-671. Academic.
Maarel, E. van der 1969. On the use of ordination models in phytosociology. (Germ. summ.) Vegetatio 19: 21-46.
Maarel, E. van der & J. Leertouwer 1967. Variation in vegetation and species diversity along a local environmental gradient. Acta Bot. Neerl. 16: 211-221.
McDonald, R. P. 1962. A general approach to nonlinear factor analysis. Psychometrika 27: 397-415.
McDonald, R. P. 1967. Numerical methods for polynomial models in nonlinear factor analysis. Psychometrika 32: 77-112.
McIntosh, R. P. 1973. Matrix and plexus techniques. (Germ. summ.) In: Ordination and Classification of Communities, ed. R. H. Whittaker. Handbook of Vegetation Science 5: 159-191. Junk, The Hague.
Moore, J. J., P. Fitzsimmons, E. Lambe & J. White 1970. A comparison and evaluation of some phytosociological techniques. Vegetatio 20: 150.
Norris, J. M. & J. P. Barkham 1970. A comparison of some Cotswold beechwoods using multiple-discriminant analysis. J. Ecol. 58: 603-620.
Noy-Meir, I. 1970. Component analysis of semi-arid vegetation in southeastern Australia. Ph.D. thesis, Australian National University, Canberra.
Noy-Meir, I. 1971. Multivariate analysis of the semi-arid vegetation in South-eastern Australia. I. Nodal ordination by component analysis. Proc. Ecol. Soc. Austr. 6: 159-193.
Noy-Meir, I. 1974. Multivariate analysis of the semi-arid vegetation in South-eastern Australia. II. Vegetation catenae and environmental gradients. Austr. J. Bot. 22: 115-140.
Noy-Meir, I. & M. P. Austin 1970. Principal component ordination and simulated vegetational data. Ecology 51: 551-552.
Orloci, L. 1966. Geometric models in ecology. I. The theory and application of some ordination models. J. Ecol. 54: 193-215.
Orloci, L. 1967. An agglomeration method for classification of plant communities. J. Ecol. 55: 193-206.
Orloci, L. 1973. Ordination by resemblance matrices. (Germ. summ.) In: Ordination and Classification of Communities, ed. R. H. Whittaker. Handbook of Vegetation Science 5: 251-286. Junk, The Hague.
Shepard, R. N. 1962. The analysis of proximities: multidimensional scaling with an unknown distance function. I and II. Psychometrika 27: 125-140 and 219-246.
Shepard, R. N. & J. D. Carroll 1966. Parametric representation of non-linear data structures. In: Multivariate analysis, ed. P. R. Krishnaiah, p. 561-592. Academic.
Sneath, P. H. A. 1966. A method for curve seeking from scattered points. Comput. J. 8: 383-391.
Swan, J. M. A. 1970. An examination of some ordination problems by use of simulated vegetation data. Ecology 51: 89-102.
Swan, J. M. A., R. L. Dix & C. F. Wehrhahn 1969. An ordination technique based on the best possible stand-defined axes and its application to vegetational analysis. Ecology 50: 206-212.
Wade, L. K. 1968. The alpine and subalpine vegetation of Mt. Wilhelm, New Guinea. Ph.D. thesis, Australian National University, Canberra.
Westhoff, V. & E. van der Maarel 1973. The Braun-Blanquet approach. In: Ordination and Classification of Communities, ed. R. H. Whittaker. Handbook of Vegetation Science 5: 617-726. Junk, The Hague.
Whittaker, R. H. 1956. Vegetation of the Great Smoky Mountains. Ecol. Monogr. 26: 1-80.
Whittaker, R. H. 1960. Vegetation of the Siskiyou Mountains, Oregon and California. Ecol. Monogr. 30: 279-338.
Whittaker, R. H. 1967. Gradient analysis of vegetation. Biol. Rev. 42: 207-264.
Whittaker, R. H. 1970. The population structure of vegetation. (Germ. summ.) In: Gesellschaftsmorphologie (Strukturforschung), ed. R. Tüxen, Ber. Symp. Int. Ver. Vegetationskunde, Rinteln 1966: 39-62.
Whittaker, R. H. & H. G. Gauch, Jr. 1973. Evaluation of ordination techniques. (Germ. summ.) In: Ordination and Classification of Communities, ed. R. H. Whittaker. Handbook of Vegetation Science 5: 289-322. Junk, The Hague.
Accepted 7 January 1974