An issue that arose during the validation was the high degree of correlation between some pairs of variables. This made it difficult to determine whether the correlated variables were being used interchangeably or whether both variables were useful despite their high correlation (i.e. whether the uncorrelated component of the variables contained additional information). To help resolve this issue, we calculated combination variables and used these in place of one of the correlated variables, particularly in the generalised additive model analyses. The combination variable (x') expressed the deviation of a variable (x) from its expected value given the value of the second variable (y), measured in standard deviations (s):
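One reading of this definition is x' = (x - E[x | y]) / s. The sketch below illustrates it under two assumptions that are not stated in the text: the expected value of x given y is taken from a simple least-squares line, and s is the standard deviation of the residuals about that line. The function name and the sample values are hypothetical.

```python
from statistics import mean, stdev


def combination_variable(x, y):
    """Deviation of x from its expected value given y, in standard deviations.

    Assumptions (not from the source): E[x | y] is a least-squares straight
    line fitted to the data, and s is the standard deviation of the residuals.
    """
    mx, my = mean(x), mean(y)
    sxy = sum((yi - my) * (xi - mx) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / syy
    intercept = mx - slope * my
    # Residual = observed x minus the value expected from y
    residuals = [xi - (intercept + slope * yi) for xi, yi in zip(x, y)]
    s = stdev(residuals)
    return [r / s for r in residuals]


# Made-up values for two strongly correlated variables (illustration only)
rad_mean = [12.0, 13.5, 14.0, 15.5, 16.0, 17.5]
rad_wint = [5.1, 5.9, 6.4, 6.8, 7.5, 7.9]
rad_comb = combination_variable(rad_wint, rad_mean)
```

By construction, a variable built this way is uncorrelated with y in a linear sense, so it carries only the part of x that y does not already explain.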
The following combination terms were created:
Results of the validation analyses are summarised in Table 5 and Figures 11 and 12. Results for the GAM and CART models show the variables ranked by the order in which they were fitted (GAMs) or by the overall marginal contribution of each environmental variable to the final model (CART). Results for the ANOVA and CCA analyses show the relative explanatory power of each variable individually.
Table 5: Comparison of the order of importance in chlorophyll models
| GAMs rank order | Trees rank order |
|---|---|
| SST winter | Rad_mean |
| Rad_mean&Rad_wint | Rad_mean&Rad_wint |
| Depth | Depth |
| SST gradient | SSTwint&Rad wint |
| SST annual amplitude | Sediment |
| Rad_mean | SSTanamp |
| Tidal | Tidal |
| Rad_wint | Freshwater* |
| Orb_v_mean | Orb vel comb* |
| Sediment# | Orb_v_mean# |
| Freshwater# | SSTgrad# |
| Orb_v_95# | SSTanom# |
| SSTanom# | |
Note: GAMs used only one combination term (Rad_mean&Rad_wint). Variables shaded blue (*) contributed less than 1% to the model (for tree models), and variables shaded red (#) were not selected.
Note: The ANOVA F values have been scaled so that the highest value is 0.5 so they could be graphed at the same scale as the other values. The variables are ordered by the average of their marginal contribution to results that were based on the tree analyses.
Note: The ANOVA F values have been averaged over the three datasets then scaled to maximum 0.5 so they could be plotted on the same scale as the other data. The variables are ordered by the average of their marginal contribution to results that were based on the tree and BVSTEP analyses.
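The rescaling described in these notes can be sketched as follows. This is a minimal illustration, assuming the F values are held per dataset as name-to-value mappings; the helper names and all numbers are invented for the example and are not results from the study.

```python
def average_over_datasets(per_dataset):
    """Average each variable's F value across the supplied datasets."""
    names = per_dataset[0].keys()
    return {n: sum(d[n] for d in per_dataset) / len(per_dataset) for n in names}


def scale_to_max(values, target_max=0.5):
    """Rescale so that the largest value equals target_max."""
    peak = max(values.values())
    return {name: target_max * v / peak for name, v in values.items()}


# Hypothetical F values for three datasets (illustration only)
f_by_dataset = [
    {"Depth": 40.0, "Tidal": 10.0, "Sediment": 5.0},
    {"Depth": 30.0, "Tidal": 20.0, "Sediment": 10.0},
    {"Depth": 50.0, "Tidal": 0.0, "Sediment": 15.0},
]
f_mean = average_over_datasets(f_by_dataset)
f_scaled = scale_to_max(f_mean)  # largest averaged value becomes 0.5
```

Scaling by the maximum preserves the relative sizes of the F values, so the ordering of the variables is unchanged.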
Depth was indicated as a very important variable by almost all analyses and with all datasets. Annual mean surface solar radiation was important to the chlorophyll and fish datasets and was of some importance to some benthic models. Annual amplitude of SST was important to the fish models and made some contribution to the chlorophyll and shelf models. The combination of mean annual solar radiation and winter solar radiation was important to the chlorophyll and fish models and had a medium contribution to the shelf models. The combination of wintertime SST and winter solar radiation was important to the chlorophyll and fish models and of medium importance in the benthic models. Tidal current had a small contribution to the chlorophyll, fish and shelf models and was correlated (CCA and ANOVA) with the ophiuroidea and asteroidea community datasets. The spatial gradient of annual mean SST was not important to chlorophyll, was of medium importance to the fish models, and made only a small contribution to the shelf and echinocardium models. Mean orbital velocity was not important for predicting chlorophyll, but was correlated with fish communities (CCA) and asteroidean communities (ANOVA), and made a significant contribution to models of the shelf data. Summertime SST anomaly was not important in the chlorophyll analyses, but made a small contribution to some individual fish and shelf species models. Sediment type had a medium contribution to the chlorophyll and fish models and a small contribution to the shelf models. The combination of mean and extreme orbital velocity made a small contribution to the chlorophyll models but showed little relationship with any other dataset. The seabed shape variables (profile, curvature, plan) made small contributions to the tree models of fish species and fish groups, and to both the community and species analyses (ANOVA, CCA) for the ophiuroidea, asteroidea and benthic shelf datasets. Freshwater fraction had only a weak correlation with the benthic shelf survey dataset.

Based on the results of the validation analyses, Weatherhead and Snelder (2003) ranked the 15 candidate environmental variables by their average contribution across all analyses and biological datasets. The relative contribution of the environmental variables had the following order:
Ranking of the variables was used to summarise the results of the validation study performed for the Hauraki Gulf (Hewitt & Snelder 2003). The relative importance of each variable was ranked (on a scale of 0 to 1) within each analysis and then averaged for the plankton, benthos and fish datasets (see Table 6). An overall ranking was made by summing across the biological datasets (see Table 6). The most apparent conclusion was that the variables selected differed between the datasets. Unsurprisingly, variables representing water column processes and sea-surface temperature were better correlates for the plankton, whereas sea-bed variables such as topography, sediment rank/type, bed velocities and currents were better correlates with the benthos. Fish represented a middle point between these two.
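The ranking procedure can be sketched as below. The text does not say how ranks were mapped onto the 0 to 1 scale, so this sketch assumes a linear mapping (most important variable scores 1, least important scores 0, unselected variables score 0). The variable names, orderings and helper functions are hypothetical.

```python
def rank_scores(ordered_vars):
    """Map a list ordered from most to least important onto scores in [0, 1].

    Assumption: linear spacing between 1 (first) and 0 (last).
    """
    n = len(ordered_vars)
    if n == 1:
        return {ordered_vars[0]: 1.0}
    return {v: (n - 1 - i) / (n - 1) for i, v in enumerate(ordered_vars)}


def dataset_average(analyses, variables):
    """Average the 0-1 scores of each variable over one dataset's analyses."""
    scored = [rank_scores(a) for a in analyses]
    return {v: sum(s.get(v, 0.0) for s in scored) / len(scored) for v in variables}


variables = ["Depth", "SST", "Sediment"]

# Invented orderings per analysis, grouped by biological dataset
analyses_by_dataset = {
    "plankton": [["SST", "Depth", "Sediment"], ["SST", "Sediment", "Depth"]],
    "benthos": [["Sediment", "Depth", "SST"], ["Depth", "Sediment", "SST"]],
    "fish": [["Depth", "SST", "Sediment"]],
}

averages = {
    name: dataset_average(analyses, variables)
    for name, analyses in analyses_by_dataset.items()
}
# Overall score: sum of the per-dataset averages
overall = {v: sum(avg[v] for avg in averages.values()) for v in variables}
ranking = sorted(variables, key=lambda v: overall[v], reverse=True)
```

Averaging within each dataset before summing gives the plankton, benthos and fish groups equal weight in the overall ranking, regardless of how many analyses were run on each.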
Table 6: Summary of the relative importance of variables derived from models of the plankton, benthos and fish, in decreasing order from the most to the least important