Ecological measurements are inherently multidimensional. Communities of organisms are complex because of the multitude of measurable processes at work, which may or may not interact. Ecologists seldom grasp more than four of these dimensions in a single problem, though the conventions of mathematics allow us to work with p >> 4 of them at a time. Perhaps the most challenging initial decision in modeling community complexity (in order to claim ecological knowledge) is which side of the equals sign has our focus. So the question is: are we of a right-wing or a leftist persuasion? Leftists focus on prediction: does the model work with accuracy and precision? That is the long and short of it. Right-wingers must have an explanation for why the left side is what it is. We present here the results of four methods for modeling organismal community complexity using the dimensionality reduction techniques of multivariate statistics: principal components regression (the other PCR), multivariate multiple regression (MMLR), canonical correlation analysis (CCA), and partial least squares regression (PLSR). We analyzed a single dataset comprising fish species abundances from 37 stream reaches in the Uinta Mountains in Wyoming.
Results/Conclusions
Four response measures (species diversity and richness, and functional diversity and richness) collected per reach showed strong positive pairwise correlations in all six pairings. In the PCR method, the first five principal components (PCs) of the p = 18 terrestrial and aquatic stream habitat regressor measures were used to model each response individually. But this leftist predictive power came with an equally complex interpretation of the factors described by each PC, for each response. MMLR was an improvement if right-wing explanation was the modeling goal, although economical model selection required some objective criteria, which remain subject to modeler preference. CCA provided both leftist and right-wing views by applying dimensionality reduction on both sides of the equals sign, but its aim was maximizing the correlation between canonical variates, not modeling. In the end, PLSR provided the most meaningful left- and right-of-center modeling results, expressing community complexity responses as a function of the latent regressor variables while accounting for the error structure in those variables. We regard PLSR as a valuable tool for summarizing typically complex multivariate relationships within ecosystems, one that will have a ready application in species conservation for distinct evolutionary regions of our planetary ecography.
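To make the contrast between PCR and PLSR concrete, the following is a minimal NumPy sketch, not the analysis code used for this study. It assumes synthetic data with the same shape as the dataset described above (37 reaches, p = 18 regressors, 4 responses); the function names `pcr_fit` and `pls_fit`, the simulated values, and the choice of five components for PLSR are all illustrative assumptions. PCR chooses its components from the regressors alone, whereas this PLSR variant (a standard deflation scheme) chooses each component using the covariance between regressors and responses.

```python
import numpy as np

def pcr_fit(X, Y, k):
    """Principal components regression: regress Y on the first k PC scores of X."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc = X - x_mean
    # PCs of X come from the SVD of the centered regressor matrix only.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                      # p x k loadings
    T = Xc @ V                        # n x k scores
    gamma, *_ = np.linalg.lstsq(T, Y - y_mean, rcond=None)
    B = V @ gamma                     # coefficients on the original regressors
    return B, y_mean - x_mean @ B

def pls_fit(X, Y, k):
    """Partial least squares regression with k latent components (deflation form)."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xk, Yk = X - x_mean, Y - y_mean
    p, q = X.shape[1], Y.shape[1]
    W, P, Q = np.zeros((p, k)), np.zeros((p, k)), np.zeros((q, k))
    for a in range(k):
        # Weight vector: dominant left singular vector of the cross-product X'Y.
        u, _, _ = np.linalg.svd(Xk.T @ Yk, full_matrices=False)
        w = u[:, 0]
        t = Xk @ w                    # latent score for this component
        tt = t @ t
        p_a = Xk.T @ t / tt           # X loadings
        q_a = Yk.T @ t / tt           # Y loadings
        Xk = Xk - np.outer(t, p_a)    # deflate both blocks
        Yk = Yk - np.outer(t, q_a)
        W[:, a], P[:, a], Q[:, a] = w, p_a, q_a
    B = W @ np.linalg.solve(P.T @ W, Q.T)
    return B, y_mean - x_mean @ B

def r_squared(Y, Y_hat):
    """In-sample R^2 per response column."""
    ss_res = ((Y - Y_hat) ** 2).sum(axis=0)
    ss_tot = ((Y - Y.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (NOT the stream-reach data): 37 sites, 18 regressors, 4 responses.
rng = np.random.default_rng(0)
n, p, q = 37, 18, 4
X = rng.normal(size=(n, p))
Y = X[:, :3] @ rng.normal(size=(3, q)) + 0.5 * rng.normal(size=(n, q))

B_pcr, b0_pcr = pcr_fit(X, Y, k=5)
B_pls, b0_pls = pls_fit(X, Y, k=5)
print("PCR  R^2 per response:", np.round(r_squared(Y, X @ B_pcr + b0_pcr), 3))
print("PLSR R^2 per response:", np.round(r_squared(Y, X @ B_pls + b0_pls), 3))
```

In practice the regressors would usually be standardized first, since habitat measures sit on different scales, and the number of components would be chosen by cross-validation rather than fixed in advance; both steps are omitted here for brevity.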