Using species distribution models to infer the importance of factors limiting species’ ranges
Species distribution models are commonly used to infer the importance of range-limiting environmental factors. In contrast to the large number of studies evaluating how well models predict ranges, very few, if any, have evaluated the ability of models to correctly rank factors by their relative importance in limiting ranges. Here we investigate the ability of commonly employed tests of variable importance to identify influential factors. We simulated species with a variety of functional response forms: univariate and bivariate versions of linear vs. step vs. Gaussian functions with or without covariance. These were contrasted with different configurations of the variables on the landscape: random vs. linear vs. non-linear, each of which was correlated or uncorrelated with other variables and either influenced the species’ range or was irrelevant yet included as a predictor in the model. We used three metrics to measure variable importance: 1) change in model performance (PERFORM); 2) correlation between predictions from a full model and from a univariate model with the predictor of interest (UNIVAR); and 3) correlation between predictions from a full model and a full model in which the variable(s) of interest have been permuted (PERMUTE, used in the Maxent and BIOMOD software).
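As an illustration of the two correlation-based metrics described above, the following is a minimal sketch rather than the procedure used in this study. It assumes a simulated species that responds to one variable with a Gaussian function while a second, uncorrelated variable is irrelevant, and it substitutes a quadratic least-squares model for the GAM and Maxent models. All function names (`design`, `fit`, `permute_correlation`, `univar_correlation`) and parameter values are hypothetical.

```python
import numpy as np

def design(X):
    """Quadratic design matrix: intercept, each variable, and its square."""
    return np.column_stack([np.ones(len(X)), X, X ** 2])

def fit(X, y):
    """Ordinary least-squares fit (a stand-in model); returns a predict function."""
    beta, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return lambda X_new: design(X_new) @ beta

def permute_correlation(predict, X, j, rng, n_perm=20):
    """PERMUTE-style metric: mean correlation between full-model predictions
    and predictions after shuffling column j. Lower means more important."""
    full = predict(X)
    cors = []
    for _ in range(n_perm):
        X_perm = X.copy()
        X_perm[:, j] = rng.permutation(X_perm[:, j])
        cors.append(np.corrcoef(full, predict(X_perm))[0, 1])
    return float(np.mean(cors))

def univar_correlation(predict_full, X, y, j):
    """UNIVAR-style metric: correlation between full-model predictions and
    predictions from a model fit to column j alone. Higher means more important."""
    predict_uni = fit(X[:, [j]], y)
    return float(np.corrcoef(predict_full(X), predict_uni(X[:, [j]]))[0, 1])

# Simulated landscape (assumed values): two uncorrelated variables.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 2))
# Gaussian response to variable 0 only; variable 1 is irrelevant.
y = np.exp(-X[:, 0] ** 2 / (2 * 0.3 ** 2)) + rng.normal(0, 0.05, 2000)

predict = fit(X, y)
perm = [permute_correlation(predict, X, j, rng) for j in range(2)]
univ = [univar_correlation(predict, X, y, j) for j in range(2)]
print("PERMUTE correlations:", perm)
print("UNIVAR correlations:", univ)
```

In this two-variable setting the relevant variable should show the lower PERMUTE correlation and the higher UNIVAR correlation. The sketch does not reproduce the many-variable or covarying-response cases examined in the study.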
The reliability of the three test metrics depended mostly on 1) the number of variables included in the model, regardless of their relevance; 2) the covariance between variables in the species’ response to the environment; and, to a lesser degree, 3) the correlation between variables on the landscape. The PERFORM and PERMUTE tests were unable to correctly identify important variables when the total number of relevant and irrelevant variables was more than just a few (≥3-5). The PERMUTE test had lower statistical power when variables were uncorrelated on the landscape. The UNIVAR test was usually a reliable measure of variable importance regardless of the number of relevant and irrelevant variables in the model. However, it incorrectly chose irrelevant variables when covariance in the species’ functional response to the environment was strongly positive. When only irrelevant variables were included in a model, all three tests assigned greater importance to those that were correlated with the relevant one(s). Model-specific tests (AICc for GAMs and change in regularized gain for Maxent) were less sensitive to variable importance than the three model-free tests. Our results provide guidance on which tests to use under different circumstances and highlight the importance of attending to overparameterization.