PS 87-174
Comparing species-specific tuning versus AICc to select optimally complex ecological niche models

Friday, August 9, 2013
Exhibit Hall B, Minneapolis Convention Center
Peter J. Galante, Biology, The City College of New York- CUNY, New York City, NY
Robert Boria, Biology, City College (CUNY), NY, NY
Robert P. Anderson, Biology, City College of New York, City University of New York, New York, NY
Background/Question/Methods

We compare two strategies for identifying optimal model complexity for ecological niche models (ENMs): species-specific tuning versus an information criterion (AICc). ENMs are widely used, yet selecting optimal model complexity remains an outstanding issue. One strategy uses omission rates and AUC (area under the ROC curve), calculated on withheld test data. Another uses AICc, which favors models that fit the training data well without being overly complex. Here, we compare the strategies using occurrence records that were spatially filtered to reduce the effects of sampling bias. We do so for a species with few records, the Malagasy tenrec Oryzorictes hova, using 19 bioclimatic layers and MaxEnt. We vary model complexity, employing different combinations of feature classes and regularization-multiplier values. First, for species-specific tuning, we implement a jackknife approach on occurrence data, calculating the average AUCs and omission rates of the withheld (test) records for each feature-class/regularization-multiplier combination. Second, for the same combinations, we calculate AICc values with ENMTools, using all occurrence records for model training.
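
For readers unfamiliar with the second strategy, the following is a minimal sketch of the standard small-sample AICc formula, AICc = 2k - 2 ln(L) + 2k(k + 1)/(n - k - 1). It is not the authors' code or the ENMTools implementation. It assumes that k is the number of model parameters (for MaxEnt, commonly the number of features with nonzero weights), that n is the number of occurrence records, and that the log-likelihood is the sum of the logs of raw suitability values at the occurrence localities after those values are standardized to sum to 1 across the study region. The function names and arguments are hypothetical and chosen for illustration only.

```python
import math


def log_likelihood_from_raw(raw_at_occurrences, raw_sum):
    """Log-likelihood of the occurrence records (illustrative assumption).

    raw_at_occurrences: raw model output at each occurrence locality.
    raw_sum: sum of raw output over every cell of the study region, used
             to standardize the surface so that it sums to 1.
    """
    return sum(math.log(value / raw_sum) for value in raw_at_occurrences)


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC: 2k - 2 ln(L) + 2k(k + 1) / (n - k - 1).

    Returns NaN when n - k - 1 <= 0, where the correction term is undefined
    (more parameters than the sample size can support).
    """
    if n - k - 1 <= 0:
        return float("nan")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)
```

Under these assumptions, each feature-class/regularization-multiplier combination would receive one AICc score, and the combination with the lowest score would be treated as optimal. With few occurrence records, the NaN case matters: highly parameterized models can exceed the limit n - k - 1 > 0 and cannot be scored at all.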

Results/Conclusions

In this system, the two strategies led to the selection of similar combinations of feature class and regularization multiplier. Species-specific tuning indicated the optimal settings to be Linear + Quadratic with a regularization multiplier of 2.5. Similarly, AICc identified Linear + Quadratic with a regularization multiplier of 2.0 as optimal. Additional research is needed to determine the generality of the current conclusions for similar datasets, as well as for those with larger sample sizes and/or affected by stronger sampling bias.