Predicting habitat suitability for invasive species in the Great Lakes: Combining species distribution models and high resolution aquatic variables
Given the large ecological and economic impacts of invasive species, forecasting the risk of invasion by specific species is valuable. A key component of assessing invasion risk is determining whether a species can persist if introduced to a novel location. Species distribution models offer an appealing tool to estimate habitat suitability, but invasion violates the model assumption of stationarity in species occurrence. Aquatic species pose an additional challenge because fitting these models depends on environmental covariates with wide coverage, and the readily available covariates do not include variables measured in aquatic habitats. To combine the available information most effectively when forecasting habitat suitability for invasive species, we first assessed the performance of several species distribution modeling techniques on a suite of five aquatic invasive species. The output of the distribution models was then constrained by new high-resolution environmental data from within the Great Lakes to provide a more focused forecast of suitable habitat for five known invaders: golden mussel, killer shrimp, grass carp, hydrilla and northern snakehead.
We compared widely used distribution modeling techniques, including MaxEnt, boosted regression trees, and random forests, with newer approaches, including range bagging and lobag, and found a tradeoff between accuracy and overfitting. The most accurate models, as measured by AUC, also tended to overfit. Range bagging performed well given this tradeoff and has the additional advantage of attempting to fit the species range, rather than separating presence points from background points. Suitability maps for the Great Lakes that combined climatic habitat suitability with within-lake data on benthic temperature, light penetration and aquatic vegetation identified, at high resolution, the areas most suitable for each of the five known invaders. These maps offer higher utility to managers than global-scale niche models. We argue that the most effective way to forecast areas at risk is to combine distribution models that have the relevant performance characteristics with any additional information on aquatic conditions and species' environmental limits. This approach also identifies missing information and highlights the importance of collecting aquatic environmental data at high spatial resolutions.
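To illustrate the idea of fitting a species range from presence points alone, and then constraining the result with within-lake conditions, the following is a minimal Python sketch. It is not the implementation used in this study. It assumes a simplified variant of range bagging in which each vote is an axis-aligned range on a random subset of covariates, rather than a convex hull. The function names, parameters, covariates and limit values are all hypothetical and chosen only for illustration.

```python
import random


def fit_range_bag(presences, n_votes=100, dims_per_vote=2, sample_frac=0.5, seed=0):
    """Fit a bag of covariate ranges from presence points only.

    presences: list of equal-length tuples of covariate values.
    Simplification (assumption): each vote stores an axis-aligned range
    on a random subset of covariates instead of a convex hull.
    """
    rng = random.Random(seed)
    n_dims = len(presences[0])
    k = max(1, min(dims_per_vote, n_dims))
    m = max(1, min(len(presences), round(sample_frac * len(presences))))
    bag = []
    for _ in range(n_votes):
        dims = rng.sample(range(n_dims), k)   # random covariate subset
        sample = rng.sample(presences, m)     # random subset of presences
        bounds = {
            d: (min(p[d] for p in sample), max(p[d] for p in sample))
            for d in dims
        }
        bag.append(bounds)
    return bag


def suitability(bag, point):
    """Fraction of votes whose ranges contain the point (0.0 to 1.0)."""
    votes = sum(
        all(lo <= point[d] <= hi for d, (lo, hi) in bounds.items())
        for bounds in bag
    )
    return votes / len(bag)


def constrain_by_limits(score, conditions, limits):
    """Set suitability to zero where any local condition is outside the
    species' environmental limits (hypothetical hard-threshold rule)."""
    for name, (lo, hi) in limits.items():
        if not lo <= conditions[name] <= hi:
            return 0.0
    return score


# Made-up presence data: (mean temperature, annual precipitation).
presences = [(t, p) for t in range(10, 21, 2) for p in range(500, 1001, 100)]
bag = fit_range_bag(presences)

inside = suitability(bag, (15, 750))   # within the sampled ranges
outside = suitability(bag, (40, 50))   # far outside every range

# Made-up within-lake limit on benthic temperature (degrees C).
limits = {"benthic_temp": (4.0, 30.0)}
kept = constrain_by_limits(inside, {"benthic_temp": 12.0}, limits)
masked = constrain_by_limits(inside, {"benthic_temp": 1.0}, limits)
```

In this sketch the climatic score is retained only where the local aquatic conditions fall within the stated limits. A real analysis would apply the same logic cell by cell across gridded climate and within-lake layers.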