OOS 41-3 - Trait-based risk assessment for invasive species: High performance across diverse taxonomic groups, geographic ranges, and machine learning/statistical tools

Thursday, August 11, 2011: 2:10 PM
17B, Austin Convention Center
Reuben P. Keller (1), Dragi Kocev (2) and Saso Džeroski (2); (1) Institute of Environmental Sustainability, Loyola University Chicago, Chicago, IL; (2) Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia
Background/Question/Methods

Trait-based risk assessment for invasive species is becoming an important tool for identifying nonindigenous species that are likely to cause harm. Despite this, concerns remain that the invasion process is too complex for accurate predictions to be made. Our goal was to test risk assessment performance across a range of taxonomic and geographic scales, at different points in the invasion process, and with a range of statistical and machine learning algorithms. We selected six datasets differing in size, geography and taxonomic scope. For each dataset, we created seven risk assessment tools using a range of statistical and machine learning algorithms. We then compared the performance of these tools to determine the effects of dataset size, dataset scale and algorithm choice, and to gauge the overall performance of the trait-based risk assessment approach.
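
As an illustration of the kind of comparison described above, the following is a minimal sketch rather than the authors' actual procedure. It fits one of the algorithm families named in this abstract (logistic regression) to two synthetic datasets of different sizes and scores each with k-fold cross-validation. The abstract does not state which performance metric or validation scheme was used, so the area under the ROC curve (AUC) and 5-fold cross-validation are assumptions, as are the trait names, the dataset sizes and the data itself.

```python
import math
import random


def make_dataset(n, rng):
    """Synthetic species records: two hypothetical traits plus an invasive (1) / not (0) label."""
    rows = []
    for _ in range(n):
        fecundity = rng.gauss(0.0, 1.0)    # hypothetical standardized trait
        range_size = rng.gauss(0.0, 1.0)   # hypothetical standardized trait
        p = 1.0 / (1.0 + math.exp(-(1.5 * fecundity + 1.0 * range_size)))
        rows.append(([fecundity, range_size], 1 if rng.random() < p else 0))
    return rows


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    """Predicted invasion probability; the last weight is the intercept."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1])


def fit_logistic(rows, epochs=200, lr=0.1):
    """Fit logistic regression by batch gradient descent on the log loss."""
    w = [0.0] * (len(rows[0][0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in rows:
            err = predict(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(rows) for wj, gj in zip(w, grad)]
    return w


def auc(scores, labels):
    """Probability that a random invader is scored above a random non-invader."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def cross_validated_auc(rows, k=5, seed=0):
    """Mean AUC over k held-out folds."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    fold_aucs = []
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        w = fit_logistic(train)
        fold_aucs.append(auc([predict(w, x) for x, _ in test], [y for _, y in test]))
    return sum(fold_aucs) / k


rng = random.Random(42)
datasets = {"small": make_dataset(100, rng), "large": make_dataset(500, rng)}
results = {name: cross_validated_auc(rows) for name, rows in datasets.items()}
for name, score in results.items():
    print(f"{name}: mean cross-validated AUC = {score:.2f}")
```

Scoring on held-out folds matters here because a tool evaluated on the same species it was trained on would overstate how well it predicts new invaders. Extending the sketch to further algorithms would mean swapping `fit_logistic` and `predict` for another learner while keeping `cross_validated_auc` unchanged.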

Results/Conclusions

Risk assessment tools with good performance were generated for all datasets. Random forests and logistic regression consistently produced tools with high performance, whereas the performance of other algorithms varied. Despite their greater power and flexibility, machine learning algorithms did not systematically outperform statistical algorithms. Neither the geographic scope nor the size of the dataset systematically affected risk assessment performance. Across six representative datasets we were able to create risk assessment tools with high performance. Additional datasets could be generated for other taxonomic groups and regions, and these could support efforts to prevent the arrival of new invaders. Random forests and logistic regression performed well for all datasets and could be used as a standard approach to risk assessment development.
