Widespread loss of habitat has led to many efforts to reserve and protect habitat to conserve species at risk. The many socioeconomic and spatial constraints involved make protecting multiple species simultaneously, effectively, and efficiently an extremely difficult problem. Consequently, numerous mathematical approaches to the spatial allocation of reserves have been developed under the umbrella of "systematic conservation planning".
These methods perform well when input data are correct and socio-political realities are left out of the problem, but there is currently no reliable way to estimate bounds on their performance when inputs are uncertain and the methods are embedded in real-world processes. While numerous case studies detail instances where uncertainty had large effects on these methods, conservation managers have no way to tell how well a given method will perform in their own situation. This is particularly true when the recommended set of reserves is taken into the real world of political processes, time delays, and questions about species persistence within the selected set.
We use machine learning and synthetic datasets to ask whether there are conditions under which we can put useful bounds on the performance of reserve selection methods in complex settings resembling the real world. To answer this question, we build a multi-step simulator that models multiple distributions of species over multiple kinds of landscapes and sampling regimes. We add known forms and amounts of uncertainty to the data and apply reserve selection methods. We then simulate follow-on processes such as interactions with urban development. Finally, we apply machine learning methods to try to learn the spatial distribution and amount of error in elements of the reserve selection process and in the resulting species outcomes.
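The first steps of such a simulator can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the simulator described here: species are placed independently at random, uncertainty is limited to false-negative and false-positive survey errors, and a simple greedy richness heuristic stands in for whatever reserve selection method is being evaluated. All function names and parameters (`make_landscape`, `add_uncertainty`, `greedy_reserves`, `run_once`, the budget of 5 units) are hypothetical.

```python
import random


def make_landscape(n_units, n_species, prevalence, rng):
    """True presence data: one set of species ids per planning unit (assumed model)."""
    return [{s for s in range(n_species) if rng.random() < prevalence}
            for _ in range(n_units)]


def add_uncertainty(truth, n_species, false_neg, false_pos, rng):
    """Apply known omission (false_neg) and commission (false_pos) error rates."""
    observed = []
    for present in truth:
        seen = set()
        for s in range(n_species):
            if s in present:
                if rng.random() >= false_neg:   # species detected
                    seen.add(s)
            elif rng.random() < false_pos:      # species wrongly recorded
                seen.add(s)
        observed.append(seen)
    return observed


def greedy_reserves(data, budget):
    """Stand-in selector: repeatedly add the unit contributing the most new species."""
    chosen, covered = [], set()
    for _ in range(budget):
        best = max((u for u in range(len(data)) if u not in chosen),
                   key=lambda u: len(data[u] - covered), default=None)
        if best is None or not data[best] - covered:
            break  # nothing left that adds a new species
        chosen.append(best)
        covered |= data[best]
    return chosen


def species_represented(truth, chosen):
    """Number of species that are truly present in the chosen units."""
    return len(set().union(*(truth[u] for u in chosen))) if chosen else 0


def run_once(seed, false_neg, false_pos, n_units=50, n_species=20, budget=5):
    """One simulated problem: plan on noisy data, score against the truth."""
    rng = random.Random(seed)
    truth = make_landscape(n_units, n_species, 0.1, rng)
    observed = add_uncertainty(truth, n_species, false_neg, false_pos, rng)
    plan = greedy_reserves(observed, budget)
    reference = greedy_reserves(truth, budget)
    shortfall = (species_represented(truth, reference)
                 - species_represented(truth, plan))
    return {"plan": plan, "shortfall": shortfall}


result = run_once(seed=0, false_neg=0.3, false_pos=0.05)
print(result)
```

Because the "correct" data are known, the plan built from noisy data can be scored against the truth. Note that the shortfall is measured relative to a greedy reference plan, which is not guaranteed to be optimal, so it can occasionally be negative.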
Results/Conclusions
Synthetic data allows us to generate "correct" values and then add uncertainty to those values to train the learning algorithms. It also allows us to generate substantial amounts of training data and to generate independent test sets to measure the performance of the full sequence of learning algorithms and optimizers. Using machine learning then allows us to learn a model of process error to better bound the likely performance of reserve selection under uncertainty and to conduct a more efficient sensitivity analysis of the high-dimensional space describing these problems.
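The train-and-test idea can be illustrated with a deliberately small example. This sketch makes several assumptions that are not from the work itself: a toy function `simulate_outcome_error` stands in for the full simulator, the only input feature is the false-negative rate, and ordinary least squares on one variable stands in for the learning algorithms. The point is only the workflow: generate training runs with known uncertainty, fit an error model, and measure it on an independently generated test set.

```python
import random


def simulate_outcome_error(false_neg, rng, n_species=200, prevalence=0.3):
    """Toy stand-in simulator: fraction of truly present species that a
    survey with omission rate `false_neg` fails to record."""
    present = [s for s in range(n_species) if rng.random() < prevalence]
    missed = sum(1 for _ in present if rng.random() < false_neg)
    return missed / len(present) if present else 0.0


def make_dataset(n_runs, seed):
    """Generate (uncertainty level, resulting error) pairs from synthetic runs."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 0.5) for _ in range(n_runs)]
    ys = [simulate_outcome_error(x, rng) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(model, xs, ys):
    """Average absolute prediction error of the fitted line."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)


# Different seeds give an independent test set, as synthetic data allows.
train_x, train_y = make_dataset(300, seed=1)
test_x, test_y = make_dataset(100, seed=2)

model = fit_line(train_x, train_y)
test_mae = mean_abs_error(model, test_x, test_y)
print("slope, intercept:", model)
print("held-out mean absolute error:", test_mae)
```

The held-out error gives a rough empirical bound on how well the learned model predicts outcome error for problems it has not seen. A realistic version would use many more features describing the landscape, species distributions, and error types.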