Estimates of the diversity and structure of assemblages that are reliable across political boundaries and large spatial extents are critical for community ecology, biogeography and conservation. These estimates must also be made at high spatial resolution, both to inform the study of communities and so that local agencies can implement species management plans in an informed and efficient way. Yet today’s estimates of species diversity for many taxonomic groups are often coarse in resolution and based on species range products that were not developed using best-available modeling practices. Here, we illustrate a much improved, reproducible and flexible range modeling workflow that covers data ingestion, cleaning, integration of different data types, model fitting, model selection, model evaluation, and visualization. Its critical innovations are: scalability, so that it can be applied in an automated fashion to large species groups (e.g. tens of thousands of species); the choice of optimal modeling algorithms based on the available data types; the integration of qualitatively different data types; the use of empirical evidence across large species groups to make otherwise subjective modeling decisions; and the reduction of bias and variance in range estimates for poorly sampled species by borrowing strength across species.
We demonstrate improved resolution of diversity estimates for 10,000 South African fynbos plants and 600 New World palms. As part of this robust workflow, we demonstrate novel approaches to model fitting, cross-validation, handling of sampling bias, performance assessment, and model visualization. We also explore the particular opportunities offered by new, global-scale, high-resolution remote sensing data. We illustrate the implementation and use of the products and workflows in Map of Life and outline our vision for developing and improving integrative distribution modeling products at scale, in close support of key questions in ecology, biogeography and conservation.