SYMP 4-1 - Exploratory analysis and inference with broad-scale citizen science data

Tuesday, August 7, 2012: 8:05 AM
Portland Blrm 251, Oregon Convention Center
Daniel Fink (1), Wesley M. Hochachka (1), Theodoros Damoulas (2), Jaimin Dave (2) and Steve Kelling (3); (1) Lab of Ornithology, Cornell University, Ithaca, NY, (2) Department of Computer Science, Cornell University, Ithaca, NY, (3) Information Science, Cornell Lab of Ornithology, Ithaca, NY
Background/Question/Methods

Addressing basic and applied ecological questions across large spatial and temporal extents is becoming more common as access to large-scale environmental and biodiversity data increases and the need to understand large-scale environmental change becomes more urgent. Over the past decade, researchers have used newly available data resources to study patterns of species occurrence and abundance across increasingly large spatial and temporal extents at increasingly fine resolution. Analysis of these data has the potential to advance the science of ecology and conservation. However, modeling and deriving inferences from these data present several challenges: the multi-scale structure and non-stationarity of the underlying ecological processes, large volumes of data (in both sample size and number of covariates), and often limited a priori information.

Results/Conclusions

We will describe the exploration and analysis of dynamic patterns of avian distributions across the continental United States using data from eBird (http://www.ebird.org), an online citizen science bird-monitoring project, and local-scale data on environmental features such as landcover, climate, and vegetation phenology. Limited a priori information and the potentially large suite of covariates have led us to use semiparametric regression models designed to discover and quantify scale-dependent and spatiotemporally varying relationships. The analysis is a two-step process. The first step is the creation of an adaptive predictive model that captures spatiotemporal variation in the response. Once the predictive model is sufficiently accurate, a second, inferential step identifies features of potential biological importance in determining distributions and quantifies the spatiotemporal scale and configuration of their effects on the response.
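
The two-step logic can be illustrated with a deliberately simplified sketch. It is not the semiparametric model used in this work. It assumes synthetic data, a single hypothetical covariate (forest cover), longitude rescaled to [0, 1), and a plain least-squares line fitted independently within each of a few longitude blocks. All function names and the block partition are illustrative assumptions.

```python
import random
from statistics import mean


def fit_slope_intercept(xs, ys):
    """Ordinary least squares for one covariate: y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b


def block_of(lon, n_blocks):
    """Map a longitude in [0, 1) to a block index (hypothetical partition)."""
    return min(int(lon * n_blocks), n_blocks - 1)


def fit_local_models(records, n_blocks):
    """Step 1 (prediction): fit one simple local model per longitude block."""
    blocks = {}
    for lon, cover, count in records:
        blocks.setdefault(block_of(lon, n_blocks), []).append((cover, count))
    return {k: fit_slope_intercept(*zip(*v)) for k, v in blocks.items()}


def predict(models, lon, cover, n_blocks):
    """Predict the response at a location from its block's local model."""
    a, b = models[block_of(lon, n_blocks)]
    return a + b * cover


# Synthetic data (an assumption for illustration): the association with
# forest cover is positive in the west and negative in the east.
random.seed(0)
records = []
for _ in range(2000):
    lon, cover = random.random(), random.random()
    true_slope = 2.0 if lon < 0.5 else -2.0
    records.append((lon, cover, 5.0 + true_slope * cover + random.gauss(0, 0.3)))

models = fit_local_models(records, n_blocks=4)

# Step 2 (explanation): read off how the local habitat association varies
# across blocks; adjacent blocks with similar slopes suggest a stationary region.
local_slopes = {k: round(b, 2) for k, (a, b) in sorted(models.items())}
print(local_slopes)
```

In this toy setting the local slopes change sign between the western and eastern blocks, which is the kind of spatially varying relationship a single global model would average away.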

To illustrate this analytical process, we estimate the weekly distributions of several terrestrial bird species and compare their migrations (i.e., prediction). Next, we estimate seasonal trends in local habitat associations and study how these associations vary spatially (i.e., explanation). This provides information about the spatial scale and configuration of stationary regions of a species’ distribution. These results are useful for generating hypotheses and making inferences about the processes driving dynamic distributional patterns.