Multivariate autoregressive (MAR) models have a long history in economics and engineering, and have been successfully applied to long-term data on freshwater plankton to understand trophic interactions and community stability attributes. However, past applications have used "high quality" data sets with consistent sampling locations, few or no missing values, and low observation errors. Using a state-space modeling framework, we can add an observation process to the statistical analysis of time-series data. This allows us to analyze data sets with unknown observation error and missing values across multiple spatial locations. Using a 33-year data set of monthly zooplankton abundances in Lake Washington, we studied how adding missing data, increasing the time between samples, including multiple sampling locations, and lumping species affect inferences about community dynamics and stability. The analysis was conducted using an R package we have developed (available on CRAN) for fitting multivariate autoregressive state-space (MARSS) models via Maximum Likelihood (ML) and Bayesian methods. The ML method is based on our Expectation-Maximization algorithm, which provides robust estimation of MARSS models that are difficult to fit via other numerical approaches. We use both ML and Bayesian analyses. ML methods provide a framework for model selection and for assessing variability of parameter estimates, while the Bayesian methods provide a framework for assessing parameter uncertainty and support.
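For readers unfamiliar with the state-space formulation, the model can be sketched as a process equation plus an observation equation. The notation below follows the form commonly used for MARSS models; the symbols are assumptions introduced here for illustration and do not appear in the abstract itself.

```latex
% Process model: log abundances x_t evolve through the interaction matrix B
\mathbf{x}_t = \mathbf{B}\,\mathbf{x}_{t-1} + \mathbf{u} + \mathbf{w}_t,
\qquad \mathbf{w}_t \sim \mathrm{MVN}(\mathbf{0}, \mathbf{Q})

% Observation model: data y_t are noisy, possibly replicated, views of x_t
\mathbf{y}_t = \mathbf{Z}\,\mathbf{x}_t + \mathbf{a} + \mathbf{v}_t,
\qquad \mathbf{v}_t \sim \mathrm{MVN}(\mathbf{0}, \mathbf{R})
```

Here the off-diagonal elements of B would carry the species interaction strengths, Q the process error covariance, R the observation error covariance, and Z would map the hidden states to one or more sampling locations. Setting R to zero and Z to the identity recovers a MAR model with no observation process.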
Results/Conclusions
Unsurprisingly, the estimates of species interaction strengths go to zero as the time between samples increases. However, this effect can be moderated by just a few closely spaced samples per year. Assuming that data do not have observation error when they in fact have high error has severe consequences: species interaction estimates go to zero while density-dependence estimates go up as observation error increases. The latter is the well-known "apparent" density-dependence problem caused by observation errors. Using the MARSS framework largely eliminates this problem, in exchange for higher parameter variability. Estimates of species interaction strengths are prone to bimodal posterior distributions when data are limited, suggesting that posteriors should be routinely examined to reveal model identifiability problems. While interaction strength estimates can be bimodal, the stability estimates for the community are generally more robust. Finally, spatial replication improves estimation of the observation error variance but cannot in general replace temporal replication (length of the time series).
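The "apparent" density-dependence effect can be illustrated with a minimal single-species simulation. This is a sketch under stated assumptions, not the analysis used in the study: the function names, the autoregressive coefficient of 0.8, the error standard deviations, and the series length are all hypothetical values chosen for illustration. An AR(1) process on log abundance is simulated, observation error is added, and the coefficient is estimated by ordinary least squares as if the data were error-free.

```python
import numpy as np


def simulate_ar1(n, b, proc_sd, obs_sd, rng):
    """Simulate hidden states x and noisy observations y (illustrative only)."""
    x = np.zeros(n)
    for t in range(1, n):
        # Process model: x_t = b * x_{t-1} + process error
        x[t] = b * x[t - 1] + rng.normal(0.0, proc_sd)
    # Observation model: y_t = x_t + observation error
    y = x + rng.normal(0.0, obs_sd, size=n)
    return x, y


def ols_ar1(series):
    """Least-squares slope of series[t] on series[t-1], ignoring observation error."""
    prev, curr = series[:-1], series[1:]
    prev_centered = prev - prev.mean()
    return float(np.dot(prev_centered, curr - curr.mean())
                 / np.dot(prev_centered, prev_centered))


rng = np.random.default_rng(42)
true_b = 0.8  # assumed value; b closer to 0 means stronger density dependence
x, y = simulate_ar1(n=5000, b=true_b, proc_sd=1.0, obs_sd=1.5, rng=rng)

b_hat_clean = ols_ar1(x)  # fit to the error-free states
b_hat_noisy = ols_ar1(y)  # fit to the noisy observations

print(f"true b = {true_b}")
print(f"estimate from error-free states = {b_hat_clean:.3f}")
print(f"estimate from noisy observations = {b_hat_noisy:.3f}")
```

The estimate from the noisy series is pulled toward zero, which in a model on log abundance reads as stronger density dependence than is actually present. A state-space fit that estimates the observation variance separately is intended to remove this bias, at the cost of wider uncertainty in the estimate.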