A major goal of ecological research is to gain a predictive understanding of the processes governing population dynamics. In this study, we examine time series from the Sir Alister Hardy Foundation for Ocean Science (SAHFOS) Continuous Plankton Recorder (CPR) and ask whether long-term observations are necessary to reveal the underlying nonlinearity in species abundances. Beyond this initial question, we ask: if one were to use an appropriate non-parametric nonlinear framework to model ecosystems, to what extent would short-term datasets limit predictive capability? Although the answer may seem trivial, a way to demonstrate this basic result has not been obvious, and the deeper reasons for the advantage of continued data collection have not always been clear. To address these questions, we use empirical dynamic modeling (EDM), an approach that lets the data describe ecological dynamics without assumptions about the underlying equations. We use two distinct measures to quantify the amount of data in each time series: time series length and data availability. Time series length is the number of data points in a time series, whereas data availability is the number of non-zero values within that time series.
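The two data-quantity measures defined above can be illustrated with a minimal sketch. The function name `data_quantity` and the example counts are hypothetical and are not taken from the CPR dataset; the sketch also assumes every entry is a numeric abundance with no missing values.

```python
from typing import Sequence, Tuple


def data_quantity(series: Sequence[float]) -> Tuple[int, int]:
    """Return (time_series_length, data_availability) for one abundance series.

    Assumption: every entry is a numeric count; missing values are not handled.
    """
    # Time series length: the total number of data points.
    length = len(series)
    # Data availability: the number of non-zero values.
    availability = sum(1 for value in series if value != 0)
    return length, availability


# Hypothetical abundance counts for a single taxon (illustrative only).
counts = [0, 3, 0, 0, 12, 5, 0, 1]
length, availability = data_quantity(counts)
print(length, availability)
```

Under these definitions, two taxa sampled over the same period share the same time series length but can differ widely in data availability, depending on how often each taxon is actually recorded.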
For the 90 taxa with the lowest data availability, 11% showed significant nonlinear dynamics. In contrast, among the 28 taxa with the highest data availability, 82% showed significant nonlinear dynamics. In other words, the best-observed time series also show stronger evidence for nonlinear dynamics. To test whether this effect was driven by the taxa that happen to appear most often in the data, we analyzed subsampled time series of varying length and found a similar pattern (p < 0.01; logistic regression, df = 2874). A similar pattern holds for prediction: in general, greater data availability corresponds to higher forecast skill. However, even at the longest time series lengths, forecast skill can vary substantially. In conclusion, short time series can hinder the identification of nonlinear dynamics. Yet data-driven approaches, in which causal variables and functional relationships are determined empirically, may offer a viable alternative. Continued monitoring and longer time series, in conjunction with techniques capable of describing nonlinear behavior, will improve our understanding of ecological mechanisms, and unraveling the interdependence between environmental factors and endogenous population dynamics is critical for managing ecosystems in the context of climate change. Long-term data collection will thus have long-term payoffs.