Traditionally, the field of ecological modeling was divided into correlative approaches, with relatively simple empirical models that were tightly connected to data, and process-based approaches, with often more complicated models that were, however, not systematically fit to data. As computers have become more powerful and data more complex, this classical divide has started to fade: statistical modelers increasingly specify mechanism-oriented, hierarchical structures, and process modelers fit their models more rigorously to data. This development at the intersection of process models and statistics has been welcomed with high expectations: statistically fit mechanistic models promise synthesis of heterogeneous data, more concrete tests of ecological hypotheses, and better predictions. However, substantial theoretical challenges remain. Among the most pressing of these is the question of how to optimize mechanistic models not for the "traditional" purpose of exploring the consequences of known assumptions, but rather for the purpose of model-data integration. I use simulations with existing process-based models to highlight issues in this area.
Simulation results show that existing process-based vegetation models are often too rigid to serve as ideal catalysts for combining process knowledge and data. In these models, small structural errors can accumulate over time and create strong biases in inference. I demonstrate potential statistical solutions, including the use of partially specified ecological models and the introduction of stochastic errors into the processes. The latter creates a state-space structure that yields more stable inferential results in the presence of structural errors. I conclude that much potential exists for optimizing process-based vegetation models specifically for model-data integration, rather than solely as predictive tools.
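The two central points above, that small structural errors compound in a deterministic forward run, while a state-space formulation with stochastic process error keeps predictions anchored to the data, can be illustrated with a minimal toy example. The sketch below is hypothetical and not taken from the models discussed here: it assumes a discrete logistic growth model, a deliberately misspecified growth rate (`r_model`) standing in for structural error, and, as a crude stand-in for a proper filter, one-step-ahead predictions that restart from the noisy observations rather than from the model's own trajectory.

```python
import numpy as np

rng = np.random.default_rng(42)

# "True" system: discrete logistic growth (all values illustrative)
r_true, K, n0, T = 0.30, 100.0, 5.0, 50

def step(n, r, K):
    """One time step of discrete logistic growth."""
    return n + r * n * (1.0 - n / K)

# Simulate the true trajectory and noisy observations of it
true = np.empty(T); true[0] = n0
for t in range(1, T):
    true[t] = step(true[t - 1], r_true, K)
obs = true + rng.normal(0.0, 2.0, size=T)

# Structural error: the fitted model uses a slightly wrong growth rate
r_model = 0.24

# (a) Deterministic forward run: the small rate error compounds over time
det = np.empty(T); det[0] = n0
for t in range(1, T):
    det[t] = step(det[t - 1], r_model, K)

# (b) State-space-style one-step-ahead predictions: each step restarts
#     from the observed state, so errors cannot accumulate across steps
ss = np.empty(T); ss[0] = n0
for t in range(1, T):
    ss[t] = step(obs[t - 1], r_model, K)

err_det = np.mean(np.abs(det - true))
err_ss = np.mean(np.abs(ss - true))
print(f"mean abs error, deterministic run:  {err_det:.2f}")
print(f"mean abs error, one-step-ahead run: {err_ss:.2f}")
```

In a full state-space analysis the observed states would be replaced by filtered state estimates (e.g., from a Kalman or particle filter), but the qualitative contrast is the same: the deterministic run drifts far from the truth mid-trajectory, whereas the one-step-ahead predictions stay close to it despite the misspecified rate.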