COS 83-8
Context matters in disease control
The optimal choice of control strategy during an epidemic should be sensitive to the context of any given realized outbreak. However, determining how control actions should change throughout an outbreak is difficult because of the large number of 1) possible realizations and 2) possible combinations of control actions in an epidemic, the so-called ‘curse of dimensionality’. Epidemiological modelling commonly involves the exploration of stochastic realizations; however, optimizing outcomes over a range of possible management interventions is more challenging.
Using a spatially explicit simulation model of disease spread, we demonstrate reinforcement learning as an optimization routine for generating state-dependent management policies for emergency response to a foot-and-mouth disease outbreak. The generated policies are constructed from a list of valid individual management actions, each of which may be relevant at any point in time during an outbreak. Reinforcement learning discovers optimal and novel combinations of control actions by using the simulation model as a testing ground for management during simulated outbreaks. A value function, which specifies the utility of each control action for a particular realization of the outbreak, is built up by the reinforcement learning algorithm through repeated simulation of outbreaks. This value function guides the choice of control actions, so the generated policies are specific to a particular management objective, consistent with the aims of a specific stakeholder.
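The idea of building up a value function by repeated simulation can be illustrated with a minimal sketch. It assumes tabular Q-learning, which is one common reinforcement learning algorithm and is not necessarily the one used in this study. The toy outbreak model, the two-action list, the state definition (infected farms, remaining vaccine doses) and all parameter values are hypothetical and stand in for the spatially explicit simulator.

```python
import random
from collections import defaultdict

# Hypothetical action list and limits; not taken from the study.
ACTIONS = ("cull", "vaccinate")
MAX_INFECTED = 10
MAX_STEPS = 50


def step(state, action, rng):
    """Toy one-step outbreak model (illustrative only, not the authors' simulator)."""
    infected, doses = state
    if action == "vaccinate" and doses > 0:
        doses -= 1
        removed = 3  # assumed: vaccination is more effective while doses last
    elif action == "cull":
        removed = 2
    else:
        removed = 0  # vaccinating with no doses left has no effect
    remaining = max(0, infected - removed)
    # Each remaining infected farm infects one new farm with probability 0.2.
    new_infections = sum(rng.random() < 0.2 for _ in range(remaining))
    infected = min(MAX_INFECTED, remaining + new_infections)
    reward = -1.0  # one unit of cost per time step, so return = -duration
    done = infected == 0
    return (infected, doses), reward, done


def train(episodes=500, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    """Build a state-action value table by repeatedly simulating outbreaks."""
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (state, action) to estimated value
    for _ in range(episodes):
        state = (5, 3)  # assumed start: 5 infected farms, 3 vaccine doses
        for _ in range(MAX_STEPS):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            next_state, reward, done = step(state, action, rng)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Standard Q-learning update toward the one-step target.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if done:
                break
    return q


def policy(q, state):
    """State-dependent policy: pick the highest-valued action for this state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


if __name__ == "__main__":
    q_table = train()
    for s in [(5, 3), (5, 0), (1, 1)]:
        print(s, policy(q_table, s))
```

Because the reward is -1 per time step, the learned policy in this sketch targets short outbreak duration; a different management objective would correspond to a different reward definition.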
Results/Conclusions
We illustrate a state-dependent control strategy discovered for a landscape based on the cattle feedlot system in the panhandle of Texas, using ring culling and ring vaccination actions across a range of ring radii. The choice of optimal control depends on the number of infected farms and the current resources available to vaccinate (doses) and cull (culling capacity). While the optimal policy is dominated by aggressive culling actions (culling over a wide area), vaccination actions are optimal when the number of available vaccine doses is large. Because the state-dependent policy can adapt management in rare but relevant situations where aggressive culling is not optimal, it can outperform any fixed single action when managing a simulated outbreak. Performance improvements of the state-dependent policy are seen in both the mean and the variance of outbreak duration. As reinforcement learning techniques are built upon simulation models, they have broad relevance for control optimization, not just in foot-and-mouth disease but in many disease control problems.