OOS 33-2
Machine learning to predict new bat reservoirs of filoviruses: Africa and beyond

Tuesday, August 11, 2015: 1:50 PM
344, Baltimore Convention Center
Barbara Han, Cary Institute of Ecosystem Studies
John Paul Schmidt, Odum School of Ecology, University of Georgia, Athens, GA
David T. S. Hayman, Institute of Veterinary, Animal and Biomedical Sciences, Massey University, Palmerston North, New Zealand
Sarah E. Bowden, Odum School of Ecology, University of Georgia, Athens, GA
John M. Drake, Odum School of Ecology, University of Georgia, Athens, GA
Background/Question/Methods Identifying global-scale patterns and drivers of infectious diseases is an increasingly important goal for disease ecology, with the ultimate aim of forecasting disease emergence or predicting potentially undiscovered wild reservoirs of infection in advance of a disease event. These aims depend fundamentally on data, which are often sparse and rife with hidden interactions, collinearities, and nonrandom patterns of missingness.

Using sparse data on bat species in Africa that have been positively identified as carrying filoviruses (such as Ebola virus), we apply a machine learning approach to understand the intrinsic features of bat species that may make them permissive to filovirus infection. Using robust intrinsic trait profiles, we also predict which bat species have a high probability of being novel filovirus reservoirs in Africa, and extend the model to examine whether species outside of Africa may also be permissive to filovirus infection.
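As a rough illustration of the workflow described above (learn from the few known carriers, then score the remaining species while tolerating missing trait values), the sketch below ranks unlabeled species by how many of their nearest neighbours in trait space are known positives. The abstract does not name the algorithm used, so this nearest-neighbour scheme with a Gower-style distance is a stand-in chosen for simplicity, not the authors' method. All species names, trait names, and trait values are hypothetical.

```python
import math

# Hypothetical trait table; every value is invented for illustration only.
# None marks a trait that was never measured for that species.
TRAITS = {
    "bat_a": {"mass_g": 100.0, "litters_per_year": 2.0, "neonate_mass_g": 10.0},
    "bat_b": {"mass_g": 150.0, "litters_per_year": 2.0, "neonate_mass_g": None},
    "bat_c": {"mass_g": 120.0, "litters_per_year": 2.0, "neonate_mass_g": 12.0},
    "bat_d": {"mass_g": None, "litters_per_year": 1.0, "neonate_mass_g": 30.0},
    "bat_e": {"mass_g": 20.0, "litters_per_year": 1.0, "neonate_mass_g": 5.0},
    "bat_f": {"mass_g": 300.0, "litters_per_year": 1.0, "neonate_mass_g": 60.0},
}
# Hypothetical set of species already confirmed as carriers.
KNOWN_POSITIVE = {"bat_a", "bat_b"}


def trait_ranges(table):
    """Observed range of each trait, used to put traits on a common scale."""
    bounds = {}
    for traits in table.values():
        for name, value in traits.items():
            if value is not None:
                lo, hi = bounds.get(name, (value, value))
                bounds[name] = (min(lo, value), max(hi, value))
    # Fall back to 1.0 when a trait has zero range, to avoid dividing by zero.
    return {name: (hi - lo) or 1.0 for name, (lo, hi) in bounds.items()}


def gower_distance(a, b, ranges):
    """Mean range-scaled difference over traits observed in BOTH species."""
    diffs = [
        abs(a[t] - b[t]) / ranges[t]
        for t in ranges
        if a.get(t) is not None and b.get(t) is not None
    ]
    return sum(diffs) / len(diffs) if diffs else math.inf


def reservoir_scores(table, positives, k=2):
    """Score each unlabeled species by the share of its k nearest
    neighbours that are known positives (0.0 to 1.0)."""
    ranges = trait_ranges(table)
    scores = {}
    for name, traits in table.items():
        if name in positives:
            continue  # already confirmed; nothing to predict
        neighbours = sorted(
            (other for other in table if other != name),
            key=lambda other: gower_distance(traits, table[other], ranges),
        )[:k]
        scores[name] = sum(n in positives for n in neighbours) / k
    return scores


scores = reservoir_scores(TRAITS, KNOWN_POSITIVE)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Here `ranked` lists the unlabeled species from most to least similar to the known carriers. Skipping unmeasured traits pair by pair is only one way to cope with missingness; tree-based learners that handle missing values natively are a common alternative.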

Results/Conclusions We find that African bats currently known to carry filoviruses are distinguished from other African bat species by producing more young for their body size, which may reflect the tendency for some species to have more than one birth pulse or litter in a given year. Filovirus reservoir species may also produce neonates that are smaller than expected given adult body size and that reach sexual maturity earlier than those of other bat species. We report hotspots of overlapping reservoir species in sub-Saharan Africa, and identify regions where the geographic ranges of multiple non-African species predicted to be possible novel filovirus carriers overlap, primarily in Southeast Asia and India. More generally, we explore the utility of machine learning for confronting common issues in sparse datasets, for generating new hypotheses, and as a tool for guiding the search for novel disease reservoirs in the wild.
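One common way to make a phrase like "more young for their body size" concrete is an allometric residual: regress log reproductive output on log adult body mass across species, then ask which species fall above the fitted line. The abstract does not say how the size-corrected traits were computed, so the sketch below is an assumption about the general idea rather than the authors' procedure. The species names and numbers are hypothetical.

```python
import math

# Hypothetical (adult body mass in g, young produced per year) pairs.
# These values are invented for illustration only.
SPECIES = {
    "bat_a": (10.0, 1.0),
    "bat_b": (40.0, 1.0),
    "bat_c": (100.0, 2.0),
    "bat_d": (400.0, 1.0),
}


def allometric_residuals(data):
    """Residuals from an ordinary least-squares fit of
    log10(young per year) on log10(adult mass).

    A positive residual means more young than expected for body size.
    """
    xs = [math.log10(mass) for mass, _ in data.values()]
    ys = [math.log10(young) for _, young in data.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return {
        name: y - (intercept + slope * x)
        for name, x, y in zip(data, xs, ys)
    }


residuals = allometric_residuals(SPECIES)
above_expectation = [name for name, r in residuals.items() if r > 0]
```

In this toy table only `bat_c` lies above the fitted line. The same construction, applied to neonate mass instead of young per year, would flag species whose neonates are smaller than expected (negative residuals).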