Machine learning to predict new bat reservoirs of filoviruses: Africa and beyond
Using sparse data on African bat species that have been positively identified as carrying filoviruses (such as Ebola virus), we apply a machine learning approach to identify the intrinsic traits that may make bat species permissive to filovirus infection. Using robust intrinsic trait profiles, we also predict which bat species in Africa have a high probability of being novel filovirus reservoirs, and we extend the model to examine whether species outside Africa may also be permissive to filovirus infection.
Results/Conclusions We find that African bats currently known to carry filoviruses are distinguished from other African bat species by producing more young for their body size, which may reflect the tendency of some species to have more than one birth pulse or litter per year. Filovirus reservoir species may also produce neonates that are smaller than expected given adult body size and that reach sexual maturity earlier than the neonates of other bat species. We report hotspots of overlapping reservoir species in sub-Saharan Africa, and identify regions, primarily in southeast Asia and India, where the geographic ranges of multiple non-African species predicted as possible novel filovirus carriers overlap. More generally, we explore the utility of machine learning for confronting common issues in sparse datasets, for generating new hypotheses, and as a tool for guiding the search for novel disease reservoirs in the wild.
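
The idea of a trait being larger or smaller "than expected for body size" can be illustrated with a short sketch. It is not the analysis used in this study. It assumes one common approach, taking residuals from an ordinary least-squares fit of log(trait) on log(body mass). The function name `size_corrected_residuals` and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def size_corrected_residuals(body_mass_g, trait):
    """Return residuals of log(trait) regressed on log(body mass).

    A positive residual means the trait value is larger than the fitted
    line predicts for a species of that body mass; a negative residual
    means it is smaller. This is an illustrative assumption about how a
    size correction could be done, not the method of the study.
    """
    x = [math.log(m) for m in body_mass_g]
    y = [math.log(t) for t in trait]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Ordinary least-squares slope and intercept.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Made-up values for five hypothetical species (not real data).
adult_mass_g = [10.0, 20.0, 40.0, 80.0, 160.0]
young_per_year = [1.0, 1.0, 2.0, 1.0, 1.0]

residuals = size_corrected_residuals(adult_mass_g, young_per_year)
for mass, res in zip(adult_mass_g, residuals):
    print(f"mass={mass:6.1f} g  residual={res:+.3f}")
```

In this toy input, the third species produces more young per year than the fitted line predicts for its mass, so it receives the only positive residual. Residuals of this kind could then serve as input features to a classifier alongside other species traits.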