Emerging cyberinfrastructure and new data sources provide unparalleled opportunities for mobilizing and integrating massive amounts of information from organismal biology, ecology, genetics, climatology, and other disciplines. Key among these data sources is the rapidly growing volume of digitized specimen records from natural history collections. With over 50 million specimen records currently available online, these data provide excellent information on species distributions and changes in distributions over time. Particularly powerful is the integration of phylogenies with specimen data, enabling analyses of phylogenetic diversity in a spatio-temporal context, the evolution of niche space, and more. Such data-driven synthetic analyses may reveal unexpected patterns, yielding new hypotheses for further study. However, the heterogeneity of these data remains a major challenge, and new methods are needed to link such divergent data types.
Results/Conclusions
Ongoing efforts to link and analyze diverse data are yielding new perspectives on a range of ecological problems. We will present three case studies, each drawing on specimen data and related heterogeneous data sources to examine a different aspect of ecology and evolutionary biology. These examples are: (1) analysis of areas of endemism of plant species that serve as hosts for herbivorous insects; (2) extraction of plant functional traits from images of herbarium specimens; and (3) historical and future projections of distributions for coevolving yuccas and yucca moths. Although many specific hypotheses may be addressed through integrated analyses of biodiversity and environmental data, perhaps the greatest value of such data-enabled science will lie in the unanticipated patterns that emerge.