A Hands-on Primer for Working with Big Data in R: Introduction to Common Formats & Efficient Data Visualization
Sunday, August 10, 2014: 12:00 PM-5:00 PM
103, Sacramento Convention Center
Ted Hart, National Ecological Observatory Network
Leah A. Wasser, NEON, Inc.
Sarah Elmendorf, National Ecological Observatory Network (NEON)
Katherine M. Thibault, National Ecological Observatory Network (NEON)
Ecologists working across scales and integrating disparate datasets face new challenges in data management and analysis that demand toolkits that go beyond the spreadsheet. This workshop will offer ecologists an overview of the data formats and types typically encountered when working with ‘Big Data’, and an introduction to the tools available in R for working with these formats. The first half of the workshop will introduce participants to these formats, including ASCII, NetCDF4, HDF5, and LAS. Participants will then learn to (1) access and visualize large datasets in these formats using R, and (2) use metadata to efficiently integrate datasets from multiple ecological data sources for analysis. In the second half of the workshop, we will apply this knowledge to a practical example: integrating field-collected vegetation structure data with remotely sensed LiDAR data. For this example, we will use data collected by the National Ecological Observatory Network (NEON), a continental-scale, NSF-funded effort to collect and freely serve terabytes of data per year (stored in a diversity of formats) over the next 30 years to enable ecological research. Participants will leave the workshop with a basic understanding of the data that NEON and other large projects offer, and with some basic tools that support the use of Big Data in their own research.