COS 75-7 - Quality is everything: Automated data collection tools to enhance the quality of NEON's 'big data' streams

Wednesday, August 9, 2017: 10:10 AM
B116, Oregon Convention Center
Natalie Robinson (1), Cody Flagg (2) and Kaelin Cawley (2); (1) FSU, National Ecological Observatory Network, Boulder, CO, (2) NEON
Background/Question/Methods

The National Ecological Observatory Network (NEON) is a continental-scale observatory that will serve massive amounts of atmospheric, organismal, ecohydrological, biogeochemical, and land-cover data to the public over its 30-year lifespan. These data will play a crucial role in studies aimed at linking biodiversity, material cycling, and ecosystem services in a changing world, but users will rely on the data only if they are confident of their quality. Procuring high-quality data is particularly challenging within NEON’s Observation Systems (OS), where sampling is performed by hand rather than through automated sensor networks. Such manually collected data are prone to measurement, calculation, and transcription errors, and their quality is compromised if they cannot be cleaned and reconciled. To enhance the quality of its vast quantities of data, NEON actively seeks approaches that pre-emptively minimize data collection error and enable rapid back-end quality assurance and quality control (QAQC).

Results/Conclusions

This presentation describes one approach to enhancing NEON data quality: the development of protocol-specific digital data collection and transcription applications for mobile devices and computers. Application design and development will be discussed, and examples shown to illustrate how: 1) these tools enhance front-end data quality and diminish data loss, for example by forcing entry into required fields, constraining data values and/or measurement ranges, eliminating calculation error, auto-concatenating values such as sample IDs, and ensuring that appropriate numbers of samples are collected; 2) the applications provide mechanisms by which to easily retrieve and interact with the data, thereby enabling real-time and automated QAQC regimens that are essential for rectifying mistakes before critical details are forgotten; 3) digital data collection ensures that numerous data streams are housed in one location and can thus be integrated to maximize data collection efficiency. Though especially useful for big data projects, the mobile data collection applications described here may provide low-cost solutions that can be built and used by any scientist, and may act as important data collection and quality assurance tools for any field collection campaign.
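The front-end checks listed in point 1 can be illustrated with a minimal Python sketch. It is not NEON's implementation: the record type `SoilSampleEntry`, its field names, the 0 to 30 cm depth range, and the dot-separated sample ID format are all hypothetical, chosen only to show required-field enforcement, range constraints, auto-concatenated IDs, and a sample-count check.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SoilSampleEntry:
    """Hypothetical field record; names are illustrative, not NEON's schema."""
    site_id: Optional[str]
    plot_id: Optional[str]
    collect_date: Optional[str]  # expected as a YYYYMMDD string
    depth_cm: Optional[float]


# Assumed plausible measurement range, for illustration only.
DEPTH_RANGE_CM = (0.0, 30.0)
REQUIRED_FIELDS = ("site_id", "plot_id", "collect_date", "depth_cm")


def validate(entry: SoilSampleEntry) -> List[str]:
    """Return a list of problems; an empty list means the entry passes."""
    problems = []
    # Required fields: reject missing or blank values.
    for name in REQUIRED_FIELDS:
        value = getattr(entry, name)
        if value is None or value == "":
            problems.append(f"{name} is required")
    # Range constraint: only checked when a value was entered.
    if entry.depth_cm is not None:
        low, high = DEPTH_RANGE_CM
        if not low <= entry.depth_cm <= high:
            problems.append(f"depth_cm must be between {low} and {high}")
    return problems


def sample_id(entry: SoilSampleEntry) -> str:
    """Build the sample ID from validated fields instead of typing it by hand."""
    problems = validate(entry)
    if problems:
        raise ValueError("; ".join(problems))
    return ".".join([entry.site_id, entry.plot_id, entry.collect_date])


def has_expected_count(entries: List[SoilSampleEntry], expected: int) -> bool:
    """Check that the protocol's expected number of samples was recorded."""
    return len(entries) == expected
```

In this sketch an entry with a blank `plot_id` or an out-of-range `depth_cm` is rejected at entry time, so the problem surfaces while the collector is still in the field rather than during back-end QAQC.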