PS 49-93 - DataONE: A virtual data center for biology, ecology, and the environmental sciences

Wednesday, August 5, 2009
Exhibit Hall NE & SE, Albuquerque Convention Center
William Michener, DataONE, University of New Mexico, Albuquerque, NM
Suzie Allard, University of Tennessee
Paul Allen, Cornell University
Peter Buneman, University of Edinburgh
Randy Butler, University of Illinois - Urbana Champaign
John Cobb, Oak Ridge National Laboratory
Robert Cook, Environmental Sciences Division & Climate Change Science Institute, Oak Ridge National Laboratory, Oak Ridge, TN
Patricia Cruse, University of California - California Digital Library
Ewa Deelman, University of Southern California
David DeRoure, University of Southampton
Cliff Duke, Ecological Society of America
Mike Frame, U.S. Geological Survey - National Biological Information Infrastructure
Carole Goble, University of Manchester
Stephanie Hampton, National Center for Ecological Analysis and Synthesis, Santa Barbara, CA
Donald Hobern, Atlas of Living Australia
Peter Honeyman, University of Michigan
Jeffery Horsburgh, Utah State University
Viv Hutchison, Center for Biological Informatics, Core Science Systems, US Geological Survey, Denver, CO
Matt Jones, National Center for Ecological Analysis and Synthesis, Santa Barbara, CA
Steve Kelling, Information Science, Cornell Lab of Ornithology, Ithaca, NY
Jeremy Kranowitz, The Keystone Center
John Kunze, University of California - California Digital Library
Bertram Ludaescher, University of California - Davis
Maribeth Manoff, University of Tennessee
Ricardo Pereira, Taxonomic Databases Working Group (Campinas, Brazil)
Line Pouchard, Oak Ridge National Laboratory
Robert Sandusky, University of Illinois - Chicago
Ryan Scherle, National Evolutionary Synthesis Center
Mark S. Servilla, Biology MSC03 2020, University of New Mexico, Albuquerque, NM
Kathleen Smith, National Evolutionary Synthesis Center
Carol Tenopir, University of Tennessee
Dave Vieglais, University of Kansas
Von Welch, University of Illinois - Urbana Champaign
Jake Weltzin, USA National Phenology Network Nat'l Coordinating Office, US Geological Survey, Tucson, AZ
Bruce Wilson, University of Minnesota
Background/Question/Methods

Data about life on earth and the environment are often unavailable or unusable. Those data that are available are broadly dispersed and can be difficult to discover and use. Because multiple data and metadata standards are employed, integration and analysis have been difficult to achieve. Even when analyses are completed, sharing and replicating workflows and results pose a further challenge.

DataONE is being designed and constructed to address four key challenges:

1.    Data loss: by preserving at-risk (orphaned) biological/ecological/environmental data from individual scientists

2.    Scattered data sources: by facilitating discovery of and access to data through a single easy-to-use portal

3.    Data deluge: by providing a toolbox that empowers scientists and organizations to more easily and effectively manage, analyze, and synthesize data

4.    Poor data practices: by creating an informatics-literate workforce through innovative outreach and training efforts (e.g., best-practice videos, podcasts, online certificate programs, downloadable best-practice guides, and exemplars of data management plans)

Results/Conclusions

DataONE will enable new science and knowledge creation through universal access to data about life on earth and the environment that sustains it.

The system is designed around a nucleus of three existing data centers (coordinating nodes) and a broad array of data holdings such as those maintained by libraries, research networks, and academic and governmental organizations (member nodes). The cyberinfrastructure promotes discovery of and access to data by providing one-stop shopping for data and metadata (information about the data that enables its use) about Earth’s biota and environments. DataONE provides tools (e.g., metadata management and scientific visualization tools as part of an “investigator’s toolbox”), training, and outreach to scientists and students in a concerted effort to enable and promote data preservation, data stewardship, and data sharing. Through a series of working group meetings, computer and information scientists are engaged in developing and promulgating ontologies that will facilitate data integration and simplify the creation of complex scientific workflows. The DataONE portal simplifies the process of acquiring and using appropriate scientific workflow software such as Kepler and Taverna, as well as publishing and sharing new workflows via mechanisms such as myExperiment, which allow workflows to be reused and possibly adapted for other uses.
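
The relationship between coordinating nodes and member nodes can be illustrated with a minimal Python sketch. It is not DataONE's implementation or API: the class names, fields, node names, and identifiers below are all hypothetical, and the sketch assumes only that a coordinating node harvests metadata from member nodes into one index so that a single search can locate data wherever it is held.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetadataRecord:
    """Hypothetical metadata entry describing one dataset."""
    identifier: str
    title: str
    keywords: frozenset  # lowercase keyword strings


@dataclass
class MemberNode:
    """Hypothetical member node: an organization that holds data."""
    name: str
    records: list = field(default_factory=list)

    def list_records(self):
        # Return a copy so callers cannot modify the node's holdings.
        return list(self.records)


@dataclass
class CoordinatingNode:
    """Hypothetical coordinating node: indexes metadata, not the data itself."""
    index: dict = field(default_factory=dict)  # identifier -> (node name, record)

    def harvest(self, node):
        # Copy each record's metadata into the shared index.
        for record in node.list_records():
            self.index[record.identifier] = (node.name, record)

    def search(self, keyword):
        # Case-insensitive keyword match; results are (identifier, node name).
        wanted = keyword.lower()
        return sorted(
            (identifier, node_name)
            for identifier, (node_name, record) in self.index.items()
            if wanted in record.keywords
        )


def build_demo_catalog():
    """Build a small illustrative catalog from two made-up member nodes."""
    library = MemberNode("university-library", [
        MetadataRecord("lib-001", "Bird migration counts",
                       frozenset({"birds", "migration"})),
    ])
    network = MemberNode("research-network", [
        MetadataRecord("net-001", "Plant flowering dates",
                       frozenset({"phenology", "plants"})),
        MetadataRecord("net-002", "Bird arrival dates",
                       frozenset({"birds", "phenology"})),
    ])
    catalog = CoordinatingNode()
    for node in (library, network):
        catalog.harvest(node)
    return catalog


if __name__ == "__main__":
    for identifier, node_name in build_demo_catalog().search("birds"):
        print(identifier, "held by", node_name)
```

In this sketch the coordinating node stores only metadata and the name of the holding node, so one query over the index can point a user to data held at any member node.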
