COS 131-6
iPlant Cyberinfrastructure enables large-scale and big-data ecological research

Friday, August 15, 2014: 9:50 AM
Regency Ballroom A, Hyatt Regency Hotel
Ramona L. Walls, iPlant Collaborative, University of Arizona, Tucson, AZ
Background/Question/Methods

The iPlant Collaborative (http://www.iplantcollaborative.org/) is an NSF-funded initiative whose mission is to facilitate the transformation of life sciences research and education by providing the computing infrastructure and expertise needed to answer biological questions that were previously difficult or impossible to address. During its first five years, the iPlant project focused on infrastructure for two grand challenges: genotype-to-phenotype mapping in plants and building the green plant tree of life. Now that the major pieces of infrastructure supporting these grand challenges are in place, iPlant’s scope has expanded to support any non-medical life sciences research, whether in plants, animals, or microbes. iPlant is well situated to provide cyberinfrastructure for ecological disciplines that require access to very large data sets or high-performance computing.

Results/Conclusions

iPlant’s infrastructure supports numerous single-lab and community projects, such as the Botanical Information and Ecology Network (BIEN), which has calculated range maps for almost 90,000 plant species; the One Thousand Plants (1KP) project, which sequenced transcriptomes from over 1,000 plant species; and the iMicrobe project, which strives to democratize access to microbial data processing pipelines and advance studies in microbial ecology. This presentation will provide an overview of the tools and services available through iPlant. These include large-scale data storage, sharing, and metadata mark-up via the iPlant Data Store; cloud-based computing through Atmosphere; web-based access to dozens of applications through the Discovery Environment; iPlant Application Programming Interfaces (APIs); an image management and analysis system with a high-performance computing back-end (Bisque); access to high-resolution environmental layers; and educational and training resources. iPlant’s flexible, open-source architecture should be of interest to anyone who needs to organize and analyze very large data sets, uses genomic or metagenomic methods to address ecological questions, or develops ecological models that require large memory or parallel computation.