OOS 43-2
Management and maintenance of a very large small-mammal database in a 25-year live-trapping study in the Chilean semiarid zone

Thursday, August 14, 2014: 1:50 PM
304/305, Sacramento Convention Center
W. Bryan Milstead, U.S. Environmental Protection Agency, Narragansett, RI
Peter L. Meserve, Department of Biological Sciences, University of Idaho, Moscow, ID
Douglas A. Kelt, Department of Wildlife, Fish & Conservation Biology, University of California, Davis, CA
M. Andrea Previtali, Departamento de Ciencias Naturales, Facultad de Humanidades y Ciencias, Universidad Nacional del Litoral, Santa Fe, Argentina
Background/Question/Methods

Data management is a critical, though frequently neglected, step in any research program. Too often, data management receives less attention than other aspects of experimental design, and this can lead to problems later on. As part of a large-scale experimental manipulation in the Chilean semiarid zone, we initiated a small-mammal capture-mark-recapture study in 1989. Our protocol calls for monthly small-mammal inventories on a minimum of sixteen 0.54-ha grids. During 4-day censuses, small mammals are captured in 50 large Sherman traps per grid, marked (if new), and standard population and condition data are recorded. At any one time, 500 traps are in operation, representing 4,000 trap-nights plus 3,000 trap-days of effort per month. When we began the project, we had no idea that the work would last over 25 years (and counting), yielding more than half a million captures of over 81,000 individuals. A dataset of this size poses special challenges for quality control and analysis.
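To make the shape of these data concrete, the minimal sketch below builds one hypothetical capture record in R, the analysis language named in this abstract. Every field name, code, and value here is an assumption for illustration; the study's actual schema is not described in the abstract.

    # A single illustrative capture record, assuming typical
    # capture-mark-recapture fields; the study's real field names
    # and codes may differ.
    library(tibble)

    capture <- tibble(
      date    = as.Date("1989-03-15"),   # census date (hypothetical)
      grid    = "G01",                   # grid identifier
      station = "A5",                    # trap station within the grid
      species = "Phyllotis darwini",     # captured species
      tag     = 10432L,                  # ear-tag number (tag reuse is a known issue)
      new     = TRUE,                    # first capture of this individual?
      sex     = "F",
      mass_g  = 52.5,                    # body mass
      repro   = "non-reproductive"       # reproductive condition
    )

A monthly census then appends hundreds to thousands of such rows, which is where the quality-control challenges described below arise.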

Results/Conclusions

Data management procedures have evolved over the course of the study as the complexity of the data has increased and new technologies have become available. Initially, capture records were stored in spreadsheets, but eventually we moved to a SAS database. Currently, data are stored in a relational database that allows easy retrieval by statistical programs such as SAS and R, as sketched below. We have developed strategies for data entry, quality control and quality assurance, version control, error handling, documentation, analysis, and data sharing that address issues typical of capture-mark-recapture studies. Particular problems include reuse of tag numbers, tag changes, and observer errors; the importance of these problems has grown over the years along with the complexity of the database. Not surprisingly, data management complexity varies with trap success and, in turn, with precipitation. Following high-rainfall years, we may record thousands of captures (max. = 6,615) of thousands of individuals (max. = 3,377) per month. When captures are few, each record can be verified individually, but during irruptions we rely on numerical screening techniques, illustrated below, to ensure quality. In this talk, we give a historical view of our data management and analysis approaches, provide examples of problems and solutions, discuss lessons learned, and offer insights into how to work with datasets of this size.
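The abstract does not specify the database management system, so the following minimal R sketch assumes an SQLite backend with a hypothetical captures table and column names; it illustrates the kind of retrieval described above, summarizing monthly capture totals of the sort that can be related to rainfall years.

    # Minimal sketch of pulling capture summaries into R from a
    # relational database. The SQLite backend, file name, table, and
    # column names are all assumptions for illustration.
    library(DBI)
    library(RSQLite)

    con <- dbConnect(RSQLite::SQLite(), "captures.sqlite")  # hypothetical file

    # Monthly capture and individual totals, assuming ISO-formatted dates
    monthly <- dbGetQuery(con, "
      SELECT strftime('%Y-%m', date) AS month,
             COUNT(*)                AS captures,
             COUNT(DISTINCT tag)     AS individuals
      FROM captures
      GROUP BY month
      ORDER BY month
    ")

    dbDisconnect(con)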
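The numerical screening techniques mentioned above could take many forms; the sketch below shows two plausible checks, assuming a captures data frame with tag, species, sex, and mass_g columns (all names hypothetical, as in the earlier sketches): flagging tag numbers recorded under conflicting species or sexes (possible tag reuse, tag change, or observer error), and flagging body masses far from each species' median.

    # Two illustrative QC screens, not the study's actual procedures.
    library(dplyr)

    # 1. Tag numbers attached to more than one species or sex:
    #    candidates for tag reuse, tag changes, or observer errors.
    tag_conflicts <- captures |>
      group_by(tag) |>
      summarise(n_species = n_distinct(species),
                n_sex     = n_distinct(sex)) |>
      filter(n_species > 1 | n_sex > 1)

    # 2. Body masses far outside each species' typical range, using a
    #    robust z-score (median and MAD rather than mean and SD).
    mass_outliers <- captures |>
      group_by(species) |>
      mutate(z = abs(mass_g - median(mass_g, na.rm = TRUE)) /
                 mad(mass_g, na.rm = TRUE)) |>
      filter(z > 4)   # threshold is arbitrary; tune to the data

Screens like these scale to irruption years, when thousands of captures per month make record-by-record verification impractical.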