PS 17-3 - Predicting distribution of juvenile and adult mussels in the Upper Mississippi River using random forest models

Wednesday, August 10, 2016
ESA Exhibit Hall, Ft Lauderdale Convention Center
Steve Zigler and Teresa Newton, Upper Midwest Environmental Sciences Center, U.S. Geological Survey, La Crosse, WI
Background/Question/Methods

We analyzed data from a quantitative survey of native mussels conducted in a 42-km impounded reach of the Upper Mississippi River (Navigation Pool 18) using a systematic design (n=367 sites). For each sampling site, we estimated simple physical variables (water depth, current velocity) and complex hydraulic variables (e.g., shear stress, boundary Reynolds number, relative substrate stability) that have been shown to be useful descriptors of mussel habitat in other studies of the Upper Mississippi River. Presence-absence of juvenile and adult mussels was analyzed with random forest models. This ensemble learning method aggregated classification tree submodels (N=1200), each based on a random selection of predictor variables and data. To reduce the effect of prevalence on predictions, models were constructed using down-sampled data to balance sample sizes of presence and absence. Out-of-bag samples were used to estimate generalization error.
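The combination of down-sampling, random predictor selection, and out-of-bag (OOB) error estimation can be illustrated with a minimal sketch. This is not the authors' code or data: the sites are synthetic, every function name is hypothetical, and each submodel is a one-split stump rather than a full classification tree, purely to keep the example short.

```python
import random

def make_sites(n_sites=200, seed=42):
    """Synthetic sites (assumption): feature 0 drives presence; 1 and 2 are noise."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(3)] for _ in range(n_sites)]
    y = []
    for row in X:
        label = 1 if row[0] > 0.7 else 0   # presence is the rarer class
        if rng.random() < 0.1:             # flip 10% of labels as noise
            label = 1 - label
        y.append(label)
    return X, y

def balanced_bootstrap(labels, rng):
    """Down-sample: draw equal numbers of presences and absences, with replacement."""
    pres = [i for i, v in enumerate(labels) if v == 1]
    absn = [i for i, v in enumerate(labels) if v == 0]
    n = min(len(pres), len(absn))          # size of the rarer class
    return [rng.choice(pres) for _ in range(n)] + [rng.choice(absn) for _ in range(n)]

def fit_stump(X, y, idx, features):
    """Best single split among the candidate features, judged on the in-bag sample."""
    best = None
    for f in features:
        for t in sorted({X[i][f] for i in idx}):
            for above in (0, 1):           # class predicted when value > t
                errors = sum((above if X[i][f] > t else 1 - above) != y[i] for i in idx)
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    return best[1:]

def predict_stump(stump, row):
    f, t, above = stump
    return above if row[f] > t else 1 - above

def forest_oob_error(X, y, n_trees=100, mtry=2, seed=0):
    """Majority-vote error using only submodels that did not see each site."""
    rng = random.Random(seed)
    votes = [[0, 0] for _ in y]            # per-site OOB votes: [absence, presence]
    for _ in range(n_trees):
        idx = balanced_bootstrap(y, rng)
        features = rng.sample(range(len(X[0])), mtry)  # random predictor subset
        stump = fit_stump(X, y, idx, features)
        in_bag = set(idx)
        for i, row in enumerate(X):
            if i not in in_bag:
                votes[i][predict_stump(stump, row)] += 1
    scored = [i for i, v in enumerate(votes) if sum(v) > 0]
    wrong = sum((1 if votes[i][1] > votes[i][0] else 0) != y[i] for i in scored)
    return wrong / len(scored)

X, y = make_sites()
oob_error = forest_oob_error(X, y)
print(round(oob_error, 3))
```

Because each site is scored only by submodels whose balanced sample excluded it, the OOB error approximates generalization error without a separate hold-out set.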

Results/Conclusions

Receiver operating characteristic curves indicated that useful models were constructed for both adult and juvenile mussels. However, the model for adult mussels performed considerably better (area under the curve, AUC=0.81; overall error rate=24%) than the juvenile mussel model (AUC=0.72; overall error rate=36%), indicating greater predictability for adults. Models primarily depended on complex hydraulic variables, including relative substrate stability and boundary Reynolds number. Results suggested that the distribution of juvenile mussels is less closely tied to hydrophysical conditions than that of adult mussels, and that some mussel habitat might be ephemeral based on recent hydrologic patterns.
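For readers unfamiliar with the two performance measures reported above, the following sketch shows one common way to compute them from predicted presence scores. The function names, the 0.5 cutoff, and the four example values are illustrative assumptions, not values or methods taken from the study.

```python
def auc(scores, labels):
    """AUC as the chance that a random presence outscores a random absence (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def error_rate(scores, labels, cutoff=0.5):
    """Overall error rate after classifying scores at a fixed cutoff (assumed 0.5)."""
    wrong = sum((1 if s >= cutoff else 0) != y for s, y in zip(scores, labels))
    return wrong / len(labels)

# Made-up scores for four sites: two presences (1) and two absences (0).
scores = [0.9, 0.4, 0.6, 0.2]
labels = [1, 1, 0, 0]
print(auc(scores, labels))         # 0.75
print(error_rate(scores, labels))  # 0.5
```

AUC does not depend on the classification cutoff, whereas the overall error rate does, which is one reason the two measures are usually reported together.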