Estimating sampling sufficiency of network metrics using bootstrap
Sampling the full diversity of interactions in an ecological community is highly labor-intensive, and ecologists now recognize that most networks published to date are likely under-sampled. Moreover, recent studies have demonstrated that many network metrics are sensitive to sampling effort and network size. Here, we develop a bootstrap-based statistical framework for estimating sampling sufficiency for some of the most widely used network metrics, namely connectance, nestedness (NODF, the nestedness metric based on overlap and decreasing fill) and modularity. The framework is a resampling technique that generates confidence intervals for each network metric at increasing sample sizes (i.e., the number of interaction events sampled), which can be used to evaluate sampling sufficiency. Resampling the data with the bootstrap creates a frequency distribution of the network metric of interest for samples of increasing size, mimicking repeated sampling from the sampling universe. The sample is considered sufficient when the confidence limits stabilize or fall within an acceptable level of precision. We illustrate the framework with data from four quantitative networks of plants and frugivorous birds. Across the datasets, network size ranged from 16 to 115 species and from 17 to 2,745 interactions.
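The resampling procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the toy interaction events, the number of bootstrap replicates, and the percentile-based confidence limits are all assumptions made for the example. It also assumes that connectance is computed against the species counts of the full dataset, which may differ from the definition used in the study.

```python
import random

def connectance(events, n_plants, n_birds):
    """Fraction of possible plant-bird links observed at least once."""
    return len(set(events)) / (n_plants * n_birds)

def percentile(sorted_vals, q):
    """Simple nearest-rank percentile on an already sorted list (0 <= q <= 1)."""
    idx = int(round(q * (len(sorted_vals) - 1)))
    return sorted_vals[min(len(sorted_vals) - 1, max(0, idx))]

def bootstrap_curve(events, n_plants, n_birds, sizes, n_boot=1000, alpha=0.05, seed=0):
    """For each sample size, resample interaction events with replacement and
    return (size, mean, lower limit, upper limit) of the bootstrapped metric."""
    rng = random.Random(seed)
    curve = []
    for n in sizes:
        stats = sorted(
            connectance(rng.choices(events, k=n), n_plants, n_birds)
            for _ in range(n_boot)
        )
        mean = sum(stats) / n_boot
        lower = percentile(stats, alpha / 2)
        upper = percentile(stats, 1 - alpha / 2)
        curve.append((n, mean, lower, upper))
    return curve

# Hypothetical toy data: each tuple is one observed interaction event.
events = (
    [("p1", "b1")] * 10 + [("p1", "b2")] * 5 + [("p2", "b1")] * 3
    + [("p2", "b3")] + [("p3", "b2")]
)
curve = bootstrap_curve(events, n_plants=3, n_birds=3, sizes=[5, 10, 20], n_boot=200)
for n, mean, lower, upper in curve:
    print(n, round(mean, 3), round(lower, 3), round(upper, 3))
```

Any other metric (for example nestedness or modularity) could be substituted for the `connectance` function, provided it takes a list of resampled events and returns a single number.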
Our results indicate that, for the same dataset, sampling sufficiency can be reached at different sample sizes depending on the metric of interest. The bootstrap confidence limits for connectance and nestedness reached stability or an acceptable level of precision in three of the four analyzed networks, and those for modularity in two, so the sample was considered sufficient for these metrics in those networks. For the second smallest network, the average value of connectance generated by the bootstrap stabilized only above 100 interaction events, whereas the average value of nestedness stabilized after 30 events. This means that increasing the sample size above 30 interaction events did not add new information that could affect the estimate of nestedness. The smallest network did not reach sufficiency for any of the metrics, since its confidence limits remained wide and unstable with increasing sample size. The bootstrap method can be useful to empirical ecologists, since it shows the minimum number of interactions necessary to reach sampling sufficiency for a specific network metric. Our method is general enough to be applied to different types of metrics, which is an advantage over classical methods often used in ecological networks.
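One possible way to read a minimum sample size off a set of bootstrap confidence limits is sketched below. The decision rule (the interval width must stay at or below a chosen tolerance for every larger sample size), the function name, the tolerance, and the numbers in `toy_curve` are hypothetical and purely illustrative; they are not results or thresholds from this study.

```python
def minimum_sufficient_size(curve, max_width):
    """Return the smallest sample size from which the confidence interval
    width stays at or below max_width for all larger sizes, or None."""
    sufficient = None
    for n, lower, upper in sorted(curve):
        if upper - lower <= max_width:
            if sufficient is None:
                sufficient = n      # first size in the current stable run
        else:
            sufficient = None       # interval widened again: reset the run
    return sufficient

# Invented (sample size, lower limit, upper limit) triples for illustration.
toy_curve = [
    (10, 0.20, 0.60),
    (20, 0.30, 0.50),
    (30, 0.35, 0.45),
    (40, 0.36, 0.44),
    (50, 0.37, 0.44),
]
print(minimum_sufficient_size(toy_curve, max_width=0.12))  # prints 30
print(minimum_sufficient_size(toy_curve, max_width=0.01))  # prints None
```

A rule based on precision alone is only one option; stability could also be judged by how little the limits change between consecutive sample sizes.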