**Background/Question/Methods**

Overarching perspectives in statistical hypothesis testing and model evaluation (frequentist, information-theoretic, and Bayesian) have produced methods that vary widely in their underlying philosophy and intended purpose. Unfortunately, these distinctions are often poorly understood by ecologists, leading to model misapplication and misinterpretation. For example, many theoretical frameworks in ecology attempt to mathematically depict *particular* natural conditions, for instance “default” or “no effect” patterns. This is done because an explicit effect (often 0) can be specified for H_{0}, whereas H_{A} can define only “some effect” distinct from H_{0}. As a result, many ecologists have quantified the validity of such models/predictions by setting them as null hypotheses in frequentist significance tests. Frequentist significance tests, however, do not allow empirical confirmation of null (or alternative) hypotheses. Instead, under the conventional severe-falsificationist framework we “reject” or *provisionally* “fail to reject” H_{0}.

We can compare the perspectives of widely disparate statistical methods using the likelihood ratio test statistic, *X*^{2} (twice the difference between the H_{A} and H_{0} log-likelihoods). Under a conventional significance testing perspective, the line of demarcation between H_{0} and H_{A} is the critical value *X*^{2} = 1.96^{2}. For *AIC* and *BIC*, this line is *X*^{2} = 2 and *X*^{2} = log(*n*), respectively. For practical and comparative purposes, I. J. Good’s “Bayes/non-Bayes compromise” has the demarcation line *X*^{2} = [Φ^{-1}{1 – 0.25/(*n*)^{0.5}}]^{2}, where Φ^{-1}(*p*) denotes the probit function at probability *p*. These lines can be used to graphically demonstrate the divergence of the aforementioned methods as *n* increases.
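As a rough illustration (a sketch of my own, not from the abstract itself), the four demarcation lines can be computed as functions of *n*; the function name and the Python/SciPy framing are assumptions, and Good’s line is written on the *X*^{2} scale as the squared probit so that it is directly comparable to the 1.96^{2} significance threshold:

```python
import numpy as np
from scipy.stats import norm


def demarcation_lines(n):
    """Critical X^2 values separating H0 from HA at sample size n.

    Significance testing and AIC use fixed thresholds; BIC and Good's
    Bayes/non-Bayes compromise tighten the threshold as n grows.
    """
    return {
        "significance": 1.96**2,                     # alpha = 0.05 critical value, chi^2(1) scale
        "AIC": 2.0,                                  # penalty difference for one extra parameter
        "BIC": np.log(n),                            # log(n) penalty per parameter
        "Good": norm.ppf(1 - 0.25 / np.sqrt(n))**2,  # probit of 1 - 0.25/sqrt(n), squared
    }
```

Note that Good’s threshold coincides with the α = 0.05 significance line near *n* = 100 (both ≈ 3.84) and continues to grow thereafter, whereas the significance and *AIC* lines stay flat.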

By considering effect size, and thus defining the distribution of *X*^{2} under H_{A}, our approach can also be used to intuitively demonstrate the *strong consistency* of *BIC* and Good’s compromise in model selection. Strong consistency requires that, as sample size approaches infinity, the true model will be selected from a candidate set of models with probability one. Our approach can also be extended to consider the behavior of these metrics for models with widely differing numbers of parameters.
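A small Monte Carlo sketch (my own, under assumed normal data with known variance, not part of the abstract) can make the consistency contrast concrete: when H_{0} is in fact true, a fixed-threshold rule keeps rejecting it at a constant rate, while a threshold that grows with *n* rejects it ever more rarely:

```python
import numpy as np

rng = np.random.default_rng(1)


def rejection_rate(n, reps=2000):
    """Monte Carlo rate at which AIC- and BIC-style rules reject a TRUE null.

    Data are N(0, 1); testing H0: mu = 0 vs HA: mu free (sigma known),
    the likelihood ratio statistic is X^2 = n * xbar^2 ~ chi^2(1) under H0.
    """
    xbar = rng.normal(0.0, 1.0, (reps, n)).mean(axis=1)
    x2 = n * xbar**2
    return {"AIC": float((x2 > 2.0).mean()),       # fixed threshold
            "BIC": float((x2 > np.log(n)).mean())}  # threshold grows with n
```

The *AIC*-style rate stays near P(χ²₁ > 2) ≈ 0.16 regardless of *n*, while the *BIC*-style rate falls toward zero as log(*n*) grows, which is the sense in which *BIC* (and, similarly, Good’s compromise) can asymptotically confirm a true H_{0}.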

**Results/Conclusions**

The demarcation lines/surfaces for significance testing, *AIC*, *BIC*, and Good’s compromise represent locations along a conceptual continuum of hypothesis refutation/confirmation. The thresholds for *AIC* and significance testing do not increase with sample size; thus, for any fixed departure from H_{0}, these methods will reject H_{0} with probability approaching 1 as sample size grows large. These methods are therefore strongly *refutative*. On the other hand, *BIC* and Good’s compromise demand more evidence against H_{0} for rejection as sample size increases. These methods were intended to *confirm* the correct hypothesis, whether H_{A} or H_{0}.