COS 95-7 - Spectral analysis of sparse data via generalized local harmonic regression

Wednesday, August 9, 2017: 10:10 AM
C122, Oregon Convention Center
Timothy H. Keitt, Integrative Biology, The University of Texas at Austin, Austin, TX, Nathaniel Pope, Integrative Biology, University of Texas at Austin, Austin, TX and Alison Northup, Integrative Biology, University of Texas at Austin
Background/Question/Methods

Ecological data are often sparsely sampled in space and time, with irregular support and complex boundary conditions. Owing to the hierarchical structure of natural systems, it is often desirable to quantify scale-specific variation using spectral methods such as wavelets, and many examples exist in the ecological literature. Because wavelets are localized, they are well suited to asking when, where, and at what scales signals exhibit low or high variation or covariation. Drawbacks of traditional approaches to the wavelet transform are the requirement of regular sampling and, for maximum computational efficiency, a power-of-two number of samples. Ecological data rarely conform to these conditions, and as a result the data often require pre-processing such as interpolation and edge correction. This pre-processing can badly bias outcomes. Here, we propose the method of Generalized Local Harmonic Regression (GLHR), which retains many of the desirable features of the wavelet transform while providing greater flexibility and bias control. Like familiar spline interpolation techniques, GLHR is evaluated at user-specified locations, which may over- or under-sample the input signal. Adopting a model-based approach furthermore opens the door to a wide variety of techniques developed for conventional regression, such as model selection.
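
To make the idea of a local harmonic regression concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian kernel for the local weights, a single analysis frequency, and a Gaussian error model fitted by weighted least squares; the names `solve_linear`, `local_harmonic_fit`, `omega`, and `bandwidth` are hypothetical and chosen only for illustration. At a user-specified location `t0`, it regresses the irregularly sampled signal on an intercept, a cosine, and a sine, then reads the local amplitude from the two harmonic coefficients.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def local_harmonic_fit(t, y, t0, omega, bandwidth):
    """Kernel-weighted least-squares fit of y ~ b0 + b1*cos(omega*d) + b2*sin(omega*d),
    where d = t - t0. The Gaussian kernel is an assumption of this sketch."""
    xtwx = [[0.0] * 3 for _ in range(3)]
    xtwy = [0.0] * 3
    for ti, yi in zip(t, y):
        d = ti - t0
        w = math.exp(-0.5 * (d / bandwidth) ** 2)  # local weight
        row = (1.0, math.cos(omega * d), math.sin(omega * d))
        for j in range(3):
            xtwy[j] += w * row[j] * yi
            for k in range(3):
                xtwx[j][k] += w * row[j] * row[k]
    return solve_linear(xtwx, xtwy)  # [intercept, cosine coef, sine coef]


# Irregularly spaced sample times (deterministic jitter) and a noiseless test signal.
omega = 2.0 * math.pi / 5.0
t = [0.4 * i + 0.15 * math.sin(1.7 * i) for i in range(80)]
y = [1.0 + 2.0 * math.cos(omega * ti + 0.3) for ti in t]

# Evaluate at an arbitrary location; no interpolation onto a regular grid is needed.
intercept, b_cos, b_sin = local_harmonic_fit(t, y, t0=12.0, omega=omega, bandwidth=5.0)
amplitude = math.hypot(b_cos, b_sin)  # local amplitude at frequency omega
print(intercept, amplitude)
```

Because the fit operates directly on the observed sample times, the evaluation location `t0` can be any point, including one near the start or end of the record. Repeating the fit over a grid of locations and frequencies would give a wavelet-like time-frequency picture under the assumptions above.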

Results/Conclusions

We demonstrate that, under limiting conditions, GLHR is equivalent to the Morlet and Derivative-of-Gaussian wavelet transforms, and show that near boundaries and gaps in sampling GLHR exhibits less bias than the standard wavelet transform. When analyzing a periodic signal at peak power, we show that edge-induced bias can be entirely eliminated, thereby allowing analysis of short signals where edge effects predominate. We further develop a fully localized test for signal periodicity based on the likelihood ratio and show how local-likelihood estimation generalizes the approach to non-Gaussian error distributions, including binomial and Poisson. A challenge in applying GLHR to sparse data is numerical instability of the matrix inverse, which leads to very large output values. We implement several standard penalized regression estimators commonly used in generalized linear models and show that penalized local-likelihood recovers important signal properties in sparse, non-Gaussian data. We demonstrate the approach on a number of well-known ecological and climatic datasets and discuss a fully threaded implementation in R leveraging the high-performance Eigen C++ linear algebra library.
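
One plausible way to write down the quantities described above is sketched here. The abstract does not give the exact formulation, so everything in this block is an assumption made for illustration: the Gaussian kernel weights, the single-frequency harmonic predictor, the ridge form of the penalty, and all symbols (evaluation location t_0, angular frequency \omega, bandwidth s, link function g, error density f, penalty weight \lambda).

```latex
% Assumed kernel weights centered on the evaluation location t_0
w_i(t_0) = \exp\!\left(-\frac{(t_i - t_0)^2}{2 s^2}\right)

% Assumed local harmonic predictor on the link scale
g(\mu_i) = \beta_0 + \beta_1 \cos\bigl(\omega (t_i - t_0)\bigr)
                   + \beta_2 \sin\bigl(\omega (t_i - t_0)\bigr)

% Local log-likelihood; f may be Gaussian, binomial, or Poisson
\ell_{t_0}(\beta) = \sum_i w_i(t_0) \, \log f(y_i \mid \mu_i)

% Penalized local-likelihood estimate (ridge penalty on the harmonic terms only)
\hat{\beta}(t_0) = \arg\max_{\beta}
  \left\{ \ell_{t_0}(\beta) - \lambda \left(\beta_1^2 + \beta_2^2\right) \right\}

% Localized likelihood-ratio statistic for periodicity at t_0,
% comparing unpenalized fits with and without the harmonic terms
\Lambda(t_0) = 2 \left[ \ell_{t_0}(\hat{\beta}) - \ell_{t_0}(\hat{\beta}_{\mathrm{null}}) \right],
\qquad \hat{\beta}_{\mathrm{null}} :\ \beta_1 = \beta_2 = 0
```

Under this reading, the penalty keeps the harmonic coefficients bounded when few observations fall within the kernel window, which is the situation in which the unpenalized matrix inverse becomes unstable.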