Hosted on GitHub Pages — Theme by orderedlist
In these notes we cover wavelets, their application and how they relate to other signal decomposition approaches.
The idea of using frequency decomposition as a mode of image processing is found in many applications (e.g. compression, pattern recognition). This was discussed in the Fourier analysis lectures and is strongly related to the multiresolution image processing lecture.
Often, structures whose recognition and analysis are necessary for image-related task (e.g. recognition) are not of a constant scale. A critique of Fourier decomposition is that it allows for the identification of various frequencies but only at a global scale. This can be addressed through use of the windowed Fourier transform; however, localization remains a challenge since the required spatial and frequency samplings are fixed. A multiresolution decomposition offers a coarse-to-fine approach to representing image information. The Laplacian pyramid can be computed quickly, however, there is some redundancy in the information encoded in adjacent pyramid levels. Additionally, there is no orientation selectivity in the decomposition which can inconvenience pattern recognition/analysis. Wavelets, as presented in @mallat1989multifrequency, address these issues.
A wavelet transform is a decomposition onto translated and dilated copies of a function \(\psi(x)\) called a wavelet.
As defined by Morlet, the translated and dilated copies of \(\psi(x)\) are the family of functions \({\psi_s(x - u)}_{(s,u)\in\mathbb{R}^2}\) where \(\psi_s(x) = \sqrt{ s}\psi(sx)\), and the wavelet transform is is expressed by the inner-products
\[Wf(s,u) = \langle f(x), \psi_s(x-u) \rangle.\]These are details you may or may not care about but they offer some (historical) context.
The goal in signal compression is to minimize the amount of information stored without compromising perceptuability. To achieve this, compression is often preceeded by sub-band decomposition (breaking the signal up into frequency bands). This procedure is useful in mitigating the perceptual impact of digitization (sampling) and quantization (rounding) by exploiting knowedge of the perceptual system’s sensitivity to these errors in different frequency bands.
Quadrature mirror filters, @esteban1977application, are a pair of filters used to decompose a signal into two components. These filters, \(h_1\) and \(h_2\) are half-band mirror filters (i.e. \(h_1(x) = h_2(-x)\)) with Fourier transforms \(H_1(e^{j\omega T})\) and \(H_2(e^{\pi-\omega T})\) of equal magnitude. Since each component resulting from the decomposition occupies only half the bandwidth of the original signal, the Nyquist sampling rate of these components is only half that of the original signal.
A multiresolution (pyramidal) decomposition is a method of expressing a \(L^2\)-function as the limit of successive approximations with increasingly concentrated smoothing @daubechies1988orthonormal, the Laplacian pyramid is an example of this. Each level in the pyramid is an approximation or representation of the image at a different scale or resolution. For ease of computation it is common to use a log-scale pyramid (i.e. for an image of dimension \(N^2\) (even), each pyramid level \(j\) has dimension \(N^2*2^{-j}\), \(j>0\))
The construction of a multiresolution representation is formalized in @mallat1989multifrequency, the gist is given here for convenience. For a function \(f(x)\) its multiresolution approximation can be expressed as the sequence \((A_{2^j}f(x)){j\in\mathbb{Z}}\). Each approximation \(A{2^j}f(x)\) is the best approximation of \(f(x)\) in the vectorspace \(V_{2^j}\); \(A_{2^j}f(x)\in V_{2^j} \forall j\). In @mallat1989theory Mallat refers to the sequence \((V_{2^j}){j\in\mathbb{Z}}\) as the *multiresolution approximation* and asserts (in Theorem 1 @mallat1989theory) that for each multiresolution approximation there exists a unique scaling function \(\phi(x)\in L^2(\mathbb{R}^n)\) with \(\phi{2^j} = 2^j\phi(2^j x)\) so that \((\sqrt{2^{-j}}\phi_{2^j}(x - 2^j n)){n\in\mathbb{Z}}\) is an orthormal basis for \(V{2^j}\). Because of this, each approximation can be expressed in the decomposition
\[A_{2^j}f(x) = 2^{-j}\sum_{n\in\mathbb{Z}} \langle f(x), \phi_{2^j}(x - 2^j n)\rangle\, \phi_{2^j}(x - 2^j n) \\ = 2^{-j}\sum_{n\in\mathbb{Z}} (f(x)* \phi_{2^j}(-x)) \,\phi_{2^j}(x - 2^j n)\]The idea of representing a signal by the coefficients of its decomposition is not new to us (e.g. linear algebra, Fourier analysis, etc) but there are some interesting things going on with wavelets.