Wavelets

In these notes we cover wavelets, their application and how they relate to other signal decomposition approaches.

motivation

The idea of using frequency decomposition as a mode of image processing is found in many applications (e.g. compression, pattern recognition). This was discussed in the Fourier analysis lectures and is strongly related to the multiresolution image processing lecture.

Often, structures whose recognition and analysis are necessary for image-related task (e.g. recognition) are not of a constant scale. A critique of Fourier decomposition is that it allows for the identification of various frequencies but only at a global scale. This can be addressed through use of the windowed Fourier transform; however, localization remains a challenge since the required spatial and frequency samplings are fixed. A multiresolution decomposition offers a coarse-to-fine approach to representing image information. The Laplacian pyramid can be computed quickly, however, there is some redundancy in the information encoded in adjacent pyramid levels. Additionally, there is no orientation selectivity in the decomposition which can inconvenience pattern recognition/analysis. Wavelets, as presented in @mallat1989multifrequency, address these issues.

And now wavelets

what are wavelets?

A wavelet transform is a decomposition onto translated and dilated copies of a function \(\psi(x)\) called a wavelet.

As defined by Morlet, the translated and dilated copies of \(\psi(x)\) are the family of functions \({\psi_s(x - u)}_{(s,u)\in\mathbb{R}^2}\) where \(\psi_s(x) = \sqrt{ s}\psi(sx)\), and the wavelet transform is is expressed by the inner-products

\[Wf(s,u) = \langle f(x), \psi_s(x-u) \rangle.\]

where do they come from?

These are details you may or may not care about but they offer some (historical) context.

Quadrature mirror filters

The goal in signal compression is to minimize the amount of information stored without compromising perceptuability. To achieve this, compression is often preceeded by sub-band decomposition (breaking the signal up into frequency bands). This procedure is useful in mitigating the perceptual impact of digitization (sampling) and quantization (rounding) by exploiting knowedge of the perceptual system’s sensitivity to these errors in different frequency bands.

Quadrature mirror filters, @esteban1977application, are a pair of filters used to decompose a signal into two components. These filters, \(h_1\) and \(h_2\) are half-band mirror filters (i.e. \(h_1(x) = h_2(-x)\)) with Fourier transforms \(H_1(e^{j\omega T})\) and \(H_2(e^{\pi-\omega T})\) of equal magnitude. Since each component resulting from the decomposition occupies only half the bandwidth of the original signal, the Nyquist sampling rate of these components is only half that of the original signal.

Multiresolution decomposition

A multiresolution (pyramidal) decomposition is a method of expressing a \(L^2\)-function as the limit of successive approximations with increasingly concentrated smoothing @daubechies1988orthonormal, the Laplacian pyramid is an example of this. Each level in the pyramid is an approximation or representation of the image at a different scale or resolution. For ease of computation it is common to use a log-scale pyramid (i.e. for an image of dimension \(N^2\) (even), each pyramid level \(j\) has dimension \(N^2*2^{-j}\), \(j>0\))

The construction of a multiresolution representation is formalized in @mallat1989multifrequency, the gist is given here for convenience. For a function \(f(x)\) its multiresolution approximation can be expressed as the sequence \((A_{2^j}f(x)){j\in\mathbb{Z}}\). Each approximation \(A{2^j}f(x)\) is the best approximation of \(f(x)\) in the vectorspace \(V_{2^j}\); \(A_{2^j}f(x)\in V_{2^j} \forall j\). In @mallat1989theory Mallat refers to the sequence \((V_{2^j}){j\in\mathbb{Z}}\) as the *multiresolution approximation* and asserts (in Theorem 1 @mallat1989theory) that for each multiresolution approximation there exists a unique scaling function \(\phi(x)\in L^2(\mathbb{R}^n)\) with \(\phi{2^j} = 2^j\phi(2^j x)\) so that \((\sqrt{2^{-j}}\phi_{2^j}(x - 2^j n)){n\in\mathbb{Z}}\) is an orthormal basis for \(V{2^j}\). Because of this, each approximation can be expressed in the decomposition

\[A_{2^j}f(x) = 2^{-j}\sum_{n\in\mathbb{Z}} \langle f(x), \phi_{2^j}(x - 2^j n)\rangle\, \phi_{2^j}(x - 2^j n) \\ = 2^{-j}\sum_{n\in\mathbb{Z}} (f(x)* \phi_{2^j}(-x)) \,\phi_{2^j}(x - 2^j n)\]

why do we care about them?

The idea of representing a signal by the coefficients of its decomposition is not new to us (e.g. linear algebra, Fourier analysis, etc) but there are some interesting things going on with wavelets.

By construction a wavelet transform is a decomposition onto frequency bands with the same size on a log scale.
Wavelets form an orthonormal basis for \(L^2(\mathbb{R}^n)\) so the decomposition is not redundant.
The zero-crossing wavelet representation is translation equivariant

how do we use them

tags: Machine Perception

Christine Allen-Blanchette

CV

About

Reading group

Course materials

ML4PEARL