# Michael's Wiki

Sparse filtering is a feature learning algorithm introduced in "Sparse Filtering" (Ngiam et al., 2011). Sparse filtering may converge faster than sparse coding, ICA, and sparse autoencoders on higher-dimensional inputs.

#### The Sparse Filtering Algorithm

##### Execution

Execute by optimizing the objective function:

$$\text{minimize} \sum_{i=1}^{M} {\left\Vert \hat{f}^{(i)} \right\Vert}_1 = \sum_{i=1}^{M} {\left\Vert \frac{\tilde{f}^{(i)}}{\Vert \tilde{f}^{(i)} \Vert_2} \right\Vert}_1$$

The input is not raw data but a feature matrix: one column of feature activations per observation.
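Under the assumption that the feature matrix has one column per observation, the objective above can be sketched in NumPy (names, shapes, and the small `eps` guard against zero norms are illustrative choices, not taken from the paper's code):

```python
import numpy as np

def sparse_filtering_objective(F, eps=1e-8):
    """Sparse filtering objective for a feature matrix F of shape
    (n_features, n_examples), one column per observation.

    A sketch: shapes, names, and the eps guard are assumptions.
    """
    # Normalize each feature (row) to unit l2 norm across examples,
    # giving the f-tilde of the objective.
    F_tilde = F / (np.linalg.norm(F, axis=1, keepdims=True) + eps)
    # Normalize each example (column) onto the l2 ball, giving f-hat.
    F_hat = F_tilde / (np.linalg.norm(F_tilde, axis=0, keepdims=True) + eps)
    # Sum the l1 norms of the normalized examples.
    return np.abs(F_hat).sum()
```

A maximally sparse feature matrix (one active feature per example) scores lower under this objective than a uniformly active one, which is the behavior the optimization rewards.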

This describes a single layer of a feature-learning network. It can be applied to successive layers in a greedy, layer-wise manner, as described below, to train a deep neural net.
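The greedy layer-wise scheme can be pictured as follows. This is a sketch only: `train_layer` is a hypothetical stand-in (a real implementation would minimize the sparse filtering objective), while the soft absolute-value nonlinearity follows the paper; all shapes are assumptions.

```python
import numpy as np

def soft_abs(Z, eps=1e-8):
    """Soft absolute-value nonlinearity used in the paper."""
    return np.sqrt(Z ** 2 + eps)

def train_layer(X, n_features, seed=0):
    """Hypothetical stand-in: a real implementation would minimize the
    sparse filtering objective; here we just draw random weights."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_features, X.shape[0]))

# Greedy layer-wise training: fit layer 1, freeze it, compute its
# features, then fit layer 2 on those features, and so on.
X = np.random.default_rng(1).standard_normal((16, 50))  # 16 dims, 50 examples
W1 = train_layer(X, n_features=12)
H1 = soft_abs(W1 @ X)        # layer-1 features become layer-2 input
W2 = train_layer(H1, n_features=8, seed=2)
H2 = soft_abs(W2 @ H1)       # deepest features, shape (8, 50)
```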

Ngiam et al. run their experiments using an off-the-shelf optimizer.
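As a rough illustration, SciPy's general-purpose L-BFGS-B solver can play the role of such an off-the-shelf optimizer. This is a sketch, not the paper's setup: it relies on finite-difference gradients and a handful of iterations, and all names and shapes are assumptions; a practical implementation would supply analytic gradients.

```python
import numpy as np
from scipy.optimize import minimize

def objective(w_flat, X, n_features, eps=1e-8):
    """Sparse filtering objective over a flattened weight matrix
    (a sketch; shapes and eps are illustrative assumptions)."""
    W = w_flat.reshape(n_features, X.shape[0])
    F = np.sqrt((W @ X) ** 2 + eps)                    # soft absolute value
    F = F / np.linalg.norm(F, axis=1, keepdims=True)   # per-feature l2
    F = F / np.linalg.norm(F, axis=0, keepdims=True)   # per-example l2
    return F.sum()

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 100))       # 20 input dims, 100 examples
n_features = 10
w0 = rng.standard_normal(n_features * X.shape[0])

# L-BFGS-B with finite-difference gradients stands in for the paper's
# off-the-shelf solver; a few iterations only, purely illustrative.
res = minimize(objective, w0, args=(X, n_features),
               method="L-BFGS-B", options={"maxiter": 5})
```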

##### Interpretation
1. Normalize each feature for uniform activity across examples using its $l_2$ norm
2. Normalize each example onto the $l_2$-ball
3. Optimize the normalized features for sparseness using the $l_1$ norm

See the introductory publication for detailed derivations.

#### Sparsity Properties

Normalizing across features produces implicit competition between them: when one feature's activation grows, the normalized activations of the others shrink.
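A tiny numeric illustration of this competition (the values are made up): after per-example $l_2$ normalization, boosting one feature's raw activation shrinks the normalized activations of all the others.

```python
import numpy as np

def l2_normalize(f):
    """Project one example's feature vector onto the l2 ball."""
    return f / np.linalg.norm(f)

f_even = np.array([1.0, 1.0, 1.0])    # all features equally active
f_peaked = np.array([5.0, 1.0, 1.0])  # one feature dominates

# Once one feature dominates, the others' normalized shares drop.
print(l2_normalize(f_even))    # ~[0.577, 0.577, 0.577]
print(l2_normalize(f_peaked))  # ~[0.962, 0.192, 0.192]
```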

- **Population Sparsity**: each observation is represented by only a small number of active features.