To make it easier to understand, we start with a data point $\mathbf P$ in a $k$-dimensional space spanned by $k$ basis vectors $\mathbf V^k$. Naturally, we could write down the component decomposition of the point using the basis vectors,

This is immediately obvious to us since we have been dealing with rank 2 $(k, 1)$ basis vectors and we are basically talking about the $k$ coordinates for a point.

In fact, this point is represented by a matrix of rank 2 $(k, 1)$ given this basis.

If we assume a cartesian coordinate system, the basis are normal vectors.

These representations of the basis vectors can be joined together

Using these representations of the abstract vectors, we could represent the point $\mathbf P$ using

which is insanely trivial since the representation of basis in this case is an identity matrix.

Spherical Harmonics

NMF is talking about the same idea: coordinate transformations,

We will view $X_{n}^{\phantom{n}k}$ as a vertical stack of $n$ points:

For a point $i$, we have

Compare this with the decomposition of a point in cartesian coordinate system, this is a decomposition of each point on a basis spanned by $\mathbf W^i$.

Another View?

Nonnegative Matrix Factorization

NMF is a dimension reduction method that decomposes $X_{n}^{\phantom{n}k}$ using

while requiring the elements of the decomposition being nonnegative. But there are many possible decompositions! Then we require this approximation to be as accurate as possible.

How do we measure the approximations?

We use the Frobenius distance between the matrix $X_{n}^{\phantom{n}k}$ and $H_n^{\phantom{n}r} W_r^{\phantom{r}k}$,

So NMF will require this Frobenius distance to be minimal.

Why Does It Even Work?

Well, it doesn’t always work. We might have many different NMFs for one single matrix.

Compare with SVD

How do we find a proper decomposition using NMF? We put restrictions on it, such as penalties. Apart from this, we also need to choose the rank of the decomposition $r$ in Eq. ($\ref{eq-nmf}$). Even if we fixed all these problems, NMF is computation expensive.