A typical Neural Network Structure

For linear models, the target $Y$ (a number or a vector) is approximated by $$f(X) = W \cdot \hat{X},$$ where $\hat{X}= \begin{pmatrix}1\\X \end{pmatrix}$ is the input $X$ with a constant $1$ prepended, so that the first column of $W$ acts as the bias term. A natural way to extend the linear model is to compose it with a non-linear function $\sigma$: $$g(X) =\sigma (W \cdot \hat{X}).$$ Here, $\sigma$ is applied entrywise to the vector $W \cdot \hat{X}$. This extension gives us a
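The two formulas $f(X) = W \cdot \hat{X}$ and $g(X) = \sigma(W \cdot \hat{X})$ can be sketched in NumPy as follows. This is only an illustration: the helper names (`augment`, `linear_model`, `g`), the example values of `W` and `X`, and the choice of the logistic sigmoid for $\sigma$ are assumptions made here, since the text does not fix a particular non-linearity.

```python
import numpy as np

def augment(X):
    """Build X_hat = (1, X) by prepending a constant 1 to the input vector."""
    return np.concatenate(([1.0], X))

def sigmoid(z):
    """Logistic sigmoid, applied entrywise (an assumed choice for sigma)."""
    return 1.0 / (1.0 + np.exp(-z))

def linear_model(W, X):
    """f(X) = W . X_hat; the first column of W plays the role of the bias."""
    return W @ augment(X)

def g(W, X, sigma=sigmoid):
    """g(X) = sigma(W . X_hat), with sigma applied to each entry."""
    return sigma(linear_model(W, X))

# Hypothetical example: 2 outputs, 2 input features (so W has 1 + 2 columns).
W = np.array([[0.0, 1.0, -1.0],
              [0.5, 0.0,  2.0]])
X = np.array([2.0, 2.0])

f_out = linear_model(W, X)  # linear part, one value per row of W
g_out = g(W, X)             # same values passed entrywise through sigma
```

Because `sigma` is passed in as a parameter, any other entrywise function (for example `np.tanh`) can be substituted without changing the rest of the sketch.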


Model Loading