# Neural Networks

## Neural Network

*Figure: a typical neural network structure, showing the input dimension, the number of hidden layers, and the output dimension.*

For linear models, the target $Y$ (a number or a vector) is approximated by $$f(X) = W \cdot \hat{X},$$ where $\hat{X}= \begin{pmatrix}1\\X \end{pmatrix}$. A natural way to extend the linear model is to compose it with a non-linear function $\sigma$: $$g(X) =\sigma (W \cdot \hat{X}).$$ Here $\sigma$ is applied entry-wise to the vector $W \cdot \hat{X}$. This extension gives a single-layer neural network; the main goal of training is to approximate the weight matrix $W$.
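As a minimal sketch of the single-layer model $g(X) = \sigma(W \cdot \hat{X})$, the snippet below uses NumPy, with assumed dimensions (3 inputs, 2 outputs) and an assumed choice of $\sigma$ (the logistic sigmoid); the weights are random stand-ins for what training would produce.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical sizes: 3 input features, 2 outputs.
# First column of W multiplies the constant 1 prepended in X_hat (the bias).
W = rng.normal(size=(2, 4))

def sigma(z):
    """Entry-wise non-linearity; the logistic sigmoid is an assumed choice."""
    return 1.0 / (1.0 + np.exp(-z))

def g(X):
    """Single-layer network: sigma(W . X_hat), with X_hat = (1, X)."""
    X_hat = np.concatenate(([1.0], X))  # prepend 1 for the bias term
    return sigma(W @ X_hat)

y = g(np.array([0.5, -1.0, 2.0]))  # a 2-vector with entries in (0, 1)
```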

A multilayer neural network model with $k$ hidden layers is of the form $$X \mapsto (g_k \circ \cdots \circ g_2\circ g_1\circ g_0)(X),\qquad k \in \{0,1,2,3,4,5,\cdots\}$$ where each $g_i$ has the single-layer form above. The weight matrices $W_i$ associated with each layer are approximated in the training process.
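The composition $g_k \circ \cdots \circ g_0$ can be sketched by chaining single-layer maps; the dimensions ($3 \to 4 \to 4 \to 1$ for $k = 2$ hidden layers), the ReLU non-linearity, and the random weights below are all illustrative assumptions.

```python
import numpy as np

def sigma(z):
    """ReLU, an assumed choice of entry-wise non-linearity."""
    return np.maximum(z, 0.0)

def layer(W):
    """Build the map g_i(X) = sigma(W . X_hat), with X_hat = (1, X)."""
    def g(X):
        return sigma(W @ np.concatenate(([1.0], X)))
    return g

rng = np.random.default_rng(1)
# k = 2 hidden layers, dims 3 -> 4 -> 4 -> 1; each W_i has an extra
# column for the bias entry prepended to its input.
Ws = [rng.normal(size=(4, 4)),
      rng.normal(size=(4, 5)),
      rng.normal(size=(1, 5))]
layers = [layer(W) for W in Ws]

def network(X):
    """Apply g_0 first, then g_1, ..., then g_k."""
    for g in layers:
        X = g(X)
    return X

out = network(np.array([1.0, 2.0, 3.0]))  # a 1-vector
```

In practice the final layer often omits the non-linearity for regression targets; it is kept here only to match the composed form above.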

## Convolutional Neural Network

### Kernels

• Identity Kernel
$$\begin{pmatrix}0&0&0\\0&1&0\\0&0&0\end{pmatrix}$$
• Sharpening Kernel
$$\begin{pmatrix}0&-1&0\\-1&5&-1\\0&-1&0\end{pmatrix} =\begin{pmatrix}0&0&0\\0&1&0\\0&0&0\end{pmatrix} + \begin{pmatrix}0&-1&0\\-1&4&-1\\0&-1&0\end{pmatrix}$$ The kernel amplifies the difference between adjacent pixels.
• Edge Detection Kernel
$$\begin{pmatrix}-1&-1&-1\\-1&8&-1\\-1&-1&-1\end{pmatrix}$$
• Blur Kernel
$$\begin{pmatrix}1&2&1\\2&4&2\\1&2&1\end{pmatrix}$$ Often a Gaussian kernel times a constant; here the entries sum to $16$, so in practice the output is divided by $16$ to preserve overall brightness.
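Applying a kernel means sliding it over the image and taking entry-wise products with each patch. A minimal NumPy sketch (using "valid" padding, i.e. interior pixels only, and cross-correlation rather than flipped convolution, which coincides for these symmetric kernels):

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image; each output pixel is the sum of
    the entry-wise product of the kernel with the patch under it."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float)
sharpen = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=float)

image = np.arange(25, dtype=float).reshape(5, 5)  # toy 5x5 "image"
# The identity kernel reproduces the interior of the image unchanged;
# the sharpening kernel leaves constant regions unchanged (its entries
# sum to 1) and amplifies differences elsewhere.
```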