Calculus [3], Spring 2023

This page will mainly be some notes and examples that I hope will be useful to my students. For course information (schedule, homeworks, exams, etc.) please visit the departmental Calculus 3 webpage.

Office hours : MWF 10:00 - 11:00 WH335 (or by appointment)

Vectors and Geometry

Euclidean Spaces

General Spaces

A sphere with radius $r$ and center $C$ is by definition the set of all points that are $r$ units away from the center $C$: $$ \{x\in \R^n: \dist{x}{C}=r\}= \{x\in \R^n: \sum_{i=1}^n (x_i-c_i)^2=r^2\} $$ In particular, if the center is the origin $\vc{0}$, the sphere can also be thought of as the set of solutions to the equation $$ \sum_{i=1}^n x_i^2=r^2 $$
Surfaces and Solids
A quadratic curve in $\mathbb{R}^2$ is a curve of the form $$a x^2+2 b x y+c y^2+2 d x+2 f y+g=0 $$ After translation and rotation, the curve can be written in one of the following forms $$ax^2+by^2+c=0 \quad \vee \quad ax^2+by=0$$ In $\mathbb{R}^2$, what is each of the following?
  • $y=0$
    a line
  • $y<0$
    a half plane
  • $x^2-y=0$
    a parabola
  • $3x^2+2y=0$
    a parabola
  • $x^2+y^2-1=0$
    the unit circle
  • $x^2+y^2-1<0$
    the unit disk
  • $1 < x^2+y^2 < 2$
    an annulus
  • $3x^2+4y^2-1=0$
    an ellipse
  • $x^2-y^2+1=0$
    a hyperbola
A quadratic surface in $\mathbb{R}^3$ is a surface of the form $$a x^2+ b y^2+ c z^2+ d x y+ e x z+ f y z+ g x+ h y+ i z+ j=0 $$ After translation and rotation, the surface can be written in one of the following forms $$ax^2+by^2+cz^2+ d=0 \quad \vee \quad ax^2+by^2+cz=0$$ In $\mathbb{R}^3$, what is each of the following?
  • $y=1$
    a plane
  • $y<5$
    a half space
  • $x^2+y^2=4, \quad z=3$
    a circle at a height of $3$
  • $x^2+y^2=1$
    a cylinder
  • $3x^2+4y^2=5$
    an elliptic cylinder
  • $x^2+y=4$
    a parabolic cylinder
  • $x^2-y^2=4$
    a hyperbolic cylinder
  • $x^2+y^2+z^2=2$
    a sphere
  • $x^2+y^2+z^2\le2$
    a ball
  • $1\le x^2+y^2+z^2\le2$
    a ball with an empty core
  • $ax^2+by^2+cz^2-1=0$
    an ellipsoid (when $a,b,c>0$)
  • $x^2+2y^2-7z^2=0$
    a cone
  • $x^2+y^2-z^2=1$
    a hyperboloid of one sheet
  • $-3x^2-5y^2+7z^2=1$
    a hyperboloid of two sheets
  • $11x^2+13y^2-17z=0$
    an elliptic paraboloid
  • $x^2-y^2-2z=0$
    a hyperbolic paraboloid


Definition of a vector
A vector, $v$, can be characterized by any of the following:
Operations on vectors
$$ \begin{aligned} \text{Vector Addition :} && \vc{v}+ \vc{w} &=\left\langle v_{1}+w_{1}, v_{2}+w_{2}, \cdots,v_{n}+w_{n}\right\rangle \\ \text{Scalar Multiplication :} && c\vc{v} &=\left\langle cv_{1}, cv_{2}, \cdots,cv_{n}\right\rangle \\ \text{Norm :} && \left|\vc{v}\right| &=\sqrt{\sum_{i=1}^{n} v_{i}^{2}}\\ {\color{red}\textbf{Dot Product :}} && \vc{v}\cdot \vc{w} &=\sum_{i=1}^{n} v_{i} w_{i}\\ {\color{red} \textbf{Cross Product :}} && \vc{v}\times \vc{w} &=\left\langle v_{2} w_{3}-v_{3} w_{2}, v_{3} w_{1}-v_{1} w_{3}, v_{1} w_{2}-v_{2} w_{1}\right\rangle \end{aligned} $$
Vector addition and scalar multiplication turn $\mathbb{R}^n$ into a vector space over $\mathbb{R}$.

The dot product is a bilinear map from $\mathbb{R}^n\times\mathbb{R}^n$ to $\mathbb{R}$, and the cross product is only defined for $\mathbb{R}^3$. We will discuss these in more detail in the next sections.
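The operations listed above translate almost line by line into code. The following is a minimal sketch in plain Python, assuming vectors are stored as lists of floats; the helper names `add`, `scale`, `norm`, `dot`, and `cross` are illustrative and not part of the course material.

```python
import math

def add(v, w):
    # Vector addition, component by component
    return [vi + wi for vi, wi in zip(v, w)]

def scale(c, v):
    # Scalar multiplication
    return [c * vi for vi in v]

def norm(v):
    # Euclidean length: square root of the sum of squared components
    return math.sqrt(sum(vi * vi for vi in v))

def dot(v, w):
    # Dot product: sum of the products of matching components
    return sum(vi * wi for vi, wi in zip(v, w))

def cross(v, w):
    # Cross product, defined only for vectors in R^3
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

v = [1.0, 2.0, 2.0]
w = [3.0, 0.0, 4.0]
print(norm(v))      # 3.0
print(dot(v, w))    # 11.0
print(cross(v, w))  # [8.0, 2.0, -6.0]
```

Note that `dot(v, cross(v, w))` and `dot(w, cross(v, w))` both evaluate to zero here, as expected for a vector perpendicular to both inputs.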
Unit Vectors
A unit vector is a vector with length $1$.

The following are the standard basis vectors for $\mathbb{R}^3$, $$ \mathbf{i}=\langle 1,0,0\rangle \quad \mathbf{j}=\langle 0,1,0\rangle \quad \mathbf{k}=\langle 0,0,1\rangle $$
In an $n$-dimensional space, $n-1$ angles are needed to specify a direction.
For this reason, unit vectors are often a more convenient way to specify directions.
Given any nonzero vector $v$, we can decompose it into a direction part and a magnitude part as follows: $$\color{green} v = \overbrace{\frac{v}{\|v\|}}^{direction} \quad \overbrace{\|v\|}^{length} $$

Dot and Cross Products

Dot Product

By simple computation, we have the following properties:
If $\theta$ is the angle between the vectors $v,w$ $$ v \cdot w=|v||w| \cos \theta $$ Equivalently, $$ \cos \theta=\frac{ v \cdot w}{|v||w|} $$ It follows that, $v,w$ are perpendicular, i.e. $\theta=\pi/2$ , if and only if $$v\cdot w=0$$ By the Law of Cosines: $$ \|\mathbf{v}-\mathbf{w}\|^2=\|\mathbf{w}\|^2+\|\mathbf{v}\|^2-2\|\mathbf{v}\|\|\mathbf{w}\| \cos \theta $$ Now, observe that: $$ \begin{aligned} & \|\mathbf{v}-\mathbf{w}\|^2=(\mathbf{v}-\mathbf{w}) \cdot(\mathbf{v}-\mathbf{w}) \\ & =\mathbf{v} \cdot \mathbf{v}-2(\mathbf{v} \cdot \mathbf{w})+\mathbf{w} \cdot \mathbf{w} \\ & =\|\mathbf{v}\|^2+\|\mathbf{w}\|^2-2(\mathbf{v} \cdot \mathbf{w}) \\ & \end{aligned} $$ Equating these two expressions for $\|\mathbf{v}-\mathbf{w}\|^2$ gives: $$ \begin{aligned} \|\mathbf{w}\|^2+\|\mathbf{v}\|^2-2\|\mathbf{v}\|\|\mathbf{w}\| \cos \theta & =\|\mathbf{v}\|^2+\|\mathbf{w}\|^2-2(\mathbf{v} \cdot \mathbf{w}) \\ -2(\|\mathbf{v}\|\|\mathbf{w}\| \cos \theta) & =-2(\mathbf{v} \cdot \mathbf{w}) \\ \|\mathbf{v}\|\|\mathbf{w}\| \cos \theta & =\mathbf{v} \cdot \mathbf{w} \end{aligned} $$
When $v,w$ are unit vectors, $$ v \cdot w=\cos \theta $$ which is maximized when $\theta=0$, i.e. $v=w$.

Thus the dot product is often used to measure the similarity between two vectors.
Scalar projection of $\mathbf{a}$ onto $\mathbf{b}$: $$\operatorname{comp}_{\mathbf{b}} \mathbf{a}=\underbrace{\frac{\mathbf{b}}{|\mathbf{b}|}}_{direction} \cdot \mathbf{a}$$ This essentially gives the signed length of the projection.
In terms of the angle $\theta$ between $\mathbf{a}$ and $\mathbf{b}$, $$ \operatorname{comp}_{\mathbf{b}} \mathbf{a}=\|\mathbf{a}\|\cos(\theta) = \|\mathbf{a}\|\frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|\|\mathbf{b}\|} = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{b}\|} $$
Vector projection of $\mathbf{a}$ onto $\mathbf{b}:$ $$\operatorname{proj}_{\mathbf{b}} \mathbf{a}= \underbrace{\operatorname{comp}_{\mathbf{b}} \mathbf{a}}_{signed \ length}\quad \underbrace{\frac{\mathbf{b}}{|\mathbf{b}|}}_{direction}=\frac{\mathbf{b} \cdot \mathbf{a}}{|\mathbf{b}|^{2}} \mathbf{b}$$ The first equality is more meaningful, as it gives the length-direction decomposition.
For any non-zero constant $c$ $$\operatorname{proj}_{c\mathbf{b}} \mathbf{a}=\operatorname{proj}_{\mathbf{b}} \mathbf{a} $$ In other words, the projection of $\mathbf{a}$ onto $\mathbf{b}$ is independent of the length of $\mathbf{b}$; only the line spanned by $\mathbf{b}$ matters.
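As a quick numerical check of the projection formulas, here is a small sketch in plain Python. The function names `comp` and `proj` mirror the notation above but are otherwise assumptions made for this example.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def comp(a, b):
    # Scalar projection of a onto b: (a . b) / |b|
    return dot(a, b) / math.sqrt(dot(b, b))

def proj(a, b):
    # Vector projection of a onto b: ((a . b) / |b|^2) b
    k = dot(a, b) / dot(b, b)
    return [k * bi for bi in b]

a = [2.0, 3.0, 1.0]
b = [0.0, 4.0, 0.0]
print(comp(a, b))  # 3.0
print(proj(a, b))  # [0.0, 3.0, 0.0]
```

Replacing `b` by a nonzero multiple such as `[0.0, -8.0, 0.0]` flips the sign of `comp(a, b)` but leaves `proj(a, b)` numerically unchanged, which illustrates the scaling property stated above.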

The Cross Product

Given two vectors $u,v$, we want to find a vector $w$ that is perpendicular to both $u$ and $v$. This can be done by solving the following equations: $$u\cdot w = u_1w_1+u_2w_2+u_3w_3 = 0 $$ $$v\cdot w = v_1w_1+v_2w_2+v_3w_3 = 0 $$ Multiplying the first equation by $v_3$, the second by $u_3$, and subtracting eliminates $w_3$: $$(u_1v_3-u_3v_1)w_1+(u_2v_3-u_3v_2)w_2=0$$ One simple solution is to take $$w_1=u_2v_3-u_3v_2,\ w_2 = u_3v_1-u_1v_3, \ w_3=u_1v_2-u_2v_1 $$ This solution gives the cross product of $u$ and $v$.
Computational Properties
The above formula is often not very intuitive and is difficult to recall. Instead, one often uses the Laplace expansion for determinants: $$\begin{aligned}\mathbf{u} \times \mathbf{v}=\left|\begin{matrix}\mathbf{i}&\mathbf{j}&\mathbf{k}\\ u_1 & u_2 & u_3\\ v_1 & v_2 & v_3\end{matrix}\right| &= \mathbf{i} \left|\begin{matrix} u_2&u_3\\v_2&v_3 \end{matrix} \right|- \mathbf{j} \left|\begin{matrix} u_1&u_3\\v_1&v_3 \end{matrix} \right|+ \mathbf{k} \left|\begin{matrix} u_1&u_2\\v_1&v_2 \end{matrix} \right|\\&=\left(u_2v_3-u_3v_2,u_3v_1-u_1v_3,u_1v_2-u_2v_1\right)\end{aligned} $$ One can check that the following properties hold: if $\mathbf{a}, \mathbf{b}$, and $\mathbf{c}$ are vectors and $c$ is a scalar, then
Geometric Properties
  1. If $\theta$ is the angle between $\mathbf{a}$ and $\mathbf{b}$, then the length of the cross product $\mathbf{a} \times \mathbf{b}$ is given by $$ |\mathbf{a} \times \mathbf{b}|=|\mathbf{a}||\mathbf{b}| \sin \theta $$ And $|\mathbf{a} \times \mathbf{b}|$ gives the area of the parallelogram determined by the two vectors.
  2. Two nonzero vectors $\mathbf{a}$ and $\mathbf{b}$ are parallel if and only if $$ \mathbf{a} \times \mathbf{b}=\mathbf{0} $$
  3. The volume of the parallelepiped determined by the vectors $\mathbf{a}, \mathbf{b}$, and $\mathbf{c}$ is given by: $$ V=|\mathbf{a} \cdot(\mathbf{b} \times \mathbf{c})| $$
  1. $$ \begin{aligned} |\mathbf{a} \times \mathbf{b}|^2 = & \left(a_1^2+a_2^2+a_3^2\right)\left(b_1^2+b_2^2+b_3^2\right)-\left(a_1 b_1+a_2 b_2+a_3 b_3\right)^2 \\ = & |\mathbf{a}|^2|\mathbf{b}|^2-(\mathbf{a} \cdot \mathbf{b})^2 \\ = & |\mathbf{a}|^2|\mathbf{b}|^2-|\mathbf{a}|^2|\mathbf{b}|^2 \cos ^2 \theta \\ = & |\mathbf{a}|^2|\mathbf{b}|^2\left(1-\cos ^2 \theta\right) \\ = & |\mathbf{a}|^2|\mathbf{b}|^2 \sin ^2 \theta \end{aligned} $$

Lines and Planes

Lines can be represented in various forms
  1. Vector Equation : $$r = \underbrace{r_0}_{placement}+t\underbrace{v}_{direction}$$
  2. Parametric Equations : $$x=x_{0}+a t \quad y=y_{0}+b t \quad z=z_{0}+c t$$
  3. Symmetric Equations :$$\frac{x-x_{0}}{a}=\frac{y-y_{0}}{b}=\frac{z-z_{0}}{c}$$
Line Segments
Given two points $v$ and $w$, the line segment between $v$ and $w$ can be represented by $$r(t)=w(1-t)+vt, \qquad t\in[0,1]$$ As $t$ moves from $0$ to $1$, the point $r(t)$ moves from $w$ to $v$.
To characterize a plane, we need a point $p_0$ on the plane and a normal vector $n$. Once we have $p_0$ and $n$, we can represent the plane in various ways
  1. Vector Equation :$$n \cdot \underbrace{(p-p_0)}_{\text{a vector parallel to the plane}}=0$$
  2. Point-Normal Form : $$a\left(x-x_{0}\right)+b\left(y-y_{0}\right)+c\left(z-z_{0}\right)=0$$
  3. General Form : $$a x+b y+c z+d=0$$
Consider a plane with a point $p_0$ on it and normal vector $n$. For any point $p$, the distance between $p$ and the plane is given by $$D= \underbrace{|\text{comp}_{n} (p-p_0)|}_{\text{length of the projection onto normal vector $n$}} = |\underbrace{\frac{n }{|n|}}_{unit\ normal}\cdot (p-p_0)|$$ If $p = (x_1,y_1,z_1)$ and the plane has the general form $ax+by+cz+d=0$, then the distance between $p$ and the plane simplifies to $$D=\frac{\left|a x_{1}+b y_{1}+c z_{1}+d\right|}{\sqrt{a^{2}+b^{2}+c^{2}}}$$
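The two distance formulas above should agree on any example. The sketch below checks this in plain Python for the plane $2x+2y+z-3=0$, which was chosen for illustration only; the function names are hypothetical.

```python
import math

def distance_general_form(p, a, b, c, d):
    # |a x1 + b y1 + c z1 + d| / sqrt(a^2 + b^2 + c^2)
    x1, y1, z1 = p
    return abs(a * x1 + b * y1 + c * z1 + d) / math.sqrt(a * a + b * b + c * c)

def distance_point_normal(p, p0, n):
    # |(n / |n|) . (p - p0)|, the length of the projection onto the normal
    diff = [pi - qi for pi, qi in zip(p, p0)]
    n_len = math.sqrt(sum(ni * ni for ni in n))
    return abs(sum(ni * di for ni, di in zip(n, diff))) / n_len

# Example plane 2x + 2y + z - 3 = 0 with normal (2, 2, 1);
# the point p0 = (0, 0, 3) lies on it.
p = (1.0, 1.0, 5.0)
print(distance_general_form(p, 2.0, 2.0, 1.0, -3.0))                # 2.0
print(distance_point_normal(p, (0.0, 0.0, 3.0), (2.0, 2.0, 1.0)))   # 2.0
```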

Single Variable Functions


R--> R

R--> R^n

Space Curve

For vector-valued functions, we will only consider the case where the domain lives in $\mathbf{R}$, i.e. functions of the form $$v(t)= \left(v_1(t),v_2(t),v_3(t)\right),\qquad t\in D \subset \mathbf{R}$$ The graph of $v(t)$ lives in $\mathbf{R}^4$. For the purpose of visualizing the function, we project the graph onto $\mathbf{R}^3$ along the $t$-axis to get the space curve. What is the space curve for each of the following?
  1. $$v(t) = (1+t,5t+2,t)$$
  2. $$v(t) = \cos(t) \mathbf{i}+\sin(t) \mathbf{j}+ t\mathbf{k}$$
Some familiar concepts from Calculus 1 can be extended easily to vector functions.
The limit of a vector function is defined coordinate-wise $$\lim_{t\to a} v(t) = \left(\lim_{ t\to a}v_1(t),\lim_{ t\to a}v_2(t),\lim_{ t\to a}v_3(t) \right)$$ And $v(t)$ is continuous at $a$ if $$\lim_{t\to a}v(t) = v(a)$$ The derivative or tangent vector of a vector function is defined coordinate-wise $$\frac{dv(t)}{dt}=\left(\frac{dv_1(t)}{dt},\frac{dv_2(t)}{dt},\frac{dv_3(t)}{dt}\right)$$
Differentiation Rules Suppose $\mathbf{u}$ and $\mathbf{v}$ are differentiable vector functions, $c$ is a scalar, and $f$ is a real-valued function. Then
  1. $$\frac{d}{d t}[\mathbf{u}(t)+\mathbf{v}(t)]=\mathbf{u}^{\prime}(t)+\mathbf{v}^{\prime}(t)$$
  2. $$\frac{d}{d t}[c \mathbf{u}(t)]=c \mathbf{u}^{\prime}(t)$$
  3. $$\frac{d}{d t}[f(t) \mathbf{u}(t)]=f^{\prime}(t) \mathbf{u}(t)+f(t) \mathbf{u}^{\prime}(t)$$
  4. $$\frac{d}{d t}[\mathbf{u}(t) \cdot \mathbf{v}(t)]=\mathbf{u}^{\prime}(t) \cdot \mathbf{v}(t)+\mathbf{u}(t) \cdot \mathbf{v}^{\prime}(t)$$
  5. $$\frac{d}{d t}[\mathbf{u}(t) \times \mathbf{v}(t)]=\mathbf{u}^{\prime}(t) \times \mathbf{v}(t)+\mathbf{u}(t) \times \mathbf{v}^{\prime}(t)$$
  6. $$\frac{d}{d t}[\mathbf{u}(f(t))]=f^{\prime}(t) \mathbf{u}^{\prime}(f(t)) $$
The integral of a vector function is defined coordinate-wise $$\int_{a}^{b}v(t)dt=\left(\int_{a}^{b}v_1(t)dt,\int_{a}^{b}v_2(t)dt,\int_{a}^{b}v_3(t)dt\right)$$ The arc length of $v(t)$ from $t=a$ to $t=b$ is given by $$s(a,b) = \int_{a}^{b}\underbrace{|v'(t)|}_{\text{length of tangent vector}}dt $$ Fixing an initial point $a$, we can define the arc length function $$s(t) = \int_{a}^{t}|v'(u)|du$$ When $v'(u)\ne 0$, this is a strictly increasing function and thus has an inverse; denote it by $t:s\mapsto t(s)$. Then $$v(t(s))$$ gives a parametrization of the curve with respect to arc length.
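To see the arc length formula in action, the following sketch approximates the length of one turn of the helix $v(t)=(\cos t,\sin t,t)$ in plain Python. It assumes a midpoint Riemann sum for the integral and a central difference for $|v'(t)|$; both choices, and the function names, are illustrative rather than prescribed by the notes. The exact value is $2\pi\sqrt{2}$ because $|v'(t)|=\sqrt{2}$.

```python
import math

def helix(t):
    return (math.cos(t), math.sin(t), t)

def speed(curve, t, h=1e-6):
    # Central-difference estimate of |v'(t)|
    p, q = curve(t - h), curve(t + h)
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q))) / (2 * h)

def arc_length(curve, a, b, n=1000):
    # Midpoint Riemann sum of |v'(t)| over [a, b]
    dt = (b - a) / n
    return sum(speed(curve, a + (k + 0.5) * dt) for k in range(n)) * dt

length = arc_length(helix, 0.0, 2 * math.pi)
exact = 2 * math.pi * math.sqrt(2)
print(length, exact)  # the two values should agree to several decimal places
```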

Motion in Space

Say we have a particle in space, with position represented as a function of $t$ $$p(t)$$ Then the velocity of the particle is given by $$v(t) = p'(t)$$ and the speed is given by the norm of the velocity $$speed(t) = |v(t)|$$ Integrating speed over time gives the distance traveled by the particle $$s(a,b) = \int_{a}^{b}speed(t)\ dt$$ Taking the derivative of the velocity gives the acceleration $$a(t) = v'(t) = p''(t)$$

Bridge between Cal 1 and Cal 3

A typical Calculus 1 class deals with functions of the form $$f: \mathbb{R}\to \mathbb{R}$$ For the geometric problems in Calculus 1, it is often more convenient to view the function as a vector function $$v: \mathbb{R}\to \mathbb{R}^2$$ $$ x\mapsto (x,f(x))$$ The tangent vector of $v$ at $t$ $$v'(t) = \left(1,f'(t)\right)$$ gives the direction of the tangent line for $f$ at $t$.

R^n --> R

Limits and Continuity

Contour Plot

We will mainly deal with functions of the form $$f: \mathbb{R}^2\to \mathbb{R}$$ The extension of most results to $f: \mathbb{R}^n\to \mathbb{R}$ is straightforward. The graph of a function $(x,y) \mapsto z$ lives in 3-dimensional space. One way to visualize the function without requiring too much computational power is to consider the level curves. A level curve of a function $f:\mathbb{R}^2\to \mathbb{R}$ is a curve with equation $$f(x,y)=k$$ where $k$ is a constant, the level of that particular curve. The collection of level curves gives a contour plot of the function.


The limit of a function $f:\R^n\to \R$ with domain $D$ at the point $\vc{v} $ is $L$, written as $$ \lim _{\vc{x} \rightarrow \vc{v}} f(\vc{x})=L $$ if for every number $\varepsilon>0$ there is a corresponding number $\delta>0$ such that for all $ \vc{x} \in D$, $$0 < |\vc{x}-\vc{v}|<\delta \implies |f(\vc{x})-L|<\varepsilon$$ The function is continuous at $\vc{v}$ if $$\lim_{\vc{x}\to \vc{v} }f(\vc{x})=f(\vc{v})$$ When the limits of $f$ and $g$ exist, the usual limit laws hold: $$\lim f \pm \lim g = \lim (f\pm g)$$ $$\lim cf = c \lim f$$ $$\lim (f g)= (\lim f )( \lim g )$$ $$\lim (f/g)= (\lim f )/( \lim g ),\qquad \lim g \ne 0$$


Directional Derivatives

To talk about the rates of change of a function with respect to a given direction, we define the directional derivatives.
The directional derivative of $f$ at $\left(x_{0}, y_{0}\right)$ in the direction of a unit vector $\mathbf{u}=( a, b)$ is $$ D_{\mathrm{u}} f\left(x_{0}, y_{0}\right)=\lim _{h \rightarrow 0} \frac{f\left(x_{0}+h a, y_{0}+h b\right)-f\left(x_{0}, y_{0}\right)}{h} $$ if this limit exists.

When $\mathbf{u} = (1,0)$, the directional derivative is the partial derivative with respect to $x$ $$D_{(1,0)} f(x,y) = f_x(x,y) = f_1(x,y)$$ When $\mathbf{u} = (0,1)$, the directional derivative is the partial derivative with respect to $y$ $$D_{(0,1)} f(x,y) = f_y(x,y) = f_2(x,y)$$ For higher order partial derivatives, we use the following notations $$ f_{xy} := (f_x )_y \qquad f_{xx} := (f_x )_x$$
Under mild conditions, higher order partials are independent of the order of differentiation. Suppose $f$ is defined on a disk $D$ that contains the point $(a, b)$.
If the functions $f_{x y}$ and $f_{y x}$ are both continuous on $D$, then $$ f_{x y}(a, b)=f_{y x}(a, b) $$
Partial derivatives are all that we need: if $f$ is a differentiable function of $x$ and $y$,
then $f$ has a directional derivative in the direction of any unit vector $\mathbf{u}=\langle a, b\rangle$ and $$ D_{\mathrm{u}} f(x, y)=f_{x}(x, y) a+f_{y}(x, y) b $$
Consider $$g: h\mapsto f( x+ah,y+bh)$$ By the chain rule $$\frac{dg}{dh}=f_1 \frac{d(x+ah)}{dh}+f_2\frac{d(y+bh)}{dh} = af_1+bf_2 $$ where $f_1$ and $f_2$ are evaluated at $(x+ah,y+bh)$. Conclude by noting that $$ g'(0)=D_{(a,b)}f(x,y) $$
Note that $$ D_{\mathrm{u}} f(x, y)=f_{x}(x, y) a+f_{y}(x, y) b = (f_x,f_y)\cdot \mathrm{u}$$ This motivates the following definition. If $f$ is a function of two variables $x$ and $y$, then the gradient of $f$ is the vector function $\nabla f$ defined by $$ \nabla f(x, y)=\left( f_{x}(x, y), f_{y}(x, y)\right) $$
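The identity $D_{\mathbf{u}} f=\nabla f\cdot \mathbf{u}$ can be checked numerically. The sketch below, in plain Python, compares the difference quotient from the definition (with a small fixed $h$ instead of a limit) against the gradient formula. The test function $f(x,y)=x^2y+y^3$ and its hand-computed gradient are assumptions made for this example.

```python
def f(x, y):
    return x * x * y + y ** 3

def grad_f(x, y):
    # Partial derivatives computed by hand: f_x = 2xy, f_y = x^2 + 3y^2
    return (2 * x * y, x * x + 3 * y * y)

def directional_quotient(f, x0, y0, u, h=1e-6):
    # Difference quotient from the definition, with small h in place of the limit
    a, b = u
    return (f(x0 + h * a, y0 + h * b) - f(x0, y0)) / h

u = (0.6, 0.8)                      # a unit vector
fx, fy = grad_f(1.0, 2.0)           # (4.0, 13.0)
via_gradient = fx * u[0] + fy * u[1]
via_quotient = directional_quotient(f, 1.0, 2.0, u)
print(via_gradient, via_quotient)   # both close to 12.8
```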

Chain Rule

We demonstrate the chain rule with examples of increasing complexity.
  1. $$t\mapsto f(x(t),y(t) )$$ $$ \frac{d f}{d t}=\frac{\partial f}{\partial x} \frac{d x}{d t}+\frac{\partial f}{\partial y} \frac{d y}{d t} $$
  2. $$s,t\mapsto f(x(s,t),y(s,t) )$$ $$\frac{\partial f}{\partial s}=\frac{\partial f}{\partial x} \frac{\partial x}{\partial s}+\frac{\partial f}{\partial y} \frac{\partial y}{\partial s},\quad \frac{\partial f}{\partial t}=\frac{\partial f}{\partial x} \frac{\partial x}{\partial t}+\frac{\partial f}{\partial y} \frac{\partial y}{\partial t}$$
  3. $$t_1,t_2,\cdots, t_n \mapsto f(x_1(t_1,\cdots,t_n),\cdots,x_m(t_1,\cdots,t_n) )$$ $$\frac{\partial f}{\partial t_{i}}=\frac{\partial f}{\partial x_{1}} \frac{\partial x_{1}}{\partial t_{i}}+\frac{\partial f}{\partial x_{2}} \frac{\partial x_{2}}{\partial t_{i}}+\cdots+\frac{\partial f}{\partial x_{m}} \frac{\partial x_{m}}{\partial t_{i}}$$
As a corollary, we get an alternative way of doing implicit differentiation. Assume $F(x,y)=0$ defines $y$ implicitly as a differentiable function of $x$ and $F_y\ne 0$. Then $$\frac{d y}{d x}=-\frac{F_{x}}{F_{y}}$$ Indeed, differentiating both sides of $F(x,y)=0$ with respect to $x$ gives $$ 0=\frac{d}{d x} F(x,y)=F_{x}(x,y)+F_{y}(x,y)\frac{d y}{d x} \implies \frac{d y}{d x}=-\frac{F_{x}}{F_{y}} $$


Linear Approximation
Behavior of a function near a point is approximately linear.
Suppose $f$ has continuous partial derivatives.
An equation of the tangent plane to the surface $z=f(x, y)$ at the point $P\left(x_{0}, y_{0}, z_{0}\right)$ is given by $$ \underbrace{z-z_{0}}_{dz}=f_{x}\left(x_{0}, y_{0}\right)\underbrace{\left(x-x_{0}\right)}_{dx}+f_{y}\left(x_{0}, y_{0}\right)\underbrace{\left(y-y_{0}\right)}_{dy} $$
Rewriting the equation for the tangent plane at $(x_0,y_0,z_0)$ with $z_0=f(x_0,y_0)$,
we get the linear approximation of $f$ at $(x_0,y_0)$, $$f(x, y) \approx f(x_0, y_0)+f_{x}(x_0, y_0)(x-x_0)+f_{y}(x_0, y_0)(y-y_0)$$
Gradient Ascent / Descent
  1. $\frac{\nabla f}{|\nabla f|}$ gives the direction with the maximum directional derivative of $f$.
  2. $-\frac{\nabla f}{|\nabla f|}$ gives the direction with the minimum directional derivative of $f$.
  3. The gradient at $\mathbf{v}$, $\nabla f(\mathbf{v})$, is perpendicular to the level curve / surface $f(\mathbf{x})=f(\mathbf{v})$
  1. Given $\mathbf{u}$ a unit vector, note that the following dot product $$D_\mathbf{u} f=\nabla f\cdot \mathrm{u}=|\nabla f | |u| \cos \theta $$ is maximized when $\cos \theta = 1$, i.e. $u$ is in the same direction as $\nabla f$, i.e. $$ \mathrm{u} = \frac{\nabla f}{|\nabla f|} $$ Similarly, the dot product is minimized when $$ \mathrm{u} = -\frac{\nabla f}{|\nabla f|} $$
  2. Let $r(t)$ be a parametric curve living on the level surface, then $$ f(r(t))=f(\mathbf{v})$$ using Chain rule $$f_{1} \frac{dx_1}{dt}+\cdots+f_{n} \frac{dx_n}{dt}=\nabla f \cdot r'(t)=0$$
A physical interpretation: the level surfaces can be viewed as stable planes. The best direction to "jump toward" in order to escape the gravity of a stable plane is the perpendicular direction, and this is the direction of the gradient, since the gradient is perpendicular to the stable plane.
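The fact that $-\nabla f$ points in the direction of steepest decrease is the basis of gradient descent. The following is a minimal sketch in plain Python, not a production optimizer: it takes fixed-size steps along $-\nabla f$ (without normalizing the gradient) for the illustrative function $f(x,y)=(x-1)^2+2(y+2)^2$, whose minimum is at $(1,-2)$. The step size and iteration count are assumptions chosen so that this particular example converges.

```python
def f(x, y):
    return (x - 1) ** 2 + 2 * (y + 2) ** 2

def grad_f(x, y):
    # Gradient computed by hand
    return (2 * (x - 1), 4 * (y + 2))

def gradient_descent(grad, start, step=0.1, iterations=200):
    x, y = start
    for _ in range(iterations):
        gx, gy = grad(x, y)
        # Move a small step in the direction of steepest decrease
        x, y = x - step * gx, y - step * gy
    return x, y

x_min, y_min = gradient_descent(grad_f, (0.0, 0.0))
print(x_min, y_min)  # close to (1, -2)
```

Flipping the sign of the update (stepping along $+\nabla f$) gives gradient ascent, which seeks a maximum instead.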


Derivative Tests

First Derivative
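If $f$ has a local maximum or minimum at $(a,b)$ and the first-order partial derivatives of $f$ exist there, then $$f_{x}(a, b)=f_{y}(a, b)=0$$ When $f$ is differentiable at $(a,b)$, the conditions $f_{x}(a, b)=0$ and $f_{y}(a, b)=0$ imply that $$D_u f(a,b) =0 \qquad \forall u$$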
Second Derivative
Suppose that
  • the second partial derivatives of $f$ are continuous on a disk with center $(a, b)$,
  • and that $f_{x}(a, b)=0$ and $f_{y}(a, b)=0$.
Let $$ D=D(a, b)=f_{x x}(a, b) f_{y y}(a, b)-\left[f_{x y}(a, b)\right]^{2} $$
  1. If $D>0$ and $f_{x x}(a, b)>0$, then $f(a, b)$ is a local minimum.
  2. If $D>0$ and $f_{x x}(a, b)<0$, then $f(a, b)$ is a local maximum.
  3. If $D<0$, then $(a, b)$ is a saddle point of $f$.
  4. If $D = 0$, the test is inconclusive: $(a,b)$ could be a local minimum, a local maximum, or a saddle point.
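The four cases of the test can be written as a short decision procedure. The sketch below is plain Python and assumes the second partial derivatives at the critical point have already been computed by hand; the function name `classify_critical_point` is hypothetical.

```python
def classify_critical_point(fxx, fyy, fxy):
    # D = f_xx f_yy - (f_xy)^2, evaluated at the critical point
    D = fxx * fyy - fxy ** 2
    if D > 0:
        # D > 0 forces f_xx to be nonzero, so one of these two cases applies
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"

# Second partials at the origin for a few standard examples
print(classify_critical_point(2.0, 2.0, 0.0))    # x^2 + y^2   -> local minimum
print(classify_critical_point(-2.0, -2.0, 0.0))  # -x^2 - y^2  -> local maximum
print(classify_critical_point(2.0, -2.0, 0.0))   # x^2 - y^2   -> saddle point
print(classify_critical_point(0.0, 0.0, 0.0))    # x^4 + y^4   -> inconclusive
```

The last example shows why the case $D=0$ needs separate analysis: $x^4+y^4$ has a minimum at the origin, but the test cannot detect it.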
We will prove the more general case of $f: \mathbb{R}^n \to \mathbb{R}$, and this will follow as a special case.
In the more general case of $f: \mathbb{R}^n \to \mathbb{R}$, the second derivative test can be generalized. The Hessian $H_f$ of $f: \mathbb{R}^n \to \mathbb{R}$ is the matrix of second partial derivatives of $f$, more explicitly $$(H_f)_{ij}=f_{x_i x_j} $$ A symmetric matrix $A$ is positive definite if $$ x^TAx > 0 \quad \forall x \ne 0 $$ or equivalently $$ Ax=\lambda x \implies \lambda > 0 $$ and positive semidefinite if the same conditions hold with $\ge$ in place of $>$. Negative definite and negative semidefinite matrices are defined similarly. Let $f:\mathbb{R}^n\to\mathbb{R} $. If $f\in C^2$, then at a critical point $x_0$,
  • if $H_f(x_0)$ is positive definite, then $x_0$ is a local minimum,
  • if $H_f(x_0)$ is negative definite, then $x_0$ is a local maximum,
  • if $H_f(x_0)$ has both positive and negative eigenvalues, then $x_0$ is a saddle point,
  • otherwise, the test is inconclusive.
$$ \begin{aligned} y=f(\mathbf{x}+\Delta \mathbf{x}) = f(\mathbf{x})+\nabla f(\mathbf{x})^{\mathrm{T}} \Delta \mathbf{x}+\frac{1}{2} \Delta \mathbf{x}^{\mathrm{T}} \mathbf{H}(\mathbf{x}) \Delta \mathbf{x} + \O{\|\Delta \mathbf{x}\|^{3}} \end{aligned}$$
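  • If $x_0$ is a critical point, then $\nabla f(x_0)=0$, so the linear term in the expansion of $f(x_0+\Delta \mathbf{x})$ vanishes,
  • and $\mathbf{H}(x_0)$ being positive definite implies $$\Delta \mathbf{x}^{\mathrm{T}} \mathbf{H}(x_0) \Delta \mathbf{x}>0$$ for all $\Delta \mathbf{x}\ne \vec{0}$, so $f(x_0+\Delta \mathbf{x})>f(x_0)$ for all sufficiently small $\Delta \mathbf{x}\ne \vec{0}$.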
Global optimum
To guarantee a global optimum, we can either impose a boundedness condition on the domain, or require that the function is convex / concave.
If $f$ is continuous on a closed, bounded set $D$ in $\mathbb{R}^{2}$,

then there exists a point $\left(x_{0}, y_{0}\right)$ in $D$ such that $$ f(x_0,y_0) \le f(x,y) \quad \forall (x,y) \in D$$ and there exists a point $\left(x_{1}, y_{1}\right)$ in $D$ such that $$ f(x_1,y_1) \ge f(x,y) \quad \forall (x,y) \in D$$
A function $f:\mathbb{R}^n\to\mathbb{R}$ with continuous second partial derivatives is
  • (strictly) convex if $H_f(x)$ is positive definite for all $x\in\mathbb{R}^n$.
  • (strictly) concave if $H_f(x)$ is negative definite for all $x\in\mathbb{R}^n$.
  • If $f$ is convex in this sense, then any critical point of $f$ is its unique global minimum.
  • If $f$ is concave in this sense, then any critical point of $f$ is its unique global maximum.

Lagrange Multipliers

To optimize $f(x,y,z)$ subject to the constraint $g(x,y,z)=k$, assuming
  • the absolute minimum / maximum exists
  • $\nabla g\ne 0$ on $g(x,y,z)=k$
The absolute minimum / maximum of $f$ is achieved at some points $(x,y,z)$ in the solution set of $$\begin{aligned} \nabla f(x, y, z) &=\lambda \nabla g(x, y, z) \\ g(x, y, z) &=k \end{aligned}$$
Consider optimizing the function on curves in the space defined by the constraints.
  • Let $(x_0,y_0,z_0)$ be a critical point of $f$ restricted to the constraint surface.
  • Let $r(t)$ be any curve on the surface that passes through $(x_0,y_0,z_0)$ at $t=0$.
Then $$ \left.\frac{d}{dt} f(r(t))\right|_{t=0} = \nabla f(r(0)) \cdot r'(0) = 0 $$
  • This shows that $\nabla f$ is orthogonal to the constraint surface at critical points.
  • And we know $\nabla g$ is orthogonal to the constraint surface at all points on the surface.
Thus $\nabla f$ is parallel to $\nabla g$ at critical points, i.e. $$ \nabla f = \lambda \nabla g $$
The effect of the constraint $$g(x,y,z)=k$$ is essentially to narrow the domain of $f$ to the solution set of the above equation.

Thus, the problem is really just optimizing $f$ within the restricted domain.

For example, if the constraint is $$g(x,y,z) = (\sqrt{x^2+y^2}-4)^2 + z^2 = 1 $$ then the problem becomes optimizing $f$ on the surface of a torus.


Double and Triple Integrals

Let $D$ be a region in $\mathbb{R}^{2}$, and let $f$ be a function defined on $D$.

The double integral of $f$ over $D$ is defined as $$ \begin{aligned} \int_{D} f(x, y) d A&= \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f\left(x_{i, j}, y_{i, j}\right) \Delta A_{i, j} \\ \int_{D} f(x, y)\ dx\ dy&= \lim_{n \to \infty} \sum_{i=1}^{n}\left( \lim_{m \to \infty} \sum_{j=1}^{m} f\left(x_{i, j}, y_{i, j}\right)\ \Delta x \right) \Delta y \\ \end{aligned} $$ where
  • $\{A_{i,j} \}$ is a minimal cover of the region $D$ by $n\cdot m$ rectangles of equal width $\Delta x$ and equal height $\Delta y$,
  • $(x_{i, j}, y_{i, j})$ is the midpoint of $A_{i, j}$,
  • $\Delta A_{i, j}=\Delta x\,\Delta y$ is the area of the rectangle with midpoint $(x_{i, j}, y_{i, j})$.

Triple and higher order integrals are defined similarly.
Fubini's Theorem allows us to switch the order of integration. If $f$ is continuous on the rectangle $$ R=\{(x, y) \mid a \leqslant x \leqslant b, c \leqslant y \leqslant d\} $$ then $$ \iint_{R} f(x, y) d A=\int_{a}^{b} \int_{c}^{d} f(x, y) d y d x=\int_{c}^{d} \int_{a}^{b} f(x, y) d x d y $$
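The equality of the two iterated integrals can be illustrated with midpoint Riemann sums. The sketch below, in plain Python, integrates the example function $f(x,y)=xy^2$ over $[0,2]\times[1,3]$ in both orders; the exact value is $52/3$. The function names and grid sizes are assumptions made for this illustration.

```python
def integrate_dy_dx(f, a, b, c, d, n=200, m=200):
    # Outer sum over x, inner sum over y (midpoint rule in both variables)
    dx, dy = (b - a) / n, (d - c) / m
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        inner = sum(f(x, c + (j + 0.5) * dy) for j in range(m)) * dy
        total += inner * dx
    return total

def integrate_dx_dy(f, a, b, c, d, n=200, m=200):
    # Outer sum over y, inner sum over x
    dx, dy = (b - a) / n, (d - c) / m
    total = 0.0
    for j in range(m):
        y = c + (j + 0.5) * dy
        inner = sum(f(a + (i + 0.5) * dx, y) for i in range(n)) * dx
        total += inner * dy
    return total

def f(x, y):
    return x * y * y

I1 = integrate_dy_dx(f, 0.0, 2.0, 1.0, 3.0)
I2 = integrate_dx_dy(f, 0.0, 2.0, 1.0, 3.0)
print(I1, I2, 52 / 3)  # all three values should be close
```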

Change of Variables

Polar coordinates: $$ x=r \cos \theta \quad y=r \sin \theta $$ $$ r^{2}=x^{2}+y^{2} \quad \tan \theta=\frac{y}{x} $$ If $f$ is continuous on a polar rectangle $R$ given by $$0 \leqslant a \leqslant r \leqslant b, \alpha \leqslant \theta \leqslant \beta$$ where $0 \leqslant \beta-\alpha \leqslant 2 \pi$, then $$ \iint_{R} f(x, y) d A=\int_{\alpha}^{\beta} \int_{a}^{b} f(r \cos \theta, r \sin \theta) r d r d \theta $$
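The extra factor $r$ in the polar formula is easy to forget, so a numerical check can help. The sketch below, in plain Python, integrates the example function $f(x,y)=x^2+y^2$ over the disk of radius $2$ using a midpoint sum in $r$ and $\theta$; the exact value is $\int_0^{2\pi}\int_0^2 r^2\cdot r\,dr\,d\theta=8\pi$. The function name `polar_integral` and the grid sizes are illustrative assumptions.

```python
import math

def polar_integral(f, a, b, alpha, beta, n=400, m=400):
    # Midpoint sum of f(r cos t, r sin t) * r over the polar rectangle
    dr, dth = (b - a) / n, (beta - alpha) / m
    total = 0.0
    for i in range(n):
        r = a + (i + 0.5) * dr
        for j in range(m):
            th = alpha + (j + 0.5) * dth
            total += f(r * math.cos(th), r * math.sin(th)) * r
    return total * dr * dth

def f(x, y):
    return x * x + y * y

value = polar_integral(f, 0.0, 2.0, 0.0, 2 * math.pi)
print(value, 8 * math.pi)  # the two values should be close
```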
Cylindrical coordinates: $$ x=r \cos \theta \quad y=r \sin \theta \quad z=z $$ $$ r^{2}=x^{2}+y^{2} \quad \tan \theta=\frac{y}{x} \quad z=z $$ Suppose that $f$ is continuous on $$ E=\left\{(x, y, z) \mid(x, y) \in D, u_{1}(x, y) \leqslant z \leqslant u_{2}(x, y)\right\} $$ where $D$ is given in polar coordinates by $$ D=\left\{(r, \theta) \mid \alpha \leqslant \theta \leqslant \beta, h_{1}(\theta) \leqslant r \leqslant h_{2}(\theta)\right\} $$ Then $$\iiint_{E} f(x, y, z) d V=\int_{\alpha}^{\beta} \int_{h_{1}(\theta)}^{h_{2}(\theta)} \int_{u_{1}(r \cos \theta, r \sin \theta)}^{u_{2}(r \cos \theta, r \sin \theta)} f(r \cos \theta, r \sin \theta, z) r d z d r d \theta$$
Spherical coordinates: $$(\rho, \theta,\phi)\to \begin{cases}z&=\rho \cos(\phi)\\ \underbrace{r}_{shadow}&=\rho\sin(\phi) \end{cases}\to \begin{cases} z&=\rho \cos(\phi) \\ y&=\rho\sin(\phi)\sin(\theta)\\ x&=\rho\sin(\phi)\cos(\theta) \end{cases}$$ where $\rho^2=x^2+y^2+z^2$. If $f$ is continuous on $$ E=\{(\rho, \theta, \phi) \mid a \leqslant \rho \leqslant b, \alpha \leqslant \theta \leqslant \beta, c \leqslant \phi \leqslant d\} $$ then $$ \iiint_{E} f(x, y, z) d V =\int_{c}^{d} \int_{\alpha}^{\beta} \int_{a}^{b} f(\rho \sin \phi \cos \theta, \rho \sin \phi \sin \theta, \rho \cos \phi) \rho^{2} \sin \phi d \rho d \theta d \phi $$
All of these transformations are essentially special cases of the change of variables theorem. We will cover the general case in the next section, where we introduce the Jacobian and Jacobian determinant.

Vector Calculus

R^n --> R^m

Jacobian and Change of Variables

Let $f:\R^n\to\R^m$ be a differentiable function. The Jacobian matrix of $f$ is the matrix $$ J_f= \begin{pmatrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n}\\ \vdots & \ddots & \vdots\\ \frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n} \end{pmatrix} $$ When $m=n$, the Jacobian determinant of $f$ is the determinant of the Jacobian matrix $$ \det J_f$$ For simplicity, the notation $$\dfrac{\partial f}{\partial x}=J_f$$ will also be used.
Chain Rule :

If $f: \R^n \to \R^m$ and $g: \R^m \to \R^w$ are differentiable, then for $$h:\mathbf{v}\mapsto g(f(\mathbf{v}))$$ $$ J_h(\mathbf{v})= J_g(f(\mathbf{v})) J_f(\mathbf{v}) $$
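Let $T: D'\to D$ be a one-to-one $C^1$ map from $D'$ onto $D$ with $\det J_T\ne 0$ on $D'$, and let $F$ be continuous on $D$; then $$\int_D F(\vc{x}) d\vc{x}=\int_{D'} F(T(\vc{v})) |\det J_T(\vc{v})| d\vc{v}$$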
If then there exist an open set $U$ containing $p$ such that, on $U$

Vector Fields

A vector field is a map $$F:\R^n\to\R^n$$ If a vector field $F=\nabla f$ for some scalar function $f$, $F$ is called a conservative vector field. For $f:\R^n\to \R$, the following map defines a vector field $$F:\mathbf{v}\mapsto \nabla f(\mathbf{v})$$
Operators on Vector Field
The divergence of a vector field $F$ is the scalar function $$\operatorname{div} F = \nabla \cdot F $$ The curl of a vector field $F$ is the vector function $$\operatorname{curl} F = \nabla \times F $$
  1. If $f\in C^2(\R^3)$, then $$ \operatorname{curl}(\nabla f)=\mathbf{0} $$
  2. If $\mathbf{F}\in C^1(\R^3,\R^3)$ and curl $\mathbf{F}=\mathbf{0}$, then $\mathbf{F}$ is a conservative vector field.
  3. If $\mathbf{F}\in C^2(\R^3,\R^3)$, then $$ \operatorname{div} \operatorname{curl} \mathbf{F}=0 $$

Path Integral

Path Integral for real-valued functions

If $f$ is defined on a smooth curve $C$ given by $$x=x(t),\qquad y=y(t),\qquad t\in [a,b]$$ the path integral of $f$ along $C$ is defined by the following limit (if it exists) $$\int_{C} f(x, y) d s=\lim _{n \rightarrow \infty} \sum_{i=1}^{n} f\left(x_{i}^{*}, y_{i}^{*}\right) \Delta s_{i}$$ If $f$ is continuous on the smooth curve $C$, and $r(t),\ t\in [0,1]$ is a parametrization of $C$, then $$\begin{aligned} \int_{C} f(x, y) d s&= \int_0^1 f(r(t)) \underbrace{|r'(t)| \ dt}_{ds}\\ &=\int_{0}^{1} f(x(t), y(t)) \sqrt{\left(\frac{d x}{d t}\right)^{2}+\left(\frac{d y}{d t}\right)^{2}} d t \end{aligned}$$

Path Integral for Vector Fields

We will assume $\mathbf{F}$ is a continuous vector field defined on a smooth curve $C$ given by a vector function $\mathbf{r}(t), a \leqslant t \leqslant b$.
The path integral of $\mathbf{F}$ along $C$ is $$ \int_{C} \mathbf{F} \cdot d \mathbf{r}=\int_{a}^{b} \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}^{\prime}(t) d t $$ If $ \mathbf{F} =\nabla f$ for some function $f$, then $$\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}^{\prime}(t)=\nabla f(\mathbf{r}(t)) \cdot \mathbf{r}^{\prime}(t)$$ is the rate of change of $f$ at time $t$ when traveling along the path.
If $\mathbf{F}=P \mathbf{i}+Q \mathbf{j}$, then $$ \begin{aligned} \int_{C} \mathbf{F} \cdot d \mathbf{r}&=\int_{a}^{b} \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}^{\prime}(t) d t& \color{blue} =\int_{a}^{b} \mathbf{F}(\mathbf{r}(t)) \cdot \underbrace{\mathbf{T}}_{\frac{r'(t)}{|r'(t)|}}\ \underbrace{ds}_{|r'(t)|dt} \\ &=\int_{a}^{b} P(x,y) x'(t) dt + Q(x,y) y'(t) dt\\ &=\int_{C}P(x,y) dx + Q(x,y) dy \end{aligned} $$ If $P$ and $Q$ have continuous partial derivatives on a simply connected domain and $P_y=Q_x$ there, then $\mathbf{F}$ is conservative.
If $\mathbf{F}=\nabla f$ is conservative, and $C$ is a smooth curve with initial point $A$ and terminal point $B$, then $$\int_{C} \mathbf{F} \cdot d \mathbf{r}= f(B)-f(A)$$
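The identity $\int_C \mathbf{F}\cdot d\mathbf{r}=f(B)-f(A)$ can be tested numerically. The sketch below, in plain Python, uses the illustrative potential $f(x,y)=x^2y$, its gradient field $\mathbf{F}=(2xy,\,x^2)$, and the path $\mathbf{r}(t)=(t,t^2)$ for $t\in[0,2]$; all of these are assumptions chosen for the example, and the integral is approximated by a midpoint sum.

```python
def f(x, y):
    return x * x * y

def F(x, y):
    # Gradient of f, computed by hand
    return (2 * x * y, x * x)

def r(t):
    return (t, t * t)

def r_prime(t):
    return (1.0, 2 * t)

def line_integral(F, r, r_prime, a, b, n=1000):
    # Midpoint sum of F(r(t)) . r'(t) over [a, b]
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * dt
        Fx, Fy = F(*r(t))
        dx, dy = r_prime(t)
        total += (Fx * dx + Fy * dy) * dt
    return total

work = line_integral(F, r, r_prime, 0.0, 2.0)
potential_difference = f(*r(2.0)) - f(*r(0.0))
print(work, potential_difference)  # approximately 16 and exactly 16.0
```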

Green's Theorem

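If $C$ is a positively oriented, piecewise-smooth, simple closed curve in the plane, $D$ is the region bounded by $C$, and $\mathbf{F}=P \mathbf{i}+Q \mathbf{j}$ has continuous partial derivatives on an open region containing $D$, then $$ \int_{C} \mathbf{F} \cdot d \mathbf{r}= \iint_D (Q_x - P_y) d A $$ Green's Theorem is a special case of Stokes' Theorem.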
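Both sides of Green's Theorem can be approximated numerically and compared. The sketch below, in plain Python, takes the illustrative field $\mathbf{F}=(-y^3,\,x^3)$ on the unit disk, where $Q_x-P_y=3x^2+3y^2$ and both sides equal $3\pi/2$. The choice of field, the hand-computed integrand, and the midpoint sums are assumptions made for this example.

```python
import math

def P(x, y):
    return -y ** 3

def Q(x, y):
    return x ** 3

def curl_z(x, y):
    # Q_x - P_y, computed by hand
    return 3 * x * x + 3 * y * y

def circulation(n=2000):
    # Line integral of P dx + Q dy around the unit circle, counterclockwise
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t), math.cos(t)
        total += (P(x, y) * dx + Q(x, y) * dy) * dt
    return total

def area_integral(n=300, m=300):
    # Double integral of Q_x - P_y over the unit disk, in polar coordinates
    dr, dth = 1.0 / n, 2 * math.pi / m
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(m):
            th = (j + 0.5) * dth
            total += curl_z(r * math.cos(th), r * math.sin(th)) * r
    return total * dr * dth

lhs = circulation()
rhs = area_integral()
print(lhs, rhs, 1.5 * math.pi)  # all three values should be close
```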

Surface Integral

$$ \begin{aligned} \iint_{S} f(x, y, z) d S&=\lim _{m, n \rightarrow \infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f\left(P_{i j}^{*}\right) \Delta S_{i j}\\ &=\iint_{D} f(\mathbf{r}(u, v))\left|\mathbf{r}_{u} \times \mathbf{r}_{v}\right| d A \end{aligned} $$ We will assume $\mathbf{F}$ is a continuous vector field defined on a smooth oriented surface $S$ with unit normal vector $\vc{n}$.
The surface integral of $\mathbf{F}$ over $S$ is $$ \iint_{S} \mathbf{F} \cdot d \mathbf{S}=\iint_{S} \mathbf{F} \cdot \mathbf{n} d S=\iint_{D} \mathbf{F} \cdot\left(\mathbf{r}_{u} \times \mathbf{r}_{v}\right) d A $$ This integral is also called the flux of $\mathbf{F}$ across $S$. By Stokes' Theorem, if $C$ is the positively oriented, piecewise-smooth, simple closed boundary curve of $S$ and $\mathbf{F}$ has continuous partial derivatives, then $$ \int_{C} \mathbf{F} \cdot d \mathbf{r}= \iint_{S} \text{curl } \mathbf{F} \cdot d \mathbf{S}$$

Geometry of divergence and curl

$$ \left.\operatorname{div} \mathbf{F}\right|_{\mathrm{x}_0}=\lim _{V \rightarrow 0} \frac{1}{|V|} \iint_{S(V)} \mathbf{F} \cdot \hat{\mathbf{n}} d S $$

Complex Analysis

$$\begin{aligned} z&=r(\cos \theta + i \sin \theta) \quad &&\color{green}\text{Euler's Formula}\\ &= r e^{i \theta} \quad &&\color{green}\text{Polar Form} \end{aligned}$$ Functions $$f: \C\to \C$$ $$f(x+iy)=u+vi$$ can be viewed as functions of the form $$\tilde{f}:\R^2\to \R^2$$ $$(x,y)\mapsto (u(x,y),v(x,y))$$


$$e^z:= \sum_{n=0}^{\infty} \frac{z^n}{n!}$$


A function $f:\C \to \C$ is said to be holomorphic on an open set $U$ if $$f'(z)=\lim_{\epsilon \to 0} \frac{f(z+\epsilon)-f(z)}{\epsilon} \quad \text{exists } \forall z\in U$$ If $f= u+iv$ is holomorphic on $U$, then the Cauchy-Riemann equations hold: $$\frac{\partial u}{\partial x}=\frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y}=-\frac{\partial v}{\partial x}$$ On the other hand, if $u$ and $v$ have continuous first partial derivatives on $U$ that satisfy these equations, then $f$ is holomorphic on $U$.


Let $f:\C \to \C$ be holomorphic on a simply connected domain $U$. Then $$\oint_{C} f(z) dz= 0 $$ for any simple closed curve $C$ in $U$. This follows from Green's Theorem, where $D$ is the region enclosed by $C$: $$ \begin{aligned} \oint_{C} f(z) dz&=\oint_{C} \left(u+iv\right) dz\\ &=\oint_{C} (u+iv) (dx+idy )\\ &=\oint_{C} (u+iv) dx+ (iu-v) dy\\ &=\iint_D (iu_x-v_x)- (u_y+iv_y) dA \qquad \color{green}\text{Green's Theorem}\\ &=\iint_D 0 dA \qquad \color{red}\text{Cauchy-Riemann Eqs}\\ &=0 \end{aligned} $$
  • Let $f:U \to \C$ be holomorphic on an open subset $U\subset \C$.
  • Let $D$ be an open disk whose closure is contained in $U$, with boundary $\gamma$ oriented counterclockwise.
Then $$ f^{(n)}(a) = \frac{n!}{2\pi i} \oint_{\gamma} \frac{f(z)}{(z-a)^{n+1}} dz ,\qquad \forall a\in D $$
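The integral formula can be checked numerically with complex arithmetic, using the exponent $n+1$ in the denominator. The sketch below, in plain Python, parametrizes a circle as $z=c+Re^{it}$ and approximates the contour integral by an equally spaced sum in $t$. The example function $f(z)=e^z$ (all of whose derivatives equal $e^z$), the point $a$, and the function name `cauchy_derivative` are assumptions made for this illustration.

```python
import cmath
import math

def cauchy_derivative(f, a, n, center=0j, radius=1.0, samples=2000):
    # Approximates n!/(2 pi i) * contour integral of f(z)/(z-a)^(n+1) dz
    # over the circle z = center + radius * e^{it}, 0 <= t < 2 pi
    total = 0j
    dt = 2 * math.pi / samples
    for k in range(samples):
        t = k * dt
        z = center + radius * cmath.exp(1j * t)
        dz = 1j * radius * cmath.exp(1j * t) * dt
        total += f(z) / (z - a) ** (n + 1) * dz
    return math.factorial(n) / (2j * math.pi) * total

a = 0.3 + 0.2j   # a point inside the unit disk
for n in range(3):
    # Each value should be close to exp(a)
    print(n, cauchy_derivative(cmath.exp, a, n), cmath.exp(a))
```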

Measure Theory


Measurable Spaces

A $\sigma$-field on $X$ is a collection $$ \mc{A}\subseteq 2^X $$ that satisfies the following properties: $(X, \mc{A})$ is called a measurable space. If $\mc{A} \subset 2^X$ is a $\sigma$-field, then


A measure on a measurable space $(X,\mc{A})$ is a function $$ \mu: \mc{A} \to [0,\infty] $$ that satisfies the following properties:

Measurable Maps

A measurable map $f: (X,\mc{A}) \to (Y,\mc{B})$ is a function $$ f: X \to Y $$ such that $$ f^{-1}(B) \in \mc{A} $$ for all $B\in \mc{B}$.


Simple Functions

A function $f: X \to \bar{\R}$ is called a simple function if $$ f(X) = \{a_1, a_2, \dots, a_n\} $$ for some $a_1, a_2, \dots, a_n \in \R$. The integral of a measurable simple function $f$ with distinct values $a_1, a_2, \dots, a_n$ is defined as $$ \int_X f d\mu = \sum_{i=1}^{n} a_i \mu(f^{-1}(\{a_i\})) $$