Local Linearization

The process of local linearization approximates a function near one of its inputs with a simpler function that has the same value at that input, and the same partial derivative values.

Keywords: multivariable calculus, optimization, linear algebra

By Carmen Cincotti Β 

I just keep hitting walls when it comes to my fabric simulation! While reading this article on position based dynamics, I came across the concept of local linearization of nonlinear functions. The author mentioned that:

It is important to notice that PBD also linearizes the constraint function but individually for each constraint. The constraint equation is approximated by:

C(x+βˆ†x)β‰ˆC(x)+βˆ‡C(x)β‹…βˆ†x>=0. C(x+βˆ†x) β‰ˆ C(x) + βˆ‡C(x)Β·βˆ†x >= 0.

The problem, as always, is that this concept was unknown to me. πŸ€¦β€β™‚οΈ

We have already talked about nonlinear functions when discussing the topic of constrained functions in this article. We will continue this discussion in the context of simplifying these functions by linearizing them.

Why should I care? πŸ™‹ Solving non-linear functions can be an expensive task, depending on its complexity. Through linearization, it is possible to partition linear parts of a non-linear function to solve them with much simpler linear equations. This is to solve them more easily and at a lower cost of resources!

⚠️ Although I described linearization of a non-linear function as some sort of magic performance booster, be sure that it makes sense for your application!

Linearization with tangential planes

To start, let’s quickly define the difference between a linear function and a nonlinear function:

  • Linear - a function that makes a straight line when drawn. For example, let’s take a look at this graph together for the function y=2x+1y = 2x + 1:

Graph for the function y=2x+1

Note that these functions take the form:

f(x)=c⏞constant+v⏟vectorβ‹…xf(x) = \overbrace{c}^\textrm{constant} + \underbrace{\mathbf{v}}_\textrm{vector} \cdot \mathbf{x}

For example:

y=2x1+3x2+1y=1+[23]β‹…[x1x2]y = 2x_1 + 3x_2 + 1 \\[6pt] y = 1 + \begin{bmatrix} 2 \\ 3 \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}

You may also notice that the partial derivatives are constant. For example β€” a function like g(x,y)=ax+by+cg(x,y) = ax + by + c will have constant partial derivatives:

fx=afy=b f_x = a \\[6pt] f_y = b

πŸ’‘ Interestingly, linear functions that do not pass through the origin are technically called affine functions because linear functions must pass through the origin.

  • Nonlinear - these are functions that do not take on straight lines when drawn. Here is the non-linear function y=x2βˆ’1y = x^2 - 1 :

Graph for the function y=x^2-1

Unlike linear functions, the partial derivatives of a non-linear function are not entirely constant. Here are the partial derivatives of a function g(x,y)=x2+y2+1g(x,y) = x^2 + y^2 + 1:

fx=2xfy=2y f_x = 2x \\[6pt] f_y = 2y

Linearization of a nonlinear function can be summarized by saying that we would like to find a plane tangential to a point on a graph of a nonlinear function. This plane function serves as the linear function at that given point.

To clarify, here is the graph for the nonlinear function f(x,y)=x2+y2f(x,y) = x^2+ y^2 (blue) and a tangent plane represented by the function g(x,y)=2(xβˆ’1)+2(yβˆ’1)+2g(x,y) = 2(x-1) + 2(y-1) + 2 (purple) :

Plane tangent to a non-linear function

At the point (1,1)(1, 1) , the two functions intersect. We can say that at the point (1,1)(1,1) the function is linearized as g(x,y)g(x,y).

However, this linearization will get worse the further we get from the point (1,1)(1,1). This is why we should say that the function f(x,y)f(x, y) is linearized locally at the point (1,1)(1, 1).

Since we are now more familiar the concept of linearization, we can examine it more closely. Remember that our goal is to understand this equation:

C(x+βˆ†x)β‰ˆC(x)+βˆ‡C(x)β‹…βˆ†x>=0. C(x+βˆ†x) β‰ˆ C(x) + βˆ‡C(x)Β·βˆ†x >= 0.

The tangent plane

In the last section, I used a function g(x,y)g(x,y) to describe a plane. However, it seems a bit too magical. πŸͺ„

How ​​to find the equation of a plane? To start, here is the equation of a plane in its entirety:

ax+by+cz=d ax + by + cz = d

By using a normal vector and two points that lie on the plane, it is possible to find its equation.

Calculate the equation of a plane

where nβƒ—=(a,b,c)\vec{n} = (a,b,c) and pqβƒ—=(xqβˆ’xp,yqβˆ’yp,zqβˆ’zp)\vec{pq} = (x_q - x_p, y_q - y_p, z_q - z_p). pqβƒ—\vec{pq} lies entirely on the plane.

We know that the dot product between two perpendicular vectors is equal to 0. Which leaves us with the following form n⃗⋅pq⃗=0\vec{n} \cdot \vec{pq} = 0.

(a,b,c)β‹…(xqβˆ’xp,yqβˆ’yp,zqβˆ’zp)=0a(xqβˆ’xp)+b(yqβˆ’yp)+c(zqβˆ’zp)=0 (a, b, c) \cdot (x_q - x_p, y_q - y_p, z_q - z_p) = 0 \\[6pt] a(x_q - x_p) + b(y_q - y_p) + c(z_q - z_p) = 0

Finally, I will rewrite this equation in a form where the vector x0\mathbf{x_0} is the intersection position of the plane and a nonlinear function:

a(xβˆ’x0)+b(yβˆ’y0)+c(zβˆ’z0)=0a(x - x_0) + b(y - y_0) + c(z - z_0) = 0

Keep this equation in mind because we will see it in the next part!

πŸ’‘ This equation is often written where d=ax0+by0+cz0d = ax_0 + by_0 + cz_0 as follows:

ax+by+cz=dax + by + cz = d

Local linearization

Since we now know the equation of a plane - we can add to this knowledge. Suppose we want to approximate the earlier function f(x,y)=x2+y2f(x,y) = x^2+ y^2 by finding its local linear approximation.

To do this, we want to find a function of a plane that is tangent to the function at point (x,y,z)(x,y,z).

The inputs to the function f(x,y)f(x,y) give us the value zz. So we can say that the plane and the function f(x,y)f(x,y) intersect at the point (x,y,f(x,y))(x, y, f(x,y)).

Then, we can rewrite the equation of a plane a(xβˆ’x0)+b(yβˆ’y0)+c(zβˆ’z0)=0a(x - x_0) + b(y - y_0) + c(z - z_0) = 0 to get rid of the variable cc and to separate zz:

ac(xβˆ’x0)+bc(yβˆ’y0)+zβˆ’z0=0A=βˆ’(a/c)B=βˆ’(b/c)z=A(xβˆ’x0)+B(yβˆ’y0)+z0 \frac{a}{c}(x - x_0) + \frac{b}{c}(y - y_0) + z - z_0 = 0 \\[6pt] A = -(a / c) \\[6pt] B = -(b / c) \\[6pt] z = A(x - x_0) + B(y - y_0) + z_0

How shall we continue? πŸ€”

An important characteristic of a tangential curve

Well, the most important thing to remember when talking about tangential curves is that a curve and its tangential counterpart must have the same slope at the point of intersection!

Otherwise, they cannot be tangential. Look at these two 2D graphs:

Tangent vs non-tangent graph

  • The top graph shows two lines that barely intersect at a single point - (βˆ’1.0)(-1.0), which means that the blue curve is tangent to the red curve. βœ…

  • The bottom graph has two intersections which means that the blue line is not tangential to the red curve. β›”

You may be wondering:

How can we find the slope of f(x,y)f(x,y) ? πŸ€”

The slope is the gradient βˆ‡f(x,y)\nabla f(x,y)! Recall that the gradient consists of the partial derivatives of a multivariable function.

Using the partial derivatives of f(x,y)f(x,y) - fxf_x and fyf_y - we can assume that:

A=fx(x0,y0)B=fy(x0,y0)∴z=fx(x0,y0)(xβˆ’x0)+fy(x0,y0)(yβˆ’y0)+z0 A = f_x(x_0, y_0) \\[6pt] B = f_y(x_0, y_0) \\[6pt] \therefore z = f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) + z_0

Finally, we rewrite this equation with the knowledge that the two points intersect at point (x,y,f(x,y))(x, y, f(x,y)) and by setting Lf(x,y)=f(x,y)=zL_f(x,y) = f(x,y ) = z to be clear that the function Lf(x,y)L_f(x,y) is the local linear approximation of f(x,y)f(x,y) :

Lf(x,y)=f(x0,y0)+fx(x0,y0)(xβˆ’x0)+fy(x0,y0)(yβˆ’y0) L_f(x,y) = f(x_0, y_0) + f_x(x_0, y_0)(x-x_0) + f_y(x_0, y_0)(y-y_0)

Function rewrite

Recall the linear function that we used is as follows:

f(x)=c⏞constant+v⏟vectorβ‹…xf(x) = \overbrace{c}^\textrm{constant} + \underbrace{\mathbf{v}}_\textrm{vector} \cdot \mathbf{x}

We can rewrite our function Lf(x,y)L_f(x,y) in this form using x=(x,y)\mathbf{x} = (x, y) and x0=(x0,y0)\mathbf{x}_0 = (x_0, y_0 ):

Lf(x)=f(x0)+βˆ‡f(x0)β‹…(xβˆ’x0) L_f(\mathbf{x}) = f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0) \\[6pt]

…which is the same form of the constrained function from the introduction, but where x0=x\mathbf{x}_0 = x and x=x+Ξ”x\mathbf{x} = x+\Delta x :

C(x+Ξ”x)β‰ˆC(x)+βˆ‡C(x)β‹…Ξ”x>=0 C(x+\Delta x) β‰ˆ C(x) + βˆ‡C(x)Β·\Delta x >= 0

Resources


Comments for Local Linearization



Written by Carmen Cincotti, computer graphics enthusiast, language learner, and improv actor currently living in San Francisco, CA. Β Follow @CarmenCincotti

Contribute

Interested in contributing to Carmen's Graphics Blog? Click here for details!