7.2 - Camera Math

As we discussed in the previous lesson, in WebGL the camera is always located at the origin looking down the -Z axis. The programmer’s job is to create a transformation that moves a scene in front of this stationary camera. You will be able to perform more creative camera work if you understand how this is done. This lesson explains the mathematics behind a camera transformation. Let’s review how a camera is defined.

A Camera Definition

A camera is defined by a position and a local coordinate system. We typically call the position of the camera the “eye” position. The camera’s local coordinate system is defined by three orthogonal axes, u, v, and n. If a camera is located at the origin looking down the -Z axis, then u would align with the x axis, v would align with the y axis, and n would align with the z axis. This is summarized as:

u --> x
v --> y
n --> z

We can specify a camera using 12 values which define one global point and three vectors.

eye = (eye_x, eye_y, eye_z)  // the location of the camera
u = <ux, uy, uz>             // points to the right of the camera
v = <vx, vy, vz>             // points up from the camera
n = <nx, ny, nz>             // points backwards; -n is the direction of view

The vectors u, v, and n are directions, not positions: each one points in a direction relative to the eye's location.
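As a minimal sketch of this definition, the 12 values can be held in a plain JavaScript object. The object layout and the helper names `dot` and `isOrthonormal` are assumptions made for illustration; they are not part of the lesson's library.

```javascript
// Hypothetical helper: dot product of two 3-component arrays.
function dot(a, b) {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Hypothetical helper: true if u, v, and n are unit length and
// mutually perpendicular (within a small tolerance).
function isOrthonormal(camera, tolerance = 1e-6) {
  const { u, v, n } = camera;
  const near = (value, target) => Math.abs(value - target) < tolerance;
  return near(dot(u, u), 1) && near(dot(v, v), 1) && near(dot(n, n), 1) &&
         near(dot(u, v), 0) && near(dot(v, n), 0) && near(dot(n, u), 0);
}

// The default WebGL camera: at the origin, looking down the -Z axis.
const defaultCamera = {
  eye: [0, 0, 0],  // location of the camera
  u: [1, 0, 0],    // right
  v: [0, 1, 0],    // up
  n: [0, 0, 1],    // backwards; the camera looks along -n
};

console.log(isOrthonormal(defaultCamera));  // true
```

A check like this is useful when a camera is built by hand, because the rotation derived later in this lesson assumes the three axes are perpendicular and unit length.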

Moving a Camera to its Default Location and Orientation

Given a camera definition, if we could develop a transformation that moves the camera to the global origin and aligns the camera’s axes with the global axes, then we could apply this transformation to every model in the scene. This would move the scene in front of the camera!

This task is easily accomplished using two separate transformations:

First, move the camera to the origin.
Second, rotate the camera to align the camera’s local coordinate system axes with the global axes.

In matrix format, we have the following, where the first operation is on the right side of the chained transforms:

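$$
\text{rotateToAlign} \cdot \text{translateToOrigin} \cdot
\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}
=
\begin{bmatrix} x' \\ y' \\ z' \\ w' \end{bmatrix}
\tag{Eq1}
$$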

The translateToOrigin transform is trivial to create because we know the eye location. The transform is:

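$$
\text{translateToOrigin} =
\begin{bmatrix}
1 & 0 & 0 & -\text{eye}_x \\
0 & 1 & 0 & -\text{eye}_y \\
0 & 0 & 1 & -\text{eye}_z \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{Eq2}
$$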
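The translation can be sketched in JavaScript as a 16-element array in column-major order, which is the layout the lookAt function later in this lesson uses (translation in elements 12 through 14). The function names `translateToOrigin` and `transformPoint` are hypothetical helpers for this example only.

```javascript
// Hypothetical helper: build Eq2 as a 16-element column-major array.
function translateToOrigin(eye) {
  return new Float32Array([
    1, 0, 0, 0,                    // column 0
    0, 1, 0, 0,                    // column 1
    0, 0, 1, 0,                    // column 2
    -eye[0], -eye[1], -eye[2], 1,  // column 3: the translation
  ]);
}

// Hypothetical helper: multiply a column-major 4x4 matrix by [x, y, z, w].
function transformPoint(m, p) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      out[row] += m[col * 4 + row] * p[col];
    }
  }
  return out;
}

const eye = [3, 4, 5];
const T = translateToOrigin(eye);

// The eye itself moves to the origin.
console.log(transformPoint(T, [3, 4, 5, 1]));  // [0, 0, 0, 1]
```

Every other point in the scene shifts by the same amount, so the scene keeps its shape while the camera position becomes the origin.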

The rotateToAlign transformation is equally simple. (We will develop this transform below.) The transformation is:

$$
\text{rotateToAlign} =
\begin{bmatrix}
u_x & u_y & u_z & 0 \\
v_x & v_y & v_z & 0 \\
n_x & n_y & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{Eq3}
$$
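A short sketch can confirm that this matrix sends each camera axis to the matching global axis. The helpers `rotateToAlign` and `transformPoint` are hypothetical names for this example, and the sample camera (turned 90 degrees about the Y axis) is an assumption chosen so the arithmetic is exact.

```javascript
// Hypothetical helper: build Eq3 as a 16-element column-major array.
// The rows of the matrix are u, v, and n.
function rotateToAlign(u, v, n) {
  return new Float32Array([
    u[0], v[0], n[0], 0,  // column 0
    u[1], v[1], n[1], 0,  // column 1
    u[2], v[2], n[2], 0,  // column 2
    0, 0, 0, 1,           // column 3
  ]);
}

// Hypothetical helper: multiply a column-major 4x4 matrix by [x, y, z, w].
function transformPoint(m, p) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      out[row] += m[col * 4 + row] * p[col];
    }
  }
  return out;
}

// A camera turned 90 degrees about the Y axis (u x v = n, so it is right-handed).
const u = [0, 0, -1];
const v = [0, 1, 0];
const n = [1, 0, 0];
const R = rotateToAlign(u, v, n);

// Directions use w = 0 so that translation never affects them.
console.log(transformPoint(R, [...u, 0]));  // [1, 0, 0, 0]
console.log(transformPoint(R, [...v, 0]));  // [0, 1, 0, 0]
console.log(transformPoint(R, [...n, 0]));  // [0, 0, 1, 0]
```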

Therefore, a transformation that will move a camera to the origin and align the axes is:

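$$
\begin{bmatrix}
u_x & u_y & u_z & 0 \\
v_x & v_y & v_z & 0 \\
n_x & n_y & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix}
1 & 0 & 0 & -\text{eye}_x \\
0 & 1 & 0 & -\text{eye}_y \\
0 & 0 & 1 & -\text{eye}_z \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}
=
\begin{bmatrix} x' \\ y' \\ z' \\ w' \end{bmatrix}
\tag{Eq4}
$$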

This is the standard camera transformation for 3D computer graphics that use a right-handed coordinate system.
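Multiplying the rotation by the translation gives a single camera matrix. The worked product below is a sketch that follows from the two matrices defined above; the dot notation, such as u · eye, is shorthand introduced here for ux*eye_x + uy*eye_y + uz*eye_z.

```latex
\begin{bmatrix}
u_x & u_y & u_z & 0 \\
v_x & v_y & v_z & 0 \\
n_x & n_y & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix}
1 & 0 & 0 & -\text{eye}_x \\
0 & 1 & 0 & -\text{eye}_y \\
0 & 0 & 1 & -\text{eye}_z \\
0 & 0 & 0 & 1
\end{bmatrix}
=
\begin{bmatrix}
u_x & u_y & u_z & -(u \cdot \text{eye}) \\
v_x & v_y & v_z & -(v \cdot \text{eye}) \\
n_x & n_y & n_z & -(n \cdot \text{eye}) \\
0 & 0 & 0 & 1
\end{bmatrix}
```

The last column holds the same values that the lookAt function later in this lesson computes as tx, ty, and tz.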

Deriving the Rotation Transform

Let’s look closer at the rotation matrix that aligns a camera’s axes with the global axes. Remember that the u axis maps to the global x axis, the v axis maps to the global y axis, and the n axis maps to the global z axis. Also remember that a general rotation about an arbitrary axis fills the upper-left 3-by-3 positions of a transformation matrix with fractional values, which we label f1 through f9. Therefore, the desired rotation matrix must satisfy the following three equations:

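$$
\begin{bmatrix}
f_1 & f_2 & f_3 & 0 \\
f_4 & f_5 & f_6 & 0 \\
f_7 & f_8 & f_9 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix} u_x \\ u_y \\ u_z \\ 0 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
$$
Eq4 - u --> x, or <ux, uy, uz> maps to <1, 0, 0>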

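$$
\begin{bmatrix}
f_1 & f_2 & f_3 & 0 \\
f_4 & f_5 & f_6 & 0 \\
f_7 & f_8 & f_9 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix} v_x \\ v_y \\ v_z \\ 0 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}
$$
Eq5 - v --> y, or <vx, vy, vz> maps to <0, 1, 0>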

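$$
\begin{bmatrix}
f_1 & f_2 & f_3 & 0 \\
f_4 & f_5 & f_6 & 0 \\
f_7 & f_8 & f_9 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix} n_x \\ n_y \\ n_z \\ 0 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}
$$
Eq6 - n --> z, or <nx, ny, nz> maps to <0, 0, 1>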

We need one transform that makes all three equations true. Matrix multiplication transforms each column of the right-hand matrix independently, so the three separate equations can be combined into a single equation:

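$$
\begin{bmatrix}
f_1 & f_2 & f_3 & 0 \\
f_4 & f_5 & f_6 & 0 \\
f_7 & f_8 & f_9 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix}
u_x & v_x & n_x & 0 \\
u_y & v_y & n_y & 0 \\
u_z & v_z & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{Eq7}
$$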

Notice that the vectors in the three separate equations became the columns of the combined matrices. To solve for the rotation matrix, we multiply both sides of the equation, on the right, by the inverse of the known matrix.

Let
$$
M^{-1} =
\begin{bmatrix}
u_x & v_x & n_x & 0 \\
u_y & v_y & n_y & 0 \\
u_z & v_z & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}^{-1}
\tag{Eq8}
$$

Then,

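$$
\begin{bmatrix}
f_1 & f_2 & f_3 & 0 \\
f_4 & f_5 & f_6 & 0 \\
f_7 & f_8 & f_9 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix}
u_x & v_x & n_x & 0 \\
u_y & v_y & n_y & 0 \\
u_z & v_z & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot M^{-1}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot M^{-1}
\tag{Eq9}
$$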

This reduces to:

$$
\begin{bmatrix}
f_1 & f_2 & f_3 & 0 \\
f_4 & f_5 & f_6 & 0 \\
f_7 & f_8 & f_9 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
= M^{-1}
\tag{Eq10}
$$

The rotation matrix we need to align a camera’s local coordinate system to the global coordinate system is:

$$
\begin{bmatrix}
u_x & v_x & n_x & 0 \\
u_y & v_y & n_y & 0 \\
u_z & v_z & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}^{-1}
\tag{Eq11}
$$

Mathematicians have proved that if the columns of a matrix are unit-length vectors that are orthogonal to each other, the inverse of such a matrix is just its transpose (1). The columns of our matrix meet both conditions: they define a valid right-handed coordinate system where each axis is at a right angle to the other two axes, and each axis vector is normalized to unit length. Therefore, the inverse is trivial to obtain: you interchange the rows and columns.

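$$
\begin{bmatrix}
u_x & v_x & n_x & 0 \\
u_y & v_y & n_y & 0 \\
u_z & v_z & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}^{-1}
=
\begin{bmatrix}
u_x & u_y & u_z & 0 \\
v_x & v_y & v_z & 0 \\
n_x & n_y & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{Eq12}
$$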
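The transpose shortcut can be checked numerically. The sketch below works on the upper-left 3-by-3 part only, stored as an array of rows; the helpers `transpose`, `multiply`, and `isIdentity` and the sample axes are assumptions made for this example. It also shows that the shortcut fails when an axis is not unit length.

```javascript
// Hypothetical helpers for 3x3 matrices stored as arrays of rows.
function transpose(m) {
  return m[0].map((_, col) => m.map((row) => row[col]));
}

function multiply(a, b) {
  return a.map((row) =>
    b[0].map((_, col) =>
      row.reduce((sum, value, k) => sum + value * b[k][col], 0)
    )
  );
}

function isIdentity(m, tolerance = 1e-9) {
  return m.every((row, i) =>
    row.every((value, j) => Math.abs(value - (i === j ? 1 : 0)) < tolerance)
  );
}

// Unit-length, mutually perpendicular axes (a camera turned 45 degrees about Y).
const s = Math.SQRT1_2;
const u = [s, 0, -s];
const v = [0, 1, 0];
const n = [s, 0, s];

// M has u, v, and n as its columns, so its transpose has them as rows.
const M = transpose([u, v, n]);
console.log(isIdentity(multiply(transpose(M), M)));  // true

// Same directions, but u is twice as long: the transpose is no longer the inverse.
const scaled = transpose([[2 * s, 0, -2 * s], v, n]);
console.log(isIdentity(multiply(transpose(scaled), scaled)));  // false
```

Each entry of the product is a dot product between two axes, which is why perpendicular axes give the zeros and unit-length axes give the ones on the diagonal.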

`lookAt` Implementation

Below is a JavaScript implementation of the lookAt function. It implements the math we just discussed: it builds n, u, and v from the eye position, the center of view, and an up direction, then stores the combined rotation and translation in M. Note that the variables V, center, eye, up, u, v, and n are class objects that were created once when the Learn_webgl_matrix object was created. These objects are reused on each call to lookAt.

self.lookAt = function (M, eye_x, eye_y, eye_z, center_x, center_y, center_z, up_dx, up_dy, up_dz) {

  // Local coordinate system for the camera:
  //   u maps to the x-axis
  //   v maps to the y-axis
  //   n maps to the z-axis

  V.set(center, center_x, center_y, center_z);
  V.set(eye, eye_x, eye_y, eye_z);
  V.set(up, up_dx, up_dy, up_dz);

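  // n points from the center of view back toward the eye.
  V.subtract(n, eye, center);  // n = eye - center
  V.normalize(n);

  // u points to the camera's right, perpendicular to both up and n.
  V.crossProduct(u, up, n);
  V.normalize(u);

  // v is the camera's true up direction, perpendicular to n and u.
  V.crossProduct(v, n, u);
  V.normalize(v);

  // Translation terms: multiplying the rotation by the translation
  // puts -(axis . eye) in the last column.
  var tx = - V.dotProduct(u,eye);
  var ty = - V.dotProduct(v,eye);
  var tz = - V.dotProduct(n,eye);

  // Set the camera matrix (stored in column-major order)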
  M[0] = u[0];  M[4] = u[1];  M[8]  = u[2];  M[12] = tx;
  M[1] = v[0];  M[5] = v[1];  M[9]  = v[2];  M[13] = ty;
  M[2] = n[0];  M[6] = n[1];  M[10] = n[2];  M[14] = tz;
  M[3] = 0;     M[7] = 0;     M[11] = 0;     M[15] = 1;
};
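The function above depends on the library's V helper and its scratch objects, so it cannot run on its own. The following is a self-contained sketch of the same math using plain arrays; the helper names (`subtract`, `dot`, `cross`, `normalize`, `transformPoint`), the array-based signature, and the sample eye position are assumptions for illustration, not the library's API.

```javascript
// Hypothetical stand-ins for the lesson's vector helpers.
const subtract = (a, b) => [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
const dot = (a, b) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const cross = (a, b) => [
  a[1] * b[2] - a[2] * b[1],
  a[2] * b[0] - a[0] * b[2],
  a[0] * b[1] - a[1] * b[0],
];
const normalize = (a) => {
  const length = Math.sqrt(dot(a, a));
  return [a[0] / length, a[1] / length, a[2] / length];
};

// Same math as the lesson's lookAt, returning a new column-major matrix.
// Assumes eye differs from center and up is not parallel to the view direction.
function lookAt(eye, center, up) {
  const n = normalize(subtract(eye, center));  // backwards
  const u = normalize(cross(up, n));           // right
  const v = cross(n, u);  // up; unit length because n and u are perpendicular unit vectors
  return new Float32Array([
    u[0], v[0], n[0], 0,
    u[1], v[1], n[1], 0,
    u[2], v[2], n[2], 0,
    -dot(u, eye), -dot(v, eye), -dot(n, eye), 1,
  ]);
}

// Multiply a column-major 4x4 matrix by [x, y, z, w].
function transformPoint(m, p) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      out[row] += m[col * 4 + row] * p[col];
    }
  }
  return out;
}

const eye = [3, 4, 5];
const center = [0, 0, 0];
const camera = lookAt(eye, center, [0, 1, 0]);

// Up to rounding, the eye lands on the origin and the center of view
// lands on the -Z axis, |eye - center| units in front of the camera.
console.log(transformPoint(camera, [...eye, 1]));
console.log(transformPoint(camera, [...center, 1]));
```

Returning a new Float32Array on every call is simpler to read than the lesson's version, which reuses preallocated objects to avoid creating garbage on each frame.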

Glossary

orthogonal: Two vectors are orthogonal if the angle between them is 90 degrees.
maps to: A mapping converts an element into another element.
transpose: An operation on a matrix that swaps rows with columns. Each M[i][j] element moves to the M[j][i] position.
orthogonal matrix: A matrix whose columns (and rows) are unit-length vectors that are orthogonal to each other. The inverse of an orthogonal matrix is just its transpose.
