


Adding a fourth term, (w), makes it easier to work in 3D Euclidean space.
Why? Here's a long list of reasons, but a couple of key points:
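for a point (p): (w = 1), i.e. (p = (x, y, z, 1)), so a translation moves it
for a vector (v): (w = 0), i.e. (v = (x, y, z, 0)), so a translation leaves it unchanged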
Inverse of a translation: (T^{-1}(t) = T(-t)); translating by (-t) undoes translating by (t).
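As a minimal sketch of these points, the snippet below applies a 4x4 homogeneous translation to a point and to a direction vector, then undoes it with (T(-t)). It assumes the usual convention that points carry (w = 1) and direction vectors carry (w = 0). The helper names `translation` and `apply` are made up for illustration and come from no particular library.

```python
def translation(tx, ty, tz):
    """Build a 4x4 homogeneous translation matrix T(t) as nested row lists."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply(m, x):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[r][c] * x[c] for c in range(4)) for r in range(4)]


T = translation(2.0, 3.0, 4.0)
T_inv = translation(-2.0, -3.0, -4.0)  # T^{-1}(t) = T(-t)

point = [1.0, 1.0, 1.0, 1.0]   # w = 1: a position (assumed convention)
vector = [1.0, 1.0, 1.0, 0.0]  # w = 0: a direction (assumed convention)

moved_point = apply(T, point)         # the translation column is picked up by w = 1
moved_vector = apply(T, vector)       # the translation column is multiplied by w = 0
restored = apply(T_inv, moved_point)  # back to the original point
```

The fourth column of the matrix only contributes when (w) is non-zero, which is why the same matrix moves positions but leaves directions alone.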
When applying individual rotations (yaw, pitch, roll), the order of the rotation operations matters: the same three angles applied in a different order generally give a different orientation. To avoid this ambiguity, use an axis-angle representation.
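One common way to apply an axis-angle rotation is Rodrigues' rotation formula. The sketch below is an illustration rather than the method of any specific framework. The function name `rotate_axis_angle` is hypothetical, and angles are assumed to be in radians, with positive angles counter-clockwise when looking down the axis toward the origin (right-handed convention).

```python
import math


def rotate_axis_angle(v, axis, angle):
    """Rotate 3-vector v about `axis` by `angle` radians (Rodrigues' formula)."""
    length = math.sqrt(sum(a * a for a in axis))
    k = [a / length for a in axis]  # unit rotation axis
    c, s = math.cos(angle), math.sin(angle)
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    # v_rot = v cos(a) + (k x v) sin(a) + k (k . v) (1 - cos(a))
    return [
        v[i] * c + k_cross_v[i] * s + k[i] * k_dot_v * (1.0 - c)
        for i in range(3)
    ]


# Rotating the x axis a quarter turn about z should land on the y axis
# (up to floating-point rounding).
rotated = rotate_axis_angle([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], math.pi / 2)
```

Because the rotation is described by a single axis and a single angle, there is no sequence of elementary rotations whose order could be mixed up.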
If you only need to rotate around a single axis, then applying the rotation transform can be straightforward.
Rotation matrices for each axis:
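For reference, the standard forms are written out below. They assume a right-handed coordinate system, column vectors, and an angle (\theta) that is positive counter-clockwise when looking down the axis toward the origin. Other conventions flip the signs of the sine terms.

```latex
R_x(\theta) =
\begin{bmatrix}
1 & 0 & 0 \\
0 & \cos\theta & -\sin\theta \\
0 & \sin\theta & \cos\theta
\end{bmatrix}
\quad
R_y(\theta) =
\begin{bmatrix}
\cos\theta & 0 & \sin\theta \\
0 & 1 & 0 \\
-\sin\theta & 0 & \cos\theta
\end{bmatrix}
\quad
R_z(\theta) =
\begin{bmatrix}
\cos\theta & -\sin\theta & 0 \\
\sin\theta & \cos\theta & 0 \\
0 & 0 & 1
\end{bmatrix}
```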
The inverse of a rotation matrix is its transpose: (R^{-1} = R^T).
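A quick numerical check of two of the claims above, as a sketch: multiplying a rotation matrix by its transpose gives the identity, and swapping the order of two rotations changes the result. The helpers `rot_x`, `rot_z`, `matmul`, `transpose` and `close` are illustrative names, and the matrices assume a right-handed, column-vector convention.

```python
import math


def rot_x(a):
    """3x3 rotation about the x axis by `a` radians."""
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_z(a):
    """3x3 rotation about the z axis by `a` radians."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def transpose(m):
    """Transpose of a 3x3 matrix."""
    return [[m[c][r] for c in range(3)] for r in range(3)]


def close(a, b, tol=1e-12):
    """Element-wise comparison of two 3x3 matrices within `tol`."""
    return all(abs(a[r][c] - b[r][c]) <= tol
               for r in range(3) for c in range(3))


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

R = rot_x(0.7)
r_times_rt = matmul(R, transpose(R))  # should be (numerically) the identity

quarter = math.pi / 2
x_then_z = matmul(rot_z(quarter), rot_x(quarter))  # rotate about x first, then z
z_then_x = matmul(rot_x(quarter), rot_z(quarter))  # rotate about z first, then x

transpose_is_inverse = close(r_times_rt, identity)
order_matters = not close(x_then_z, z_then_x)
```

Using the transpose avoids a general matrix inversion, which is both cheaper and numerically better behaved.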
Local space, World space, View space, Clip space, Screen space
The relationships between the various spaces - Source: LearnOpenGL.com
The same scene as viewed in the different coordinate systems
perspective / orthographic projection. near plane / far plane. clipping.
You can use the viewport transform to get to Screen space (actual pixels).
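A minimal sketch of that viewport mapping, assuming OpenGL-style normalized device coordinates in the range [-1, 1] and a viewport whose origin is its bottom-left corner. The function name `ndc_to_screen` and its parameters are hypothetical. Many windowing systems put the origin at the top-left instead, in which case the y result needs to be flipped.

```python
def ndc_to_screen(x_ndc, y_ndc, width, height, x0=0.0, y0=0.0):
    """Map NDC in [-1, 1] to pixel coordinates inside a viewport.

    (x0, y0) is the viewport's lower-left corner in pixels and
    (width, height) its size; the origin is assumed to be bottom-left.
    """
    x_px = (x_ndc + 1.0) * 0.5 * width + x0
    y_px = (y_ndc + 1.0) * 0.5 * height + y0
    return x_px, y_px


center = ndc_to_screen(0.0, 0.0, 800, 600)        # middle of the viewport
bottom_left = ndc_to_screen(-1.0, -1.0, 800, 600)  # viewport origin
top_right = ndc_to_screen(1.0, 1.0, 800, 600)      # opposite corner
```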
To transform a vertex coordinate to clip coordinates:
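In the convention used by LearnOpenGL (column vectors, so the product is read from right to left), this chain is commonly written as follows. The subscript names are labels for the model, view and projection matrices, not identifiers from any API.

```latex
V_{clip} = M_{projection} \cdot M_{view} \cdot M_{model} \cdot V_{local}
```

Reading right to left: the model matrix takes the vertex from Local space to World space, the view matrix from World space to View space, and the projection matrix from View space to Clip space.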
Coordinate systems we commonly reference for development.
Coordinate Systems for various frameworks
Source: Kyle Simek's excellent computer vision blog
Column 2 of the view matrix is the camera's -Z direction.
Where (u), (v), and (n) are the normalized basis vectors of the camera's reference frame: (u) is the up vector, (n) is the direction the camera is looking in, and (v) is perpendicular to both (n) and (u).
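As an illustration of how such a basis can be obtained, the sketch below derives (u), (v) and (n) from a camera position, a target point and an approximate up direction. The function `camera_basis` and its parameter names are hypothetical, and a right-handed system is assumed. It only builds the three vectors and does not commit to a particular view-matrix layout.

```python
import math


def normalize(a):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(x * x for x in a))
    return [x / length for x in a]


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def camera_basis(eye, target, up_hint):
    """Return (u, v, n): up, side, and look direction, all unit length.

    `up_hint` only needs to be roughly "up"; it must not be parallel
    to the look direction, or the cross product below has zero length.
    """
    n = normalize([t - e for t, e in zip(target, eye)])  # look direction
    v = normalize(cross(n, up_hint))                     # perpendicular to n and up
    u = cross(v, n)                                      # true up, already unit length
    return u, v, n


# Camera on the +z axis looking at the origin, with y roughly up.
u, v, n = camera_basis([0.0, 0.0, 5.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Recomputing (u) from (v) and (n) guarantees the three vectors are mutually perpendicular even when the supplied up hint is not exactly orthogonal to the look direction.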