The 3D coordinate system in OpenGL is generally assumed to be right-handed,
i.e. +x to the right, +y to the top, +z towards the viewer. I assume that this
doesn't really need to be coded anywhere in the OpenGL implementation, though.

Transformations in general:
===========================

object coords --(modelview matrix)--> eye coords --(projection matrix)-->
device coords [1] --(viewport params)--> 2D viewport (pixel) coords

Here, object coordinates are the original object coordinates passed in via
glVertex* calls or vertex buffers etc. Eye coordinates are the corresponding
coordinates in an orthogonal coordinate system whose origin is located in the
viewer's eye (hence the name, I assume), whose x axis points to the right, y
axis to the top, and z axis towards the viewer (so the viewing direction is
"down the z axis", i.e. in -z direction). Device coordinates should generally
fall within a cube of side length 2 around the origin, but otherwise already
correspond to viewport (i.e. 2D output window) coordinates, which means they
already include the projection and thus account for things like a perspective
or parallel projection. Finally, the viewport coordinates are the actual
output window (x,y) coordinates, with the z component corresponding to depth
buffer depth values.

Viewport Parameters
===================

The viewport parameters are x, y, width, height, nearVal, farVal. They're set
via:

  void glViewport(GLint x, GLint y, GLsizei width, GLsizei height);

and

  void glDepthRange(GLclampd nearVal, GLclampd farVal);
  // both parameters are clamped to [0,1]

x, y, width, height specify the viewport (window) extent in the viewport
coordinate system. nearVal, farVal must lie within [0,1] and specify the z
location of the near and far clipping planes. (...which apparently determine
which portion of the implementation's depth buffer is used. nearVal==0,
farVal==1 means full utilization of the depth buffer. TODO: elaborate)

These two functions set the device -> window transformation such that
normalized device coordinates (xnd, ynd, znd) are transformed into window
coordinates

  (xw, yw, zw) = ( (xnd+1)*w/2 + x,
                   (ynd+1)*h/2 + y,
                   nearVal + (farVal-nearVal)*(znd+1)/2 ).

i.e.

  device coords (xnd, ynd, znd)    ======>  window coords (xw, yw, zw)
  ([-1...1], [-1...1], [-1...1])   ======>  ([x...x+w], [y...y+h], [nearVal...farVal])

x,y will often be 0,0. So, the viewport transformation essentially transforms
the cube of side length 2 around the origin in the device coordinate system to
the viewport, which covers [x...x+w], [y...y+h], [nearVal...farVal].

[1] To be precise, device coordinates are not 3, but 4 coordinates
(xd,yd,zd,wd). These are converted to *normalized* device coordinates
(xnd,ynd,znd) := (xd/wd, yd/wd, zd/wd), and these go into the viewport
transformation. The wd will almost always be 1 (but not really always; see
glFrustum() below), in which case (xnd,ynd,znd) = (xd,yd,zd).
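(For illustration, here is a small C sketch, not part of any GL API, of the
device -> window transformation written out by hand; the Viewport struct and
the ndc_to_window function are made-up names for this example:)

  /* Hypothetical helper: apply the device -> window transformation
   * described above to a normalized device coordinate point. */
  typedef struct { int x, y, w, h; double nearVal, farVal; } Viewport;

  static void ndc_to_window(const Viewport *vp,
                            double xnd, double ynd, double znd,
                            double *xw, double *yw, double *zw)
  {
      *xw = (xnd + 1.0) * vp->w / 2.0 + vp->x;
      *yw = (ynd + 1.0) * vp->h / 2.0 + vp->y;
      *zw = vp->nearVal + (vp->farVal - vp->nearVal) * (znd + 1.0) / 2.0;
  }

  /* e.g. with a Viewport of {0, 0, 640, 480, 0.0, 1.0}, the NDC point
   * (-1,-1,-1) maps to the window coordinates (0, 0, 0.0), i.e. the
   * lower-left corner of the viewport at the near depth value. */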
Projection Matrix Specification
===============================

(beware -- the matrix stack has been deprecated in the core profile (not in
the compatibility profile). You should consider computing complete object->eye
transformation matrices in your client code and then passing them to (vertex)
shaders as needed, which might be easier in some cases anyway. Still, this
discussion may be helpful for a better understanding of transformations in
OpenGL)

As mentioned above, the projection matrix is used to transform eye coordinates
(i.e. coordinates relative to the viewer's eye) to device coordinates (i.e.
coordinates which, up to the simple linear viewport transformation, already
correspond to viewport coordinates). Thus, the projection matrix is what
implements things like parallel (orthogonal) or perspective projections.

To set up a parallel projection, you'd normally set the projection matrix
using the function

  void glOrtho(GLdouble left, GLdouble right,
               GLdouble bottom, GLdouble top,
               GLdouble nearVal, GLdouble farVal);

(you'd normally call glMatrixMode(GL_PROJECTION); glLoadIdentity(); first, as
glOrtho() actually multiplies the current (top-of-the-stack) projection matrix
with the one specified by the six parameters)

The projection matrix specified by this glOrtho call is (with abbreviations
l, r, b, t, n, f for the parameters):

        /                                             \
        |  2/(r-l)     0         0      -(r+l)/(r-l)  |
        |                                             |
        |     0     2/(t-b)      0      -(t+b)/(t-b)  |
    P = |                                             |
        |     0        0     -2/(f-n)   -(f+n)/(f-n)  |
        |                                             |
        |     0        0         0            1       |
        \                                             /

..and as explained before, the device coordinates (xd,yd,zd) will then be
calculated from the eye coordinates (xe,ye,ze) as

  (xd,yd,zd,1)^T = P * (xe,ye,ze,1)^T

This will essentially transform

  eye coords (xe, ye, ze)          ======>  device coords (xd, yd, zd)
  ([l...r], [b...t], [-n...-f])    ======>  ([-1...1], [-1...1], [-1...1])

(for example, the x coordinate transformation (1st line of the matrix
multiplication), derived intuitively:

  xd = -1 + 2 (xe-l)/(r-l) = (2(xe-l)-(r-l))/(r-l) = 2 xe/(r-l) - (r+l)/(r-l)

which is exactly the 1st line.)

So, e.g. the point (l,b,-n) in eye coordinates will be transformed into
(-1,-1,-1) in device coordinates (and ultimately to (x,y,nearVal) in the
viewport, i.e. the lower-left corner of the viewport). Essentially, glOrtho()
specifies the rectangular x/y cross section of the region of the eye
coordinate system (between ze=-n and ze=-f) that will be visible in the
viewport.
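(A minimal compatibility-profile sketch of setting up such a parallel
projection; the six parameter values are just made up for the example:)

  /* Show the eye-space region [-2..2] x [-1.5..1.5], with the near and
   * far clipping planes at ze = -0.1 and ze = -100. */
  glMatrixMode(GL_PROJECTION);
  glLoadIdentity();                           /* start from the identity...   */
  glOrtho(-2.0, 2.0, -1.5, 1.5, 0.1, 100.0);  /* ...and multiply the ortho
                                                 matrix onto it               */
  glMatrixMode(GL_MODELVIEW);                 /* switch back for drawing      */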
---

To set up a perspective projection, you'd normally set the projection matrix
using the function

  void glFrustum(GLdouble left, GLdouble right,
                 GLdouble bottom, GLdouble top,
                 GLdouble nearVal, GLdouble farVal);

(as with glOrtho above, you'd normally call glMatrixMode(GL_PROJECTION);
glLoadIdentity(); first)

The projection matrix specified by this glFrustum call is (with abbreviations
l, r, b, t, n, f for the parameters):

        /                                               \
        |  2n/(r-l)     0      (r+l)/(r-l)       0      |
        |                                               |
        |     0     2n/(t-b)   (t+b)/(t-b)       0      |
    P = |                                               |
        |     0         0     -(f+n)/(f-n)  -2fn/(f-n)  |
        |                                               |
        |     0         0          -1            0      |
        \                                               /

..and as explained before, the device coordinates (xd,yd,zd) will then be
calculated from the eye coordinates (xe,ye,ze) as

  (xd,yd,zd,wd)^T = P * (xe,ye,ze,1)^T

In this case, wd will actually turn out to be unequal to 1, so the normalized
device coordinates (xnd,ynd,znd) = (xd/wd, yd/wd, zd/wd) will be different
from the non-normalized coordinates (xd,yd,zd). Also, wd won't be constant,
but wd = -ze, so that xnd and ynd become inversely proportional to ze, which
is what actually produces the perspective projection.

In more detail:

  (4th line of above matrix equation)  wd = -ze
  (1st line of above matrix equation)  xd = 2n/(r-l) xe + (r+l)/(r-l) ze

  ==>  xnd = xd/wd = -2n/(r-l) xe/ze - (r+l)/(r-l)

This will map e.g. (xe=l, ze=-n) to xnd=-1 and (xe=l*f/n, ze=-f) also to
xnd=-1; xnd=-1 defines the left border of the viewport. Analogously, the
right border of the viewport (xnd=1) corresponds to xe=r at ze=-n (near
clipping plane) and to xe=r*f/n at ze=-f (far clipping plane).

Thus, all in all, this projection will essentially set up the viewport to be
a "window" that views, in eye coordinate space, a frustum that is defined in
the (xe,ze) plane like this:

[ASCII sketch, (xe,ze) plane, side view: the viewer sits at the origin looking
in the -ze direction. The frustum is bordered by the near plane ze=-n (cutting
it between xe=l and xe=r), the far plane ze=-f (cutting it between xe=l*f/n
and xe=r*f/n), the plane through (l,-n) and (l*f/n,-f), which is the xnd=-1
plane, i.e. the left border of the viewport, and the plane through (r,-n) and
(r*f/n,-f), which is the xnd=1 plane, i.e. the right border of the viewport.]

As described above, xnd=-1 will be mapped by the viewport transformation
(glViewport) to the left border of the viewport, and xnd=1 will be mapped to
the right border of the viewport.

Analogously, in the (ye,ze) plane, the frustum is defined like this:

[ASCII sketch, (ye,ze) plane, same construction: the near plane ze=-n cuts the
frustum between ye=b and ye=t, the far plane ze=-f between ye=b*f/n and
ye=t*f/n; the plane through (b,-n) and (b*f/n,-f) is the ynd=-1 plane (bottom
border of the viewport), the plane through (t,-n) and (t*f/n,-f) is the ynd=1
plane (top border of the viewport).]
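(To make the perspective divide concrete, here is a small standalone C sketch,
not OpenGL code, that applies the glFrustum matrix above to an eye-space point
by hand and then divides by wd; all names and parameter values are made up for
this example:)

  #include <stdio.h>

  /* Apply the glFrustum projection matrix (rows as given above) to an
   * eye-space point (xe, ye, ze) and divide by wd to get normalized
   * device coordinates. */
  static void frustum_project(double l, double r, double b, double t,
                              double n, double f,
                              double xe, double ye, double ze,
                              double *xnd, double *ynd, double *znd)
  {
      double xd = 2*n/(r-l) * xe + (r+l)/(r-l) * ze;
      double yd = 2*n/(t-b) * ye + (t+b)/(t-b) * ze;
      double zd = -(f+n)/(f-n) * ze - 2*f*n/(f-n);
      double wd = -ze;                        /* 4th row: wd = -ze */
      *xnd = xd / wd;
      *ynd = yd / wd;
      *znd = zd / wd;
  }

  int main(void)
  {
      double xnd, ynd, znd;
      /* (xe=l, ze=-n) should map to xnd = -1 ... */
      frustum_project(-1, 1, -1, 1, 1, 10,   -1, 0,  -1, &xnd, &ynd, &znd);
      printf("%g %g %g\n", xnd, ynd, znd);    /* prints: -1 0 -1 */
      /* ... and (xe=l*f/n, ze=-f) should also map to xnd = -1 */
      frustum_project(-1, 1, -1, 1, 1, 10,  -10, 0, -10, &xnd, &ynd, &znd);
      printf("%g %g %g\n", xnd, ynd, znd);    /* prints: -1 0 1 */
      return 0;
  }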
Matrix algebra, OGL matrix manipulation APIs
============================================

Generally, with M1 and M2 matrices and x a point, the following holds:

  M1*(M2*x) = (M1*M2)*x                                              [1]

..which means that if a transformation M1 is to be executed after a
transformation M2, M1 must be multiplied with M2 from the left to get the
combined transformation.

The matrix operations in OGL (which manipulate the top-of-the-stack matrix of
the currently set matrix mode) all multiply the new matrix from the right:

  glMultMatrix(M)  ==>  TopOfStack := TopOfStack * M                 [2]

...resulting in a combined transformation that would first perform the new
transformation M and then the previous TopOfStack transformation.
glTranslate(), glRotate() etc. are all based on glMultMatrix().

conventions
===========
(maybe just mine)

coordinate systems:

- object coordinate system
  - system relative to a specific 3D object, regardless of where that object
    is located in the world coordinate system (see below). E.g. for a sphere,
    the origin of this system would normally always be in the center
  - the one in which vertices given to glVertex etc. live. You would not
    normally want to issue these in world coordinates because you'd then have
    to transform every vertex yourself rather than letting OGL do that
  - TODO: sub-objects
- world coordinate system
  - as the name suggests. Non-moving objects have time-constant coordinates
    here
- eye coordinate system
  - see beginning of this document. Coordinates relative to the eye/camera
    location and viewing direction. So, if the viewer changes his viewing
    direction, the eye coordinates of the vertices of objects that are
    non-moving (in world coordinates) all change accordingly

When drawing objects (glVertex() etc.), the complete object->eye coordinate
transformation should be loaded in the GL_MODELVIEW matrix. It can be
calculated as:

  T_{object->eye} = T_{world->eye} * T_{object->world}

Because of [1], this means that we first convert object->world and then
world->eye, which is correct.

For programming this, because of [2], you would normally first push
T_{world->eye} (i.e. the "camera position transformation") onto the
GL_MODELVIEW stack, and then, for each object, push T_{object->world}, draw
the object, and pop that matrix again.
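(A minimal compatibility-profile sketch of that pattern; gluLookAt is used
here as one common way to load T_{world->eye}, the translate/rotate values
are made up, and drawObject() stands for whatever function issues the
object's vertices in object coordinates:)

  /* Load the world->eye ("camera") transformation first ... */
  glMatrixMode(GL_MODELVIEW);
  glLoadIdentity();
  gluLookAt(0.0, 2.0, 5.0,    /* eye position (world coords)    */
            0.0, 0.0, 0.0,    /* point looked at (world coords) */
            0.0, 1.0, 0.0);   /* up direction                   */

  /* ... then, for each object, multiply its object->world transformation
   * onto it from the right, draw, and undo it again. */
  glPushMatrix();
    glTranslatef(1.0f, 0.0f, -3.0f);     /* object->world: position ...  */
    glRotatef(45.0f, 0.0f, 1.0f, 0.0f);  /* ... and orientation          */
    drawObject();                        /* issues glVertex*() calls in
                                            object coordinates           */
  glPopMatrix();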