Sunday, January 31, 2010


Overview



OpenGL PBO
The OpenGL ARB_pixel_buffer_object extension is very close to ARB_vertex_buffer_object. It simply expands the ARB_vertex_buffer_object extension so that buffer objects can store not only vertex data but also pixel data. A buffer object storing pixel data is called a Pixel Buffer Object (PBO). The ARB_pixel_buffer_object extension borrows the whole VBO framework and APIs, and adds 2 additional "target" tokens. These tokens assist the PBO memory manager (OpenGL driver) in determining the best location for the buffer object: system memory, AGP (shared memory) or video memory. Also, the target tokens clearly specify that the bound PBO will be used in one of 2 different operations: GL_PIXEL_PACK_BUFFER_ARB to transfer pixel data to a PBO, or GL_PIXEL_UNPACK_BUFFER_ARB to transfer pixel data from a PBO.
For example, glReadPixels() and glGetTexImage() are "pack" pixel operations, and glDrawPixels(), glTexImage2D() and glTexSubImage2D() are "unpack" operations. When a PBO is bound with the GL_PIXEL_PACK_BUFFER_ARB token, glReadPixels() reads pixel data from the OpenGL framebuffer and writes (packs) the data into the PBO. When a PBO is bound with the GL_PIXEL_UNPACK_BUFFER_ARB token, glDrawPixels() reads (unpacks) pixel data from the PBO and copies it to the OpenGL framebuffer.
The main advantage of PBO is fast pixel data transfer to and from a graphics card through DMA (Direct Memory Access) without involving CPU cycles. The other advantage of PBO is asynchronous DMA transfer. Let's compare a conventional texture transfer method with one using a Pixel Buffer Object. The left side of the following diagram is the conventional way to load texture data from an image source (image file or video stream). The source is first loaded into system memory, and then copied from system memory to an OpenGL texture object with glTexImage2D(). These 2 transfer processes (load and copy) are both performed by the CPU.

 
Texture loading without PBO


Texture loading with PBO
On the contrary, in the right-side diagram, the image source can be directly loaded into a PBO, which is controlled by OpenGL. The CPU is still involved in loading the source into the PBO, but not in transferring the pixel data from the PBO to a texture object. Instead, the GPU (OpenGL driver) manages copying data from the PBO to the texture object. This means OpenGL performs a DMA transfer operation without wasting CPU cycles. Further, OpenGL can schedule an asynchronous DMA transfer for later execution. Therefore, glTexImage2D() returns immediately, and the CPU can perform something else without waiting for the pixel transfer to be done.
There are 2 major PBO approaches to improve the performance of the pixel data transfer: streaming texture update and asynchronous read-back from the framebuffer.
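Here is a minimal sketch of the streaming texture update approach. It assumes a texture object (textureId) and image dimensions (imageWidth, imageHeight) already exist, and uses a hypothetical loadNextFrame() helper to fill the mapped buffer with the next image:
GLuint pboId;
glGenBuffersARB(1, &pboId);

// create a PBO large enough for one RGBA frame; NULL data only reserves space
glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, pboId);
glBufferDataARB(GL_PIXEL_UNPACK_BUFFER_ARB, imageWidth*imageHeight*4, NULL,
                GL_STREAM_DRAW_ARB);

// map the PBO and let the CPU write the next frame into it
GLubyte* ptr = (GLubyte*)glMapBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, GL_WRITE_ONLY_ARB);
if(ptr)
{
    loadNextFrame(ptr);                              // hypothetical image loader
    glUnmapBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB);
}

// copy from the PBO to the texture; the last argument is an offset into the
// bound PBO, not a client pointer, so the call may return before the DMA ends
glBindTexture(GL_TEXTURE_2D, textureId);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, imageWidth, imageHeight,
                GL_RGBA, GL_UNSIGNED_BYTE, 0);

glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, 0);      // unbind the PBO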

Friday, January 29, 2010


In the OpenGL rendering pipeline, the geometry data and textures are transformed, pass several tests, and are then finally rendered onto a screen as 2D pixels. The final rendering destination of the OpenGL pipeline is called the framebuffer. A framebuffer is a collection of 2D arrays or storages utilized by OpenGL: colour buffers, depth buffer, stencil buffer and accumulation buffer. By default, OpenGL uses as its rendering destination a framebuffer that is created and managed entirely by the window system. This default framebuffer is called the window-system-provided framebuffer.
The OpenGL extension GL_EXT_framebuffer_object provides an interface to create additional non-displayable framebuffer objects (FBO). Such a framebuffer is called an application-created framebuffer in order to distinguish it from the default window-system-provided framebuffer. By using a framebuffer object (FBO), an OpenGL application can redirect the rendering output to an application-created framebuffer object (FBO) rather than the traditional window-system-provided framebuffer. And it is fully controlled by OpenGL.
Similar to the window-system-provided framebuffer, a FBO contains a collection of rendering destinations: color, depth and stencil buffers. (Note that the accumulation buffer is not defined in FBO.) These logical buffers in a FBO are called framebuffer-attachable images, which are 2D arrays of pixels that can be attached to a framebuffer object.
There are two types of framebuffer-attachable images; texture images and renderbuffer images. If an image of a texture object is attached to a framebuffer, OpenGL performs "render to texture". And if an image of a renderbuffer object is attached to a framebuffer, then OpenGL performs "offscreen rendering".
By the way, the renderbuffer object is a new type of storage object defined in the GL_EXT_framebuffer_object extension. It is used as a rendering destination for a single 2D image during the rendering process.
OpenGL Frame Buffer Object (FBO)
Connectivity among FBO, texture and Renderbuffer
The following diagram shows the connectivity among the framebuffer object, texture object and renderbuffer object. Multiple texture objects or renderbuffer objects can be attached to a framebuffer object through the attachment points.
There are multiple color attachment points (GL_COLOR_ATTACHMENT0_EXT,..., GL_COLOR_ATTACHMENTn_EXT), one depth attachment point (GL_DEPTH_ATTACHMENT_EXT), and one stencil attachment point (GL_STENCIL_ATTACHMENT_EXT) in a framebuffer object. The number of color attachment points is implementation dependent, but each FBO must have at least one color attachment point. You can query the maximum number of color attachment points supported by a graphics card with GL_MAX_COLOR_ATTACHMENTS_EXT. The reason a FBO has multiple color attachment points is to allow rendering the color buffer to multiple destinations at the same time. These "multiple render targets" (MRT) can be accomplished with the GL_ARB_draw_buffers extension. Notice that the framebuffer object itself does not have any image storage (array) in it; it has only multiple attachment points.
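As a quick sketch, selecting two color attachments of the currently bound FBO as simultaneous render targets with GL_ARB_draw_buffers might look like this (it assumes images are already attached to both attachment points):
// render to GL_COLOR_ATTACHMENT0_EXT and GL_COLOR_ATTACHMENT1_EXT at once
GLenum buffers[] = { GL_COLOR_ATTACHMENT0_EXT, GL_COLOR_ATTACHMENT1_EXT };
glDrawBuffersARB(2, buffers);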
Framebuffer object (FBO) provides an efficient switching mechanism: detach the previous framebuffer-attachable image from a FBO, and attach a new framebuffer-attachable image to the FBO. Switching framebuffer-attachable images is much faster than switching between FBOs. FBO provides glFramebufferTexture2DEXT() to switch 2D texture objects, and glFramebufferRenderbufferEXT() to switch renderbuffer objects.

Creating Frame Buffer Object (FBO)

Creating framebuffer objects is similar to generating vertex buffer objects (VBO).

glGenFramebuffersEXT()

void glGenFramebuffersEXT(GLsizei n, GLuint* ids)
void glDeleteFramebuffersEXT(GLsizei n, const GLuint* ids)
glGenFramebuffersEXT() requires 2 parameters; the first one is the number of framebuffers to create, and the second parameter is the pointer to a GLuint variable or an array to store a single ID or multiple IDs. It returns the IDs of unused framebuffer objects. ID 0 means the default framebuffer, which is the window-system-provided framebuffer.
And, a FBO may be deleted by calling glDeleteFramebuffersEXT() when it is not used anymore.

glBindFramebufferEXT()

Once a FBO is created, it has to be bound before using it.
void glBindFramebufferEXT(GLenum target, GLuint id)
The first parameter, target, should be GL_FRAMEBUFFER_EXT, and the second parameter is the ID of a framebuffer object. Once a FBO is bound, all OpenGL rendering operations affect the currently bound framebuffer object. The object ID 0 is reserved for the default window-system-provided framebuffer. Therefore, in order to unbind the current framebuffer (FBO), pass ID 0 to glBindFramebufferEXT().
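A minimal sketch of the creation and binding steps so far:
GLuint fboId;
glGenFramebuffersEXT(1, &fboId);                  // create a new FBO
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fboId);  // bind it as the current FBO
...
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, 0);      // back to the window-system framebuffer
glDeleteFramebuffersEXT(1, &fboId);               // delete the FBO when no longer needed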

Renderbuffer Object

In addition, the renderbuffer object is newly introduced for offscreen rendering. It allows rendering a scene directly to a renderbuffer object, instead of rendering to a texture object. A renderbuffer is simply a data storage object containing a single image in a renderable internal format. It is used to store OpenGL logical buffers that do not have a corresponding texture format, such as the stencil or depth buffer.

glGenRenderbuffersEXT()

void glGenRenderbuffersEXT(GLsizei n, GLuint* ids)
void glDeleteRenderbuffersEXT(GLsizei n, const GLuint* ids)
Once a renderbuffer is created, glGenRenderbuffersEXT() returns a non-zero positive integer as its ID. ID 0 is reserved for OpenGL.

glBindRenderbufferEXT()

void glBindRenderbufferEXT(GLenum target, GLuint id)
Same as other OpenGL objects, you have to bind the current renderbuffer object before referencing it. The target parameter should be GL_RENDERBUFFER_EXT for renderbuffer object.

glRenderbufferStorageEXT()

void glRenderbufferStorageEXT(GLenum target, GLenum internalFormat,
                              GLsizei width, GLsizei height)
When a renderbuffer object is created, it does not have any data storage, so we have to allocate a memory space for it. This can be done by using glRenderbufferStorageEXT(). The first parameter must be GL_RENDERBUFFER_EXT. The second parameter would be color-renderable (GL_RGB, GL_RGBA, etc.), depth-renderable (GL_DEPTH_COMPONENT), or stencil-renderable formats (GL_STENCIL_INDEX). The width and height are the dimension of the renderbuffer image in pixels.
The width and height should be less than GL_MAX_RENDERBUFFER_SIZE_EXT, otherwise, it generates GL_INVALID_VALUE error.
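For example, here is a minimal sketch of allocating a depth renderbuffer (width and height are assumed to match the color image that will be attached to the same FBO):
GLuint rboId;
glGenRenderbuffersEXT(1, &rboId);
glBindRenderbufferEXT(GL_RENDERBUFFER_EXT, rboId);

// allocate storage for a depth-renderable image
glRenderbufferStorageEXT(GL_RENDERBUFFER_EXT, GL_DEPTH_COMPONENT, width, height);

glBindRenderbufferEXT(GL_RENDERBUFFER_EXT, 0);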

glGetRenderbufferParameterivEXT()

void glGetRenderbufferParameterivEXT(GLenum target, GLenum param, GLint* value);
You can also get various parameters of the currently bound renderbuffer object. target should be GL_RENDERBUFFER_EXT, and the second parameter is the name of the parameter. The last is the pointer to an integer variable to store the returned value. The available names of the renderbuffer parameters are;
GL_RENDERBUFFER_WIDTH_EXT
GL_RENDERBUFFER_HEIGHT_EXT
GL_RENDERBUFFER_INTERNAL_FORMAT_EXT
GL_RENDERBUFFER_RED_SIZE_EXT
GL_RENDERBUFFER_GREEN_SIZE_EXT
GL_RENDERBUFFER_BLUE_SIZE_EXT
GL_RENDERBUFFER_ALPHA_SIZE_EXT
GL_RENDERBUFFER_DEPTH_SIZE_EXT
GL_RENDERBUFFER_STENCIL_SIZE_EXT

Attaching images to FBO

A FBO itself does not have any image storage (buffer) in it. Instead, we must attach framebuffer-attachable images (texture or renderbuffer objects) to the FBO. This mechanism allows a FBO to quickly switch (detach and attach) the framebuffer-attachable images. It is much faster to switch framebuffer-attachable images than to switch between FBOs. And, it saves unnecessary data copies and memory consumption. For example, a texture can be attached to multiple FBOs, and its image storage can be shared by multiple FBOs.

Attaching a 2D texture image to FBO

void glFramebufferTexture2DEXT(GLenum target,
                               GLenum attachmentPoint,
                               GLenum textureTarget,
                               GLuint textureId,
                               GLint  level)
glFramebufferTexture2DEXT() is used to attach a 2D texture image to a FBO. The first parameter must be GL_FRAMEBUFFER_EXT, and the second parameter is the attachment point to which the texture image is connected. A FBO has multiple color attachment points (GL_COLOR_ATTACHMENT0_EXT, ..., GL_COLOR_ATTACHMENTn_EXT), GL_DEPTH_ATTACHMENT_EXT, and GL_STENCIL_ATTACHMENT_EXT. The third parameter, "textureTarget", is GL_TEXTURE_2D in most cases. The fourth parameter is the identifier of the texture object. The last parameter is the mipmap level of the texture to be attached.
If the textureId parameter is set to 0, then, the texture image will be detached from the FBO. If a texture object is deleted while it is still attached to a FBO, then, the texture image will be automatically detached from the currently bound FBO. However, if it is attached to multiple FBOs and deleted, then it will be detached from only the bound FBO, but will not be detached from any other un-bound FBOs.

Attaching a Renderbuffer image to FBO

void glFramebufferRenderbufferEXT(GLenum target,
                                  GLenum attachmentPoint,
                                  GLenum renderbufferTarget,
                                  GLuint renderbufferId)
A renderbuffer image can be attached by calling glFramebufferRenderbufferEXT(). The first and second parameters are same as glFramebufferTexture2DEXT(). The third parameter must be GL_RENDERBUFFER_EXT, and the last parameter is the ID of the renderbuffer object.
If renderbufferId parameter is set to 0, the renderbuffer image will be detached from the attachment point in the FBO. If a renderbuffer object is deleted while it is still attached in a FBO, then it will be automatically detached from the bound FBO. However, it will not be detached from any other non-bound FBOs.

Checking FBO Status
Once attachable images (textures and renderbuffers) are attached to a FBO, and before performing FBO operations, you must validate whether the FBO status is complete or incomplete by using glCheckFramebufferStatusEXT(). If the FBO is not complete, then any drawing and reading commands (glBegin(), glCopyTexImage2D(), etc.) will fail.
GLenum glCheckFramebufferStatusEXT(GLenum target)
glCheckFramebufferStatusEXT() validates all the attached images and framebuffer parameters of the currently bound FBO. This function cannot be called within a glBegin()/glEnd() pair. The target parameter should be GL_FRAMEBUFFER_EXT. It returns a non-zero value after checking the FBO. If all requirements and rules are satisfied, then it returns GL_FRAMEBUFFER_COMPLETE_EXT. Otherwise, it returns a relevant error value, which tells which rule is violated.
The rules of FBO completeness are:
  • The width and height of a framebuffer-attachable image must not be zero.
  • If an image is attached to a color attachment point, then the image must have a color-renderable internal format (GL_RGB, GL_RGBA, etc.).
  • If an image is attached to GL_DEPTH_ATTACHMENT_EXT, then the image must have a depth-renderable internal format (GL_DEPTH_COMPONENT, GL_DEPTH_COMPONENT24_EXT, etc.).
  • If an image is attached to GL_STENCIL_ATTACHMENT_EXT, then the image must have a stencil-renderable internal format (GL_STENCIL_INDEX, GL_STENCIL_INDEX8_EXT, etc.).
  • A FBO must have at least one image attached.
  • All images attached to a FBO must have the same width and height.
  • All images attached to the color attachment points must have the same internal format.
Note that even though all of the above conditions are satisfied, your OpenGL driver may not support some combinations of internal formats and parameters. If a particular combination is not supported by the OpenGL driver, then glCheckFramebufferStatusEXT() returns GL_FRAMEBUFFER_UNSUPPORTED_EXT.
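Putting the pieces together, a minimal render-to-texture setup might look like the following sketch. It assumes fboId, textureId and rboId have been created as described above, with matching dimensions:
// bind the FBO and attach a texture image (mipmap level 0) as the color buffer
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fboId);
glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT,
                          GL_TEXTURE_2D, textureId, 0);

// attach a renderbuffer image as the depth buffer
glFramebufferRenderbufferEXT(GL_FRAMEBUFFER_EXT, GL_DEPTH_ATTACHMENT_EXT,
                             GL_RENDERBUFFER_EXT, rboId);

// check FBO completeness before rendering to it
GLenum status = glCheckFramebufferStatusEXT(GL_FRAMEBUFFER_EXT);
if(status != GL_FRAMEBUFFER_COMPLETE_EXT)
{
    // handle the error, e.g. fall back to the window-system-provided framebuffer
}

// render to the texture here, then switch back to the default framebuffer
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, 0);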

OpenGL Vertex Buffer Object (VBO)


GL_ARB_vertex_buffer_object extension is intended to enhance the performance of OpenGL by providing the benefits of vertex array and display list, while avoiding downsides of their implementations. Vertex buffer object (VBO) allows vertex array data to be stored in high-performance graphics memory on the server side and promotes efficient data transfer. If the buffer object is used to store pixel data, it is called Pixel Buffer Object (PBO).
Using vertex arrays can reduce the number of function calls and the redundant usage of shared vertices. However, the disadvantage of vertex arrays is that vertex array functions are in the client state, and the data in the arrays must be re-sent to the server each time they are referenced.
On the other hand, a display list is a server-side function, so it does not suffer from the overhead of data transfer. But, once a display list is compiled, the data in the display list cannot be modified.
Vertex buffer object (VBO) creates "buffer objects" for vertex attributes in high-performance memory on the server side and provides the same access functions to reference the arrays that are used in vertex arrays, such as glVertexPointer(), glNormalPointer(), glTexCoordPointer(), etc.
The memory manager in vertex buffer object will put the buffer objects into the best place of memory based on user's hints: "target" and "usage" mode. Therefore, the memory manager can optimize the buffers by balancing between 3 kinds of memory: system, AGP and video memory.
Unlike display lists, the data in vertex buffer object can be read and updated by mapping the buffer into client's memory space.
Another important advantage of VBO is sharing the buffer objects with many clients, like display lists and textures. Since VBO is on the server's side, multiple clients will be able to access the same buffer with the corresponding identifier.

Creating VBO

Creating a VBO requires 3 steps;
  1. Generate a new buffer object with glGenBuffersARB().
  2. Bind the buffer object with glBindBufferARB().
  3. Copy vertex data to the buffer object with glBufferDataARB().

glGenBuffersARB()

glGenBuffersARB() creates buffer objects and returns the identifiers of the buffer objects. It requires 2 parameters: the first one is the number of buffer objects to create, and the second parameter is the address of a GLuint variable or array to store a single ID or multiple IDs.
void glGenBuffersARB(GLsizei n, GLuint* ids)

glBindBufferARB()

Once the buffer object has been created, we need to hook the buffer object with the corresponding ID before using the buffer object. glBindBufferARB() takes 2 parameters: target and ID.
void glBindBufferARB(GLenum target, GLuint id)
Target is a hint to tell VBO whether this buffer object will store vertex array data or index array data: GL_ARRAY_BUFFER_ARB, or GL_ELEMENT_ARRAY_BUFFER_ARB. Any vertex attributes, such as vertex coordinates, texture coordinates, normals and color component arrays should use GL_ARRAY_BUFFER_ARB. Index array which is used for glDraw[Range]Elements() should be tied with GL_ELEMENT_ARRAY_BUFFER_ARB. Note that this target flag assists VBO to decide the most efficient locations of buffer objects, for example, some systems may prefer indices in AGP or system memory, and vertices in video memory.
When glBindBufferARB() is first called, VBO initializes the buffer with a zero-sized memory buffer and sets the initial VBO states, such as usage and access properties.

glBufferDataARB()

You can copy the data into the buffer object with glBufferDataARB() when the buffer has been initialized.
void glBufferDataARB(GLenum target, GLsizei size, const void* data, GLenum usage)
Again, the first parameter, target, would be GL_ARRAY_BUFFER_ARB or GL_ELEMENT_ARRAY_BUFFER_ARB. Size is the number of bytes of data to transfer. The third parameter is the pointer to the array of source data. If data is a NULL pointer, then VBO reserves only a memory space of the given data size. The last parameter, the "usage" flag, is another performance hint for VBO that describes how the buffer object is going to be used: static, dynamic or stream, and draw, read or copy.
VBO specifies 9 enumerated values for usage flags;
GL_STATIC_DRAW_ARB
GL_STATIC_READ_ARB
GL_STATIC_COPY_ARB
GL_DYNAMIC_DRAW_ARB
GL_DYNAMIC_READ_ARB
GL_DYNAMIC_COPY_ARB
GL_STREAM_DRAW_ARB
GL_STREAM_READ_ARB
GL_STREAM_COPY_ARB
"Static" means the data in VBO will not be changed (specified once and used many times),"dynamic" means the data will be changed frequently (specified and used repeatedly), and "stream"means the data will be changed every frame (specified once and used once). "Draw" means the data will be sent to GPU in order to draw (application to GL), "read" means the data will be read by the client's application (GL to application), and "copy" means the data will be used both drawing and reading (GL to GL).
Note that only draw token is useful for VBO, and copy and read token will be become meaningful only for pixel/frame buffer object (PBO or FBO).
VBO memory manager will choose the best memory places for the buffer object based on these usage flags, for example, GL_STATIC_DRAW_ARB and GL_STREAM_DRAW_ARB may use video memory, and GL_DYNAMIC_DRAW_ARB may use AGP memory. Any _READ_ related buffers would be fine in system or AGP memory because the data should be easy to access.

glBufferSubDataARB()

void glBufferSubDataARB(GLenum target, GLint offset, GLsizei size, void* data)
Like glBufferDataARB(), glBufferSubDataARB() is used to copy data into VBO, but it only replaces a range of data into the existing buffer, starting from the given offset. (The total size of the buffer must be set by glBufferDataARB() before using glBufferSubDataARB().)
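For instance, a small sketch of overwriting the first three vertices (9 floats) of an existing VBO; newVertices is assumed to hold the replacement data:
// replace 9 floats at the beginning of the buffer (offset 0)
glBindBufferARB(GL_ARRAY_BUFFER_ARB, vboId);
glBufferSubDataARB(GL_ARRAY_BUFFER_ARB, 0, 9*sizeof(GLfloat), newVertices);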

glDeleteBuffersARB()

void glDeleteBuffersARB(GLsizei n, const GLuint* ids)
You can delete a single VBO or multiple VBOs with glDeleteBuffersARB() if they are not used anymore. After a buffer object is deleted, its contents will be lost.
The following code is an example of creating a single VBO for vertex coordinates. Notice that you can delete the memory allocation for vertex array in your application after you copy data into VBO.
GLuint vboId;                              // ID of VBO
GLfloat* vertices = new GLfloat[vCount*3]; // create vertex array
...

// generate a new VBO and get the associated ID
glGenBuffersARB(1, &vboId);

// bind VBO in order to use
glBindBufferARB(GL_ARRAY_BUFFER_ARB, vboId);

// upload data to VBO
glBufferDataARB(GL_ARRAY_BUFFER_ARB, dataSize, vertices, GL_STATIC_DRAW_ARB);

// it is safe to delete after copying data to VBO
delete [] vertices;
...

// delete VBO when program terminated
glDeleteBuffersARB(1, &vboId);

Drawing VBO

Because VBO sits on top of the existing vertex array implementation, rendering with VBO is almost the same as using vertex arrays. The only difference is that the pointer to the vertex array is now an offset into the currently bound buffer object. Therefore, no additional APIs are required to draw a VBO except glBindBufferARB().
// bind VBOs for vertex array and index array
glBindBufferARB(GL_ARRAY_BUFFER_ARB, vboId1);         // for vertex coordinates
glBindBufferARB(GL_ELEMENT_ARRAY_BUFFER_ARB, vboId2); // for indices

// do same as vertex array except pointer
glEnableClientState(GL_VERTEX_ARRAY);             // activate vertex coords array
glVertexPointer(3, GL_FLOAT, 0, 0);               // last param is offset, not ptr

// draw 6 quads using offset of index array
glDrawElements(GL_QUADS, 24, GL_UNSIGNED_BYTE, 0);

glDisableClientState(GL_VERTEX_ARRAY);            // deactivate vertex array

// bind with 0, so, switch back to normal pointer operation
glBindBufferARB(GL_ARRAY_BUFFER_ARB, 0);
glBindBufferARB(GL_ELEMENT_ARRAY_BUFFER_ARB, 0);
Binding the buffer object with 0 switches off VBO operation. It is a good idea to turn VBO off after use, so normal vertex array operations with absolute pointers will be re-activated.

Updating VBO

The advantage of VBO over display lists is that the client can read and modify the buffer object data, while a display list cannot be changed. The simplest method of updating a VBO is copying new data into the bound VBO again with glBufferDataARB() or glBufferSubDataARB(). In this case, your application needs to keep a valid vertex array around all the time. That means you must always have 2 copies of the vertex data: one in your application and the other in the VBO.
The other way to modify buffer object is to map the buffer object into client's memory, and the client can update data with the pointer to the mapped buffer. The following describes how to map VBO into client's memory and how to access the mapped data.

glMapBufferARB()

VBO provides glMapBufferARB() in order to map the buffer object into client's memory.
void* glMapBufferARB(GLenum target, GLenum access)
If OpenGL is able to map the buffer object into client's address space, glMapBufferARB() returns the pointer to the buffer. Otherwise it returns NULL.
The first parameter, target, was mentioned earlier at glBindBufferARB(), and the second parameter, the access flag, specifies what to do with the mapped data: read, write or both.
GL_READ_ONLY_ARB
GL_WRITE_ONLY_ARB
GL_READ_WRITE_ARB
Note that glMapBufferARB() causes a synchronizing issue. If GPU is still working with the buffer object, glMapBufferARB() will not return until GPU finishes its job with the corresponding buffer object.
To avoid waiting (idling), you can first call glBufferDataARB() with a NULL pointer, then call glMapBufferARB(). In this case, the previous data will be discarded and glMapBufferARB() returns a newly allocated pointer immediately even if the GPU is still working with the previous data.
However, this method is valid only if you want to update the entire data set, because you discard the previous data. If you want to change only a portion of the data or to read data, you had better not release the previous data.
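A small sketch of this discard-then-map pattern (dataSize is assumed to be the same size used when the VBO was created):
// discard the old contents, then map without waiting for the GPU
glBindBufferARB(GL_ARRAY_BUFFER_ARB, vboId);
glBufferDataARB(GL_ARRAY_BUFFER_ARB, dataSize, NULL, GL_STREAM_DRAW_ARB);
float* ptr = (float*)glMapBufferARB(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB);
// ... fill ptr with the complete new data set, then unmap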

glUnmapBufferARB()

GLboolean glUnmapBufferARB(GLenum target)
After modifying the data of a VBO, the buffer object must be unmapped from the client's memory. glUnmapBufferARB() returns GL_TRUE on success. When it returns GL_FALSE, the contents of the VBO became corrupted while the buffer was mapped. The corruption results from a screen resolution change or window-system-specific events. In this case, the data must be resubmitted.
Here is a sample code to modify VBO with mapping method.
// bind then map the VBO
glBindBufferARB(GL_ARRAY_BUFFER_ARB, vboId);
float* ptr = (float*)glMapBufferARB(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB);

// if the pointer is valid(mapped), update VBO
if(ptr)
{
    updateMyVBO(ptr, ...);                 // modify buffer data
    glUnmapBufferARB(GL_ARRAY_BUFFER_ARB); // unmap it after use
}

// you can draw the updated VBO
...

OpenGL Transformation


Geometric data such as vertex positions and normal vectors are transformed by the Vertex Operation and Primitive Assembly stages of the OpenGL pipeline before the rasterization process.
OpenGL vertex transformation
OpenGL vertex transformation

Object Coordinates

It is the local coordinate system of objects, and is the initial position and orientation of objects before any transform is applied. In order to transform objects, use glRotatef(), glTranslatef(), glScalef().

Eye Coordinates

It is yielded by multiplying the GL_MODELVIEW matrix and object coordinates. Objects are transformed from object space to eye space using the GL_MODELVIEW matrix in OpenGL. The GL_MODELVIEW matrix is a combination of the Model and View matrices (Mview * Mmodel). The Model transform converts from object space to world space. And, the View transform converts from world space to eye space.
OpenGL eye coordinates
Note that there is no separate camera (view) matrix in OpenGL. Therefore, in order to simulate transforming the camera or view, the scene (3D objects and lights) must be transformed with the inverse of the view transformation. In other words, OpenGL defines that the camera is always located at (0, 0, 0) and facing the -Z axis in eye space coordinates, and cannot be transformed. See more details of the GL_MODELVIEW matrix in ModelView Matrix.
Normal vectors are also transformed from object coordinates to eye coordinates for the lighting calculation. Note that normals are transformed in a different way than vertices: a normal vector is multiplied by the transpose of the inverse of the GL_MODELVIEW matrix. See more details in Normal Vector Transformation.
normal vector transformation

Clip Coordinates

It is yielded by multiplying the eye coordinates by the GL_PROJECTION matrix. Objects are clipped against the viewing volume (frustum). The frustum determines how objects are projected onto the screen (perspective or orthographic) and which objects or portions of objects are clipped out of the final image. See more details of the GL_PROJECTION matrix in Projection Matrix.
OpenGL clip coordinates

Normalized Device Coordinates (NDC)

It is yielded by dividing the clip coordinates by w. It is called perspective division. It is more like window (screen) coordinates, but has not been translated and scaled to screen pixels yet. The range of values is now normalized from -1 to 1 in all 3 axes.
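In formula form, the perspective division is simply:
x_ndc = x_clip / w_clip,   y_ndc = y_clip / w_clip,   z_ndc = z_clip / w_clip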
OpenGL Normalized Device Coordinates

Window Coordinates (Screen Coordinates)

It is yielded by applying the normalized device coordinates (NDC) to the viewport transformation. The NDC are scaled and translated in order to fit into the rendering screen. The window coordinates are finally passed to the rasterization process of the OpenGL pipeline to become a fragment. The glViewport() command is used to define the rectangle of the rendering area where the final image is mapped. And, glDepthRange() is used to determine the z range of the window coordinates. The window coordinates are computed with the given parameters of the above 2 functions;
glViewport(x, y, w, h);
glDepthRange(n, f);
OpenGL Window Coordinates
The viewport transform formula is simply acquired by the linear relationship between NDC and the window coordinates;
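With glViewport(x, y, w, h) and glDepthRange(n, f) as above, that linear mapping works out to:
x_w = (w/2) * x_ndc + (x + w/2)
y_w = (h/2) * y_ndc + (y + h/2)
z_w = ((f-n)/2) * z_ndc + (f+n)/2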


OpenGL Transformation Matrix
OpenGL Transform Matrix
OpenGL Transform Matrix
OpenGL uses a 4 x 4 matrix for transformations. Notice that the 16 elements of the matrix are stored as a 1D array in column-major order. You need to transpose this matrix if you want to convert it to the standard convention, row-major format.
OpenGL has 4 different types of matrices: GL_MODELVIEW, GL_PROJECTION, GL_TEXTURE, and GL_COLOR. You can switch the current type by using glMatrixMode() in your code. For example, in order to select the GL_MODELVIEW matrix, use glMatrixMode(GL_MODELVIEW).

Model-View Matrix (GL_MODELVIEW)

The GL_MODELVIEW matrix combines the viewing matrix and modeling matrix into one matrix. In order to transform the view (camera), you need to move the whole scene with the inverse transformation. gluLookAt() is particularly used to set the viewing transform.
Columns of OpenGL ModelView matrix
4 columns of GL_MODELVIEW matrix
The 3 matrix elements of the rightmost column (m12, m13, m14) are for the translation transformation, glTranslatef(). The element m15 is the homogeneous coordinate. It is specially used for projective transformation.
The 3 element sets (m0, m1, m2), (m4, m5, m6) and (m8, m9, m10) are for Euclidean and affine transformations, such as rotation glRotatef() or scaling glScalef(). Note that these 3 sets actually represent 3 orthogonal axes;
  • (m0, m1, m2)   : +X axis, left vector, (1, 0, 0) by default
  • (m4, m5, m6)   : +Y axis, up vector, (0, 1, 0) by default
  • (m8, m9, m10) : +Z axis, forward vector, (0, 0, 1) by default
We can directly construct GL_MODELVIEW matrix from angles or lookat vector without using OpenGL transform functions. Here are some useful codes to build GL_MODELVIEW matrix:
  • Angles to Axes
  • Lookat to Axes
Note that OpenGL performs matrix multiplications in reverse order when multiple transforms are applied to a vertex. For example, if a vertex is transformed by MA first, and transformed by MB second, then OpenGL performs MB x MA first before multiplying the vertex. So, the last transform comes first and the first transform occurs last in your code.
// Note that the object will be translated first then rotated
glRotatef(angle, 1, 0, 0);   // rotate object angle degree around X-axis
glTranslatef(x, y, z);       // move object to (x, y, z)
drawObject();

Projection Matrix (GL_PROJECTION)

GL_PROJECTION matrix is used to define the frustum. This frustum determines which objects or portions of objects will be clipped out. Also, it determines how the 3D scene is projected onto the screen. (Please see more details how to construct the projection matrix.)
OpenGL provides 2 functions for the GL_PROJECTION transformation. glFrustum() produces a perspective projection, and glOrtho() produces an orthographic (parallel) projection. Both functions require 6 parameters to specify 6 clipping planes: the left, right, bottom, top, near and far planes. The 8 vertices of the viewing frustum are shown in the following image.
OpenGL Perspective Frustum
OpenGL Perspective Viewing Frustum
The vertices of the far (back) plane can be simply calculated by the ratio of similar triangles, for example, the left of the far plane is;
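left(far) = left(near) * far / near
(the same ratio applies to the right, bottom and top of the far plane)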
OpenGL Orthographic Frustum
OpenGL Orthographic Frustum
For orthographic projection, this ratio will be 1, so the left, right, bottom and top values of the far plane will be the same as on the near plane.
You may also use the gluPerspective() and gluOrtho2D() functions, which take fewer parameters. gluPerspective() requires only 4 parameters: the vertical field of view (FOV), the aspect ratio of width to height, and the distances to the near and far clipping planes. The equivalent conversion from gluPerspective() to glFrustum() is shown in the following code.
// This creates a symmetric frustum.
// It converts to 6 params (l, r, b, t, n, f) for glFrustum()
// from given 4 params (fovy, aspect, near, far)
void makeFrustum(double fovY, double aspectRatio, double front, double back)
{
    const double DEG2RAD = 3.14159265 / 180;

    double tangent = tan(fovY/2 * DEG2RAD);   // tangent of half fovY
    double height = front * tangent;          // half height of near plane
    double width = height * aspectRatio;      // half width of near plane

    // params: left, right, bottom, top, near, far
    glFrustum(-width, width, -height, height, front, back);
}

An example of an asymmetric frustum
However, you have to use glFrustum() directly if you need to create a non-symmetrical viewing volume. For example, if you want to render a wide scene into 2 adjoining screens, you can break down the frustum into 2 asymmetric frustums (left and right). Then, render the scene with each frustum.
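As a rough sketch (the symmetric frustum parameters r, t, n, f and the drawScene() call are placeholders for your own values and drawing code), the split might look like:
// left half of the scene: an asymmetric frustum from -r to 0
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glFrustum(-r, 0, -t, t, n, f);
drawScene();                       // render into the left screen/viewport

// right half of the scene: an asymmetric frustum from 0 to +r
glLoadIdentity();
glFrustum(0, r, -t, t, n, f);
drawScene();                       // render into the right screen/viewport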

Texture Matrix (GL_TEXTURE)

Texture coordinates (s, t, r, q) are multiplied by the GL_TEXTURE matrix before any texture mapping. By default it is the identity, so the texture will be mapped to objects exactly where you assigned the texture coordinates. By modifying GL_TEXTURE, you can slide, rotate, stretch, and shrink the texture.
// rotate texture around X-axis
glMatrixMode(GL_TEXTURE);
glRotatef(angle, 1, 0, 0);

Color Matrix (GL_COLOR)

The color components (r, g, b, a) are multiplied by the GL_COLOR matrix. It can be used for color space conversion and color component swapping. The GL_COLOR matrix is not commonly used and requires the GL_ARB_imaging extension.

Other Matrix Routines

glPushMatrix() : 
push the current matrix into the current matrix stack.
glPopMatrix() : 
pop the current matrix from the current matrix stack.
glLoadIdentity() : 
set the current matrix to the identity matrix.
glLoadMatrix{fd}(m) : 
replace the current matrix with the matrix m.
glLoadTransposeMatrix{fd}(m) : 
replace the current matrix with the row-major ordered matrix m.
glMultMatrix{fd}(m) : 
multiply the current matrix by the matrix m, and update the result to the current matrix.
glMultTransposeMatrix{fd}(m) : 
multiply the current matrix by the row-major ordered matrix m, and update the result to the current matrix.
glGetFloatv(GL_MODELVIEW_MATRIX, m) : 
return 16 values of GL_MODELVIEW matrix to m

Thursday, January 28, 2010

RGB to Grayscale

RGB to Grayscale

In order to convert an RGB or BGR color image to a grayscale image, we frequently use the following conversion formulae:
Luminance = 0.3086 * Red + 0.6094 * Green + 0.0820 * Blue
Luminance = 0.299 * Red + 0.587 * Green + 0.114 * Blue



Notice that the luminance intensity is a weighted sum of the color components. If we used the same weight for each, for example (R + G + B) / 3, then pure red, pure green and pure blue would all result in the same grayscale level. The other reason for using different weights is that the human eye is more sensitive to the green and red components than to the blue channel.

Fast Approximation

Implementing the conversion formula is quite simple in C/C++;
// interleaved color, RGBRGB...
for ( int i = 0; i < size; i += 3 )
{
    // add weighted sum then round
    out[i] = (unsigned char)(0.299*in[i] + 0.587*in[i+1] + 0.114*in[i+2] + 0.5);
}
Since multiplication in integer domain is much faster than in float, we rewrite the formula like this for faster computation;
Luminance = (2 * Red + 5 * Green + 1 * Blue) / 8
Furthermore, we can optimize the code by substituting bit shift operators and additions for the multiplication and division. I have found this to be about 4 times faster than the original float computation.
int tmp;
for ( int i = 0; i < size; i += 3 )
{
    // faster computation, (2*R + 5*G + B) / 8
    tmp = in[i] << 1;                   // 2 * red
    tmp += (in[i+1] << 2) + in[i+1];    // 5 * green (4*G + G)
    tmp += in[i+2];                     // 1 * blue
    out[i] = (unsigned char)(tmp >> 3); // divide by 8
}

OpenGL Pipeline

The OpenGL pipeline has a series of processing stages in order. Two kinds of graphical data, vertex-based data and pixel-based data, are processed through the pipeline, combined together, then written into the frame buffer. Notice that OpenGL can send the processed data back to your application. (See the grey colour lines.)
OpenGL Pipeline
OpenGL Pipeline

Display List

A display list is a group of OpenGL commands that have been stored (compiled) for later execution. All data, geometry (vertex) and pixel data, can be stored in a display list. It may improve performance since commands and data are cached in a display list. When an OpenGL program runs over a network, you can reduce data transmission over the network by using display lists. Since display lists are part of server state and reside on the server machine, the client machine needs to send commands and data to the server's display list only once.
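A minimal sketch of compiling and executing a display list; drawObject() stands in for any drawing code:
GLuint listId = glGenLists(1);      // reserve one display list ID
glNewList(listId, GL_COMPILE);      // store the following commands without executing them
    drawObject();                   // e.g. glBegin()/glVertex*()/glEnd() calls
glEndList();
...
glCallList(listId);                 // execute the stored commands later
glDeleteLists(listId, 1);           // delete the list when it is no longer needed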
 

Vertex Operation

Each vertex and normal coordinate is transformed by the GL_MODELVIEW matrix (from object coordinates to eye coordinates). Also, if lighting is enabled, the lighting calculation per vertex is performed using the transformed vertex and normal data. This lighting calculation updates the color of the vertex.
 

Primitive Assembly

After the vertex operation, the primitives (points, lines, and polygons) are transformed once again by the projection matrix, then clipped by the viewing volume clipping planes; from eye coordinates to clip coordinates. After that, perspective division by w occurs and the viewport transform is applied in order to map the 3D scene to window space coordinates. The last thing done in primitive assembly is the culling test, if culling is enabled.
 

Pixel Transfer Operation

After the pixels from the client's memory are unpacked (read), scaling, bias, mapping and clamping operations are performed on the data. These operations are called Pixel Transfer Operations. The transferred data are either stored in texture memory or rasterized directly to fragments.
 

Texture Memory

Texture images are loaded into texture memory to be applied onto geometric objects.
 

Rasterization

Rasterization is the conversion of both geometric and pixel data into fragments. Fragments form a rectangular array containing color, depth, line width, point size and antialiasing calculations (GL_POINT_SMOOTH, GL_LINE_SMOOTH, GL_POLYGON_SMOOTH). If the shading mode is GL_FILL, then the interior pixels (area) of the polygon will be filled at this stage. Each fragment corresponds to a pixel in the frame buffer.
 

Fragment Operation

This is the last stage, converting fragments to pixels in the frame buffer. The first process in this stage is texel generation: a texture element is generated from texture memory and applied to each fragment. Then fog calculations are applied. After that, several fragment tests follow in order: Scissor Test ⇒ Alpha Test ⇒ Stencil Test ⇒ Depth Test.
Finally, blending, dithering, logical operation and masking by a bitmask are performed, and the actual pixel data are stored in the frame buffer.
 

Feedback

OpenGL can return most current states and information through the glGet*() and glIsEnabled() commands. Furthermore, you can read a rectangular area of pixel data from the frame buffer using glReadPixels(), and get fully transformed vertex data using glRenderMode(GL_FEEDBACK). glCopyPixels() does not return pixel data to system memory, but copies it back into another frame buffer, for example, from the front buffer to the back buffer.

OpenGL Introduction

OpenGL Introduction

OpenGL is a software interface to graphics hardware. It is designed as a hardware-independent interface to be used on many different hardware platforms. OpenGL programs can also work across a network (client-server paradigm) even if the client and server are different kinds of computers. The client in OpenGL is the computer on which an OpenGL program actually executes, and the server is the computer that performs the drawing.

OpenGL uses the prefix gl for core OpenGL commands and glu for commands in the OpenGL Utility Library. Similarly, OpenGL constants begin with GL_ and use all capital letters. OpenGL also uses suffixes to specify the number of arguments and the data type passed to an OpenGL call.
glColor3f(1, 0, 0);         // set rendering color to red with 3 floating numbers
glColor4d(0, 1, 0, 0.2);    // set color to green with 20% of opacity (double)
glVertex3fv(vertex);        // set x-y-z coordinates using pointer

State Machine

OpenGL is a state machine. Modes and attributes in OpenGL remain in effect until they are changed. Most state variables can be enabled or disabled with glEnable() or glDisable(). You can also check if a state is currently enabled or disabled with glIsEnabled(). You can save or restore a collection of state variables into/from attribute stacks using glPushAttrib() or glPopAttrib(). The GL_ALL_ATTRIB_BITS parameter can be used to save/restore all states. The attribute stack depth must be at least 16 according to the OpenGL standard.
(Check your maximum stack size with glinfo.)
 
glPushAttrib(GL_LIGHTING_BIT);    // elegant way to change states because
    glDisable(GL_LIGHTING);       // you can restore exact previous states
    glEnable(GL_COLOR_MATERIAL);  // after calling glPopAttrib()
glPushAttrib(GL_COLOR_BUFFER_BIT);
    glDisable(GL_DITHER);
    glEnable(GL_BLEND);



glPopAttrib();                    // restore GL_COLOR_BUFFER_BIT
glPopAttrib();                    // restore GL_LIGHTING_BIT

glBegin() and glEnd()

In order to draw geometric primitives (points, lines, triangles, etc) in OpenGL, you can specify a list of vertex data between glBegin() and glEnd(). This method is called immediate mode. (You may draw geometric primitives using other methods such as vertex array.)
glBegin(GL_TRIANGLES);
    glColor3f(1, 0, 0);     // set vertex color to red
    glVertex3fv(v1);        // draw a triangle with v1, v2, v3
    glVertex3fv(v2);
    glVertex3fv(v3);
glEnd();
There are 10 types of primitives in OpenGL; GL_POINTS, GL_LINES, GL_LINE_STRIP, GL_LINE_LOOP, GL_TRIANGLES, GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN, GL_QUADS, GL_QUAD_STRIP, and GL_POLYGON.

glFlush() & glFinish()

Similar to computer I/O buffering, OpenGL commands are not executed immediately. All commands are stored in buffers first, including network buffers and buffers in the graphics accelerator itself, and await execution until the buffers are full. For example, if an application runs over the network, it is much more efficient to send a collection of commands in a single packet than to send each command over the network one at a time.
glFlush() empties all commands in these buffers and forces all pending commands to be executed immediately without waiting for the buffers to fill. Therefore glFlush() guarantees that all OpenGL commands made up to that point will complete execution in a finite amount of time after calling glFlush(). However, glFlush() does not wait until previous executions are complete and may return immediately to your program. So you are free to send more commands even though previously issued commands are not finished.
glFinish() flushes the buffers and forces commands to begin execution as glFlush() does, but glFinish() blocks the calling program and waits until all execution is complete. Consequently, glFinish() does not return to your program until all previously called commands are complete. It might be used to synchronize tasks or to measure the exact elapsed time that certain OpenGL commands take to execute.
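For instance, a rough timing sketch; getTime() and drawScene() are placeholders for your own timer and drawing code:
double t0 = getTime();
drawScene();
glFinish();                          // wait until the GPU has finished every issued command
double elapsed = getTime() - t0;     // the measurement now covers the actual GPU work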

glVertexArray Explained

Instead of specifying individual vertex data in immediate mode (between glBegin() and glEnd() pairs), you can store vertex data in a set of arrays including vertex coordinates, normals, texture coordinates and color information. And you can draw geometric primitives by dereferencing the array elements with array indices.
Take a look at the following code, which draws a cube in immediate mode.
Each face needs 4 glVertex*() calls to make a quad; for example, the quad at the front is v0-v1-v2-v3. A cube has 6 faces, so the total number of glVertex*() calls is 24. If you also specify normals and colors for the corresponding vertices, the number of function calls increases three-fold: 24 glColor*() and 24 glNormal*() calls are added.
The other thing you should notice is that the vertex "v0" is shared by 3 adjacent polygons: the front, right and up faces. In immediate mode, you have to provide the shared vertex 3 times, once for each face, as shown in the code.
glBegin(GL_QUADS);      // draw a cube with 6 quads
    glVertex3fv(v0);    // front face
    glVertex3fv(v1);
    glVertex3fv(v2);
    glVertex3fv(v3);

    glVertex3fv(v0);    // right face
    glVertex3fv(v3);
    glVertex3fv(v4);
    glVertex3fv(v5);

    glVertex3fv(v0);    // up face
    glVertex3fv(v5);
    glVertex3fv(v6);
    glVertex3fv(v1);

    ...                 // draw other 3 faces

glEnd();
Using vertex arrays reduces the number of function calls and the redundant usage of shared vertices. Therefore, you may increase the performance of rendering. Here, 3 different OpenGL functions for using vertex arrays are explained: glDrawArrays(), glDrawElements() and glDrawRangeElements(). However, an even better approach is to use vertex buffer objects (VBO) or display lists.

Initialization

OpenGL provides glEnableClientState() and glDisableClientState() functions to activate and deactivate 6 different types of arrays. Plus, there are 6 functions to specify the exact positions(addresses) of arrays, so, OpenGL can access the arrays in your application.
  • glVertexPointer():  specify pointer to vertex coords array
  • glNormalPointer():  specify pointer to normal array
  • glColorPointer():  specify pointer to RGB color array
  • glIndexPointer():  specify pointer to indexed color array
  • glTexCoordPointer():  specify pointer to texture cords array
  • glEdgeFlagPointer():  specify pointer to edge flag array
Each specifying function requires different parameters. Please look at the OpenGL function manuals. Edge flags are used to mark whether a vertex is on a boundary edge or not. Hence, only the edges whose edge flags are on will be visible if glPolygonMode() is set to GL_LINE.
Notice that vertex arrays are located in your application (system memory), which is on the client side, and OpenGL on the server side gets access to them. That is why there are distinct commands for vertex arrays, glEnableClientState() and glDisableClientState(), instead of glEnable() and glDisable().

glDrawArrays()

glDrawArrays() reads vertex data from the enabled arrays by marching straight through the array without skipping or hopping. Because glDrawArrays() does not allow hopping around the vertex arrays, you still have to repeat the shared vertices once per face.
glDrawArrays() takes 3 arguments. The first is the primitive type. The second parameter is the starting offset of the array. The last parameter is the number of vertices to pass to the rendering pipeline of OpenGL.
For the above example of drawing a cube, the first parameter is GL_QUADS, the second is 0, which means starting from the beginning of the array, and the last parameter is 24: a cube requires 6 faces and each face needs 4 vertices to build a quad, 6 × 4 = 24.

GLfloat vertices[] = {...}; // 24 of vertex coords
...
// activate and specify pointer to vertex array
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_FLOAT, 0, vertices);

// draw a cube
glDrawArrays(GL_QUADS, 0, 24);

// deactivate vertex arrays after drawing
glDisableClientState(GL_VERTEX_ARRAY);
As a result of using glDrawArrays(), you can replace 24 glVertex*() calls with a single glDrawArrays() call. However, we still need to duplicate the shared vertices, so the number of vertices defined in the array is still 24 instead of 8. glDrawElements() is the solution: it reduces the number of vertices in the array, so less data is transferred to OpenGL.
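
glDrawElements()

A minimal sketch of glDrawElements() drawing the same cube from an index array; the vertex and index arrays match the ones used in the glDrawRangeElements() example below:
GLfloat vertices[] = {...};     // 8 of vertex coords
GLubyte indices[] = {0,1,2,3,   // 24 of indices
                     0,3,4,5,
                     0,5,6,1,
                     1,6,7,2,
                     7,4,3,2,
                     4,7,6,5};
...
// activate and specify pointer to vertex array
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_FLOAT, 0, vertices);

// draw a cube: 24 indices referencing only 8 unique vertices
glDrawElements(GL_QUADS, 24, GL_UNSIGNED_BYTE, indices);

// deactivate vertex arrays after drawing
glDisableClientState(GL_VERTEX_ARRAY);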
The size of the vertex coordinates array is now 8, which is exactly the number of vertices in the cube without any redundant entries.
Note that the data type of the index array is GLubyte instead of GLuint or GLushort. It should be the smallest data type that can fit the maximum index number in order to reduce the size of the index array; otherwise, it may cause a performance drop due to the size of the index array. Since the vertex array contains 8 vertices, GLubyte is enough to store all indices.

Different normals at shared vertex
Another thing you should consider is the normal vectors at the shared vertices. If the normals of the adjacent polygons at a shared vertex are all different, then the normal vectors should be specified as many times as the number of faces, once for each face.

For example, the vertex v0 is shared by the front, right and up faces, but the normals cannot be shared at v0: the normal of the front face is n0, the right face normal is n1 and the up face normal is n2. In this situation, where the normal is not the same at a shared vertex, the vertex cannot be defined only once in the vertex array any more. It must be defined multiple times in the vertex coordinates array in order to match the number of elements in the normal array.

glDrawRangeElements()

Like glDrawElements(), glDrawRangeElements() is also good for hopping around the vertex array. However, glDrawRangeElements() has two more parameters (start and end index) to specify a range of vertices to be prefetched. By adding this restriction of a range, OpenGL may be able to obtain only a limited amount of vertex array data prior to rendering, which may increase performance.
The additional parameters in glDrawRangeElements() are the start and end index; OpenGL then prefetches a limited amount of vertices from these values: end - start + 1. All values in the index array must lie between the start and end index. Note that not every vertex in the range (start, end) must be referenced. But, if you specify a sparsely used range, it causes unnecessary processing for many unused vertices in that range.
GLfloat vertices[] = {...};     // 8 of vertex coords
GLubyte indices[] = {0,1,2,3,   // 24 of indices
                     0,3,4,5,
                     0,5,6,1,
                     1,6,7,2,
                     7,4,3,2,
                     4,7,6,5};
...
// activate and specify pointer to vertex array
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_FLOAT, 0, vertices);

// draw first half, range is 6 - 0 + 1 = 7 vertices
glDrawRangeElements(GL_QUADS, 0, 6, 12, GL_UNSIGNED_BYTE, indices);

// draw second half, range is 7 - 1 + 1 = 7 vertices
glDrawRangeElements(GL_QUADS, 1, 7, 12, GL_UNSIGNED_BYTE, indices+12);

// deactivate vertex arrays after drawing
glDisableClientState(GL_VERTEX_ARRAY);
You can find out maximum number of vertices to be prefetched and the maximum number of indices to be referenced by using glGetIntegerv() with GL_MAX_ELEMENTS_VERTICES and GL_MAX_ELEMENTS_INDICES.
Note that glDrawRangeElements() is available in OpenGL version 1.2 or greater.