Pssst... How Fast Ray Picking Works in SceneJS
Normally a ray-pick is done with expensive computations to find intersections of rays with meshes and so forth.
SceneJS, however, uses a fast GPU-assisted technique that employs the colour buffer to help find the ray-intersection point, which
avoids those sorts of computations altogether.
I couldn’t find anybody else doing this in WebGL or OpenGL ES (maybe I should have looked harder?), so at first I thought it must be too good to be true. However, despite a small amount of numeric inaccuracy when the front and back clip planes are far apart, it seems to work well enough, so I went with it.
- User ray-picks canvas at coordinates (X, Y).
- Do a render pass to a hidden frame buffer, in which the objects within each name are rendered in a colour that uniquely maps to that name.
- Read the colour from the framebuffer at the canvas coordinates, map the colour back to the name value. Now we have the pick name.
- Do a second render pass to another hidden frame buffer, this time rendering just the picked geometry, with each pixel colour being the clip-space Z-value packed into an RGBA value.
- Read the colour from the framebuffer at the canvas coordinates and unpack it to the clip-space Z value. Now we have the clip-space Z, which will be in the range of [0..1], with near clip plane at 0 and far clip plane at 1.
- Transform the canvas coordinates to clip-space. Make a ray from clip space (X,Y,0) to (X,Y,1) and transform that ray into world-space by the inverse view and projection matrices.
- Linearly interpolate along the ray by the value of our clip-space Z, to find the world-space coordinate (X,Y,Z).
- Voila, we have the picked name, canvas (X,Y) and world-space (X,Y,Z) for the pick hit.
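The CPU-side parts of these steps can be sketched in JavaScript roughly as follows. This is only an illustration under stated assumptions, not SceneJS's actual code: all function names are hypothetical, the matrix is assumed to be a column-major 4x4 array holding the inverse of (projection × view), and the ray endpoints are taken at normalised-device depths -1 and +1, which is the near-to-far range WebGL uses, so that a depth of 0 lands on the near plane and 1 on the far plane.

```javascript
// Hypothetical helpers; none of these names come from SceneJS.

// Map a pick index to a unique RGBA colour (one byte per channel).
function indexToColor(index) {
  return [
    index & 0xff,
    (index >>> 8) & 0xff,
    (index >>> 16) & 0xff,
    (index >>> 24) & 0xff
  ];
}

// Map a colour read back from the pick buffer to its pick index.
function colorToIndex(rgba) {
  return (rgba[0] | (rgba[1] << 8) | (rgba[2] << 16) | (rgba[3] << 24)) >>> 0;
}

// Canvas pixel coordinates to clip-space X and Y in [-1, 1].
function canvasToClip(canvasX, canvasY, width, height) {
  return [
    (canvasX / width) * 2 - 1,
    1 - (canvasY / height) * 2 // canvas Y grows downwards
  ];
}

// Transform a 3D point by a column-major 4x4 matrix, with perspective divide.
function transformPoint(m, p) {
  const x = p[0], y = p[1], z = p[2];
  const w = m[3] * x + m[7] * y + m[11] * z + m[15];
  return [
    (m[0] * x + m[4] * y + m[8] * z + m[12]) / w,
    (m[1] * x + m[5] * y + m[9] * z + m[13]) / w,
    (m[2] * x + m[6] * y + m[10] * z + m[14]) / w
  ];
}

// Build the near-to-far ray in world space, then interpolate along it
// by the unpacked depth (0 = near plane, 1 = far plane).
function pickWorldPos(clipXY, depth, inverseViewProj) {
  const near = transformPoint(inverseViewProj, [clipXY[0], clipXY[1], -1]);
  const far = transformPoint(inverseViewProj, [clipXY[0], clipXY[1], 1]);
  return near.map((n, i) => n + (far[i] - n) * depth);
}
```

Note that the final interpolation is only meaningful if the depth written by the second pass is linear between the near and far planes, which is why that pass computes its own depth value instead of relying on the usual post-projection Z.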
Packing clip-space Z in GLSL
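As a rough illustration of the idea, the sketch below mirrors in JavaScript the kind of arithmetic a fragment shader can use to spread a depth value across the four colour channels, together with the unpacking that happens after the pixel is read back. It uses a common base-255 encoding, which is an assumption on my part and not necessarily what SceneJS's shader does. The names `packDepth`, `toBytes` and `unpackDepth` are hypothetical, and the depth is assumed to lie in [0, 1).

```javascript
// Fractional part, like GLSL's fract().
const fract = (v) => v - Math.floor(v);

// Shader side: spread a depth in [0, 1) across four channels in [0, 1).
function packDepth(depth) {
  const r = fract(depth);
  const g = fract(depth * 255);
  const b = fract(depth * 65025);     // 255^2
  const a = fract(depth * 16581375);  // 255^3
  // Remove from each channel the part that the next, finer channel carries,
  // so that each of the first three channels is a whole number of 255ths.
  return [r - g / 255, g - b / 255, b - a / 255, a];
}

// Framebuffer: each channel ends up stored as a single byte.
function toBytes(rgba) {
  return rgba.map((c) => Math.round(c * 255));
}

// Reading side: rebuild the depth from the four bytes of the picked pixel.
function unpackDepth(bytes) {
  return (
    bytes[0] / 255 +
    bytes[1] / 65025 +
    bytes[2] / 16581375 +
    bytes[3] / 4228250625 // 255^4
  );
}
```

A depth of exactly 1 would wrap around to 0 with this encoding, so a real shader would want to clamp the value to just below 1 before packing.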
Calculating clip-space Z in the fragment shader
The fourth step above (the second render pass) requires that we have the view-space position in the fragment shader, which we pass through from the vertex shader. It also requires us to feed the locations of the near and far clipping planes into the fragment shader (which we take from the scene’s camera node). Using these, we calculate the clip-space depth, normalised so that the near clip plane maps to 0 and the far clip plane maps to 1.
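A minimal sketch of that depth calculation, written in JavaScript to show the arithmetic only. It assumes the usual OpenGL convention that the camera looks down the negative view-space Z axis, and that `near` and `far` are positive distances. The function name is hypothetical, and the clamp is my own addition.

```javascript
// viewZ: view-space Z of the fragment (negative in front of the camera).
// near, far: positive distances to the clipping planes.
function normalizedDepth(viewZ, near, far) {
  const depth = (-viewZ - near) / (far - near);
  // Clamp so that rounding error never pushes the value outside [0, 1].
  return Math.min(1, Math.max(0, depth));
}
```

Because this mapping is linear in view-space Z, a value of 0.5 means "halfway between the clip planes", which is what makes the later linear interpolation along the ray valid.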
- SceneJS internally caches the hidden frame buffers to avoid re-rendering them. This means that when we do a subsequent pick, as long as no objects have moved or changed appearance in the meantime, we just re-read the buffers without repeating any rendering passes.
- For picking, many WebGL frameworks will save time by doing a picking render of only a 1x1 viewport at the canvas coordinates. SceneJS renders the entire view for picking so that it can cache the pick framebuffers as just mentioned. This is an optimisation geared towards fast mouse-over picking effects in model viewing apps, such as highlighting and tooltips.
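The caching idea can be sketched like this. It is a simplified illustration and not SceneJS's internals: the class name, the `renderPickBuffer` callback and the buffer layout (`{ width, height, pixels }`, with RGBA rows stored top to bottom in canvas order) are all assumptions made for the example.

```javascript
// Hypothetical cache around a full-view pick buffer.
class PickBufferCache {
  // renderPickBuffer: () => { width, height, pixels: Uint8Array }
  constructor(renderPickBuffer) {
    this.renderPickBuffer = renderPickBuffer;
    this.buffer = null; // null means "needs re-rendering"
  }

  // Call whenever an object moves or changes appearance.
  invalidate() {
    this.buffer = null;
  }

  // Return the RGBA bytes at the given canvas coordinates,
  // rendering the pick buffer only if the cached one is stale.
  read(canvasX, canvasY) {
    if (this.buffer === null) {
      this.buffer = this.renderPickBuffer();
    }
    const { width, pixels } = this.buffer;
    const offset = (canvasY * width + canvasX) * 4;
    return Array.from(pixels.subarray(offset, offset + 4));
  }
}
```

With a 1x1 pick viewport, every mouse move would need a fresh render. With the whole view cached, repeated mouse-over picks on a static scene cost only a buffer lookup.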
Drawbacks to this technique
- The technique described here trades accuracy for speed. Packing and unpacking the clip-space Z to and from a colour value is lossy. Hopefully in future it will be possible to instead read the WebGL depth buffer, which will preserve much more precision.
- Another limitation is that this technique does not find any topological information on the pick hit: it only finds the name and a world-space coordinate. When picking a mesh for example, it does not report the actual face that was picked.