Professor Fred DePiero

Range Registration

'TV with a Joystick'

This research focuses on a key technology that could improve real-time remote viewing applications, teleconferencing, or TV.

In current teleconferencing systems, the only possible viewpoints of a remote scene are those acquired from given camera perspectives. Future systems may be able to overcome this limitation. The idea of user-selected viewpoints could be described as "TV with a joystick", and would add great flexibility to remote viewing systems.

A rudimentary and undesirable approach to multiuser viewing would be to provide each user with their own remote camera, pan & tilt positioner, and communication channel. However, this method would not scale well to a large number of viewers: it would clutter the scene with cameras and overload the communication links.

The approach favored here involves acquiring the location of surfaces in a scene (clouds of 3-D points) and then computing views for each user. Surface data could be broadcast to many clients, each of whom could choose their own independent viewpoint. This approach to teleconferencing is scalable for many users. This type of remote viewing system is a confluence of many technologies including range sensing, registration, networking, compression, and computer graphics. Internet-2 provides a great opportunity for remote viewing using the type of system under development here.


"So what does it take to acquire surface data? What is the format of a surface description?"

Sensors generate a 'point cloud' of measurements. These contain a series of (x,y,z) points, each of which may also be tagged with a color (r,g,b). Examples of this type of sensor data appear next. More info is available.

A standard color intensity image of a simple block scene, a range image of the same scene (lighter gray indicates more distant locations), and a pseudo-color version of the range image.
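As a rough illustration of the point cloud format described above, each measurement can be held as an (x,y,z) position with an optional (r,g,b) tag. This is only a sketch: the names SurfacePoint and bounding_box are hypothetical and are not taken from our software.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SurfacePoint:
    """One sensor measurement: a 3-D position with an optional color tag."""
    x: float
    y: float
    z: float
    rgb: Optional[Tuple[int, int, int]] = None  # None when no color was captured


def bounding_box(cloud: List[SurfacePoint]):
    """Return the (min, max) corners of the axis-aligned box enclosing the cloud."""
    xs = [p.x for p in cloud]
    ys = [p.y for p in cloud]
    zs = [p.z for p in cloud]
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


# A tiny, made-up cloud: two colored points and one range-only point.
cloud = [
    SurfacePoint(0.10, 0.20, 1.50, (200, 30, 30)),
    SurfacePoint(0.12, 0.20, 1.52, (198, 32, 29)),
    SurfacePoint(0.40, 0.05, 2.10),
]
lo, hi = bounding_box(cloud)
```

The bounding box is included because it is a natural first step when deciding how much space a discretized surface description has to cover.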

 

There are a number of ways to describe a surface. However, given that we are working with sensor data, it's desirable to work with a surface description that is directly compatible with that data. The surface description that we use is a voxel array, a 3-D discretization of space. Once point clouds have been stored in the voxel array, images can be rendered.
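A minimal sketch of storing a point cloud in a voxel array follows. It assumes a dense array with cubic voxels, and the parameters origin, voxel_size and dims are illustrative choices rather than values from our system.

```python
import math


def build_voxel_array(points, origin, voxel_size, dims):
    """Quantize (x, y, z, color) points into a dense voxel array.

    The array is a flat list of nx*ny*nz cells; an empty cell holds None and
    an occupied cell holds the color of the last point that fell inside it.
    """
    nx, ny, nz = dims
    grid = [None] * (nx * ny * nz)
    for x, y, z, color in points:
        i = math.floor((x - origin[0]) / voxel_size)
        j = math.floor((y - origin[1]) / voxel_size)
        k = math.floor((z - origin[2]) / voxel_size)
        # Points outside the discretized volume are simply dropped.
        if 0 <= i < nx and 0 <= j < ny and 0 <= k < nz:
            grid[(i * ny + j) * nz + k] = color
    return grid


points = [
    (0.05, 0.05, 0.05, (255, 0, 0)),
    (0.95, 0.15, 0.25, (0, 255, 0)),
    (5.00, 0.00, 0.00, (0, 0, 255)),  # lies outside the 1 x 1 x 1 volume
]
grid = build_voxel_array(points, origin=(0.0, 0.0, 0.0), voxel_size=0.1, dims=(10, 10, 10))
occupied = sum(cell is not None for cell in grid)
```

A coarser voxel_size uses less memory but merges nearby points into the same cell, which is the memory versus quantization trade-off that shows up in rendered images.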


"These range images look useful, why not just put these data in the voxel array... When do we get the TV with the joystick?"

Now the bad news starts. A limitation with range sensors is that they are line-of-sight devices. So a single snapshot from a sensor does not capture all the surface information from a scene. This becomes apparent when images are rendered from locations other than the original sensor position.

The left intensity image was acquired directly from a (simulated) sensor. In the right image the viewpoint differed from the sensor location. The white areas in the right image are missing data. This illustrates the limitation of a line-of-sight sensor. Also note that the spatial quantization of these images was kept a bit coarse to illustrate a limitation of voxel-array processing when less memory is used.
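The line-of-sight limitation can be reproduced with a toy example. The sketch below assumes an orthographic view and a simple z-buffer, which is a simplification and not necessarily how our renderer works; the scene (a near block and a far wall) is invented for illustration.

```python
import math


def render(points, yaw_deg, width, height, pixel_size=1.0):
    """Orthographic view of (x, y, z, color) points after a yaw about the y axis.

    Pixels that receive no point stay None, which corresponds to the white
    'missing data' areas in a rendered image.
    """
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    image = [[None] * width for _ in range(height)]
    depth = [[math.inf] * width for _ in range(height)]
    for x, y, z, color in points:
        xr = c * x + s * z   # horizontal position in the viewer's frame
        zr = -s * x + c * z  # depth along the viewing direction
        col = math.floor(xr / pixel_size) + width // 2
        row = math.floor(y / pixel_size) + height // 2
        if 0 <= col < width and 0 <= row < height and zr < depth[row][col]:
            depth[row][col] = zr  # z-buffer: keep the nearest surface
            image[row][col] = color
    return image


def count_missing(image):
    """Number of pixels that no surface point projected onto."""
    return sum(pixel is None for row in image for pixel in row)


# One snapshot taken looking along +z: a near block (z=1) fills the left half
# of the view and a far wall (z=5) fills the right half. The part of the wall
# behind the block was never measured.
snapshot = [(x + 0.5, 0.0, 1.0, "block") for x in range(-4, 0)]
snapshot += [(x + 0.5, 0.0, 5.0, "wall") for x in range(0, 4)]

front = render(snapshot, yaw_deg=0, width=8, height=1)
side = render(snapshot, yaw_deg=30, width=8, height=1)
```

From the sensor's own position every pixel is filled. After the 30 degree change of viewpoint a gap opens between the block and the wall, because that part of the wall was hidden from the sensor.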

 

So to avoid gaps in rendered images, a greater percentage of the surfaces in a scene must be observed. This can be accomplished using either multiple sensors or one sensor that is moved around. In either case a process of registration is needed.

Registration of the surface data 'stitches together' the data sets acquired by a sensor, to form a contiguous set that is expressed with respect to the same coordinate frame. Such data can describe large portions of a scene and may be placed in a voxel array to support image rendering.
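Once a pose has been found for each data set, the 'stitching' step itself is simple: apply a rotation R and translation t to each cloud, then pool the results. The sketch below assumes the poses are already known. The helper names and the example pose are hypothetical.

```python
def apply_rigid(R, t, p):
    """Map point p = (x, y, z) into the common frame: p' = R p + t."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def merge_registered(clouds, poses):
    """Express every cloud in one coordinate frame and concatenate them.

    clouds: list of point lists, each point given as (x, y, z, color)
    poses:  one (R, t) pair per cloud, as produced by some registration step
    """
    merged = []
    for cloud, (R, t) in zip(clouds, poses):
        for x, y, z, color in cloud:
            merged.append(apply_rigid(R, t, (x, y, z)) + (color,))
    return merged


# The first snapshot defines the common frame, so its pose is the identity.
IDENTITY = ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], (0, 0, 0))
# Made-up pose for a second snapshot: a 90 degree turn about z, then a shift in x.
SECOND = ([[0, -1, 0], [1, 0, 0], [0, 0, 1]], (1.0, 0.0, 0.0))

scene = merge_registered(
    [[(0.0, 0.0, 2.0, "red")], [(1.0, 0.0, 2.0, "blue")]],
    [IDENTITY, SECOND],
)
```

The merged list can then be placed in a voxel array like any single point cloud. The hard part, finding R and t quickly and accurately, is not shown here.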

An AVI Clip demonstrates the registration and visualization processing. Here, the sensor is swept to the left exposing new portions of the scene while the viewpoint remains fixed. The images are relatively low resolution, at 200x200 pixels. Note the reduction in occluded portions (white) of the scene as the sensor moves. Significant to this demonstration is the solid position of the objects in the scene. This is a direct result of the accuracy of registration processing. An area of improvement is the 'dirty wall' effect. This is caused by imperfections between the range and color data acquired by the sensor. Post processing techniques are under investigation to reduce this effect. More info on visualization is available.


"That looks good - do some registration!"

Performing registration in real-time is a key technology that is still missing. Many have been working on this problem. Early work began in 1981. Methods of range registration are the major focus of our work.

For some perspective, a system that acquired relatively small sensor images of 320x240 at 25 Hz (roughly motion picture rate) would have to process about 1900K surface data points per second. Our most recent results at 400K are good compared to other reported results, but still require a factor of ~5 improvement in speed to get near the goal of real-time remote viewing. On a related note, our processing rate is on par with the best range sensors of today in terms of points per second. Hence the state of the art in range sensing also needs to improve to reach our goals for remote viewing, even with the small imagery described above, and more so for the image quality of today's TV.
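The arithmetic behind these figures can be checked in a few lines. The variable names are only for illustration.

```python
width, height = 320, 240   # sensor image size in pixels
frame_rate_hz = 25         # target registration rate

required_points_per_s = width * height * frame_rate_hz  # 1,920,000
achieved_points_per_s = 400_000                         # our most recent result

speedup_needed = required_points_per_s / achieved_points_per_s
```

The product is 1,920,000 points per second, which is the ~1900K figure quoted above, and the ratio works out to 4.8, which is the "factor of ~5".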

Our research goals are:

  • Registration at rates of ~25 Hz, using both range and color imagery
  • Deterministic (and fast) processing, to support real-time operation.
  • Ability to handle relatively large jumps in scene content, to accommodate fast sensor movement.
  • Arbitrary scene content, no additional fiducial marks
  • No sensors other than the range & color cameras required. (Although auxiliary sensors can be incorporated as available.)
  • Computing all 6 DOF, for translation and rotation.

Recently we have been working on a new method for registration that involves subgraph isomorphisms, for which we have developed an approximate solution. Our initial results seem quite promising. Summaries of other work in the field are also available. Efforts this year on the registration work include improving the accuracy and stability of our approach and an implementation on a pipelined 4-computer system. We are also investigating a visualization subsystem for real-time display of the registered scene data.


An example of the results of the registration algorithm appears below, computed using the SIPTool. The two scenes are of the Martian surface. Extracted graphs are shown for each of the rotated images. Matched portions of the graphs appear in white. The lower image shows the corresponding points in each image that can be used to find the 6 DOF transformation that relates the range images. More details are available.
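To show how corresponding points determine a 6 DOF transformation, here is a textbook construction for the simplest case: exactly three non-collinear, noise-free correspondences. It builds an orthonormal frame from each triple and aligns the two frames. This is not the graph-matching method described above, and real data would call for a least-squares fit over many correspondences. All function names are hypothetical.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return (v[0] / n, v[1] / n, v[2] / n)


def frame(p0, p1, p2):
    """Right-handed orthonormal basis attached to three non-collinear points."""
    e1 = normalize(sub(p1, p0))
    e3 = normalize(cross(sub(p1, p0), sub(p2, p0)))
    e2 = cross(e3, e1)
    return (e1, e2, e3)


def apply_rigid(R, t, p):
    """Return R p + t."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def rigid_from_three_points(src, dst):
    """Find (R, t) such that dst[i] = R src[i] + t for three matched points."""
    fs, fd = frame(*src), frame(*dst)
    # R maps each source basis vector onto the matching destination vector.
    R = [[sum(fd[k][i] * fs[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = sub(dst[0], apply_rigid(R, (0.0, 0.0, 0.0), src[0]))
    return R, t


# Made-up correspondences: dst is src turned 90 degrees about z, then shifted by (1, 2, 3).
src = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
dst = [(1.0, 2.0, 3.0), (1.0, 3.0, 3.0), (0.0, 2.0, 3.0)]

R, t = rigid_from_three_points(src, dst)
mapped = apply_rigid(R, t, (1.0, 1.0, 1.0))
```

The recovered R and t hold the three rotational and three translational degrees of freedom. Here the extra point (1, 1, 1) maps to (0, 3, 4), as expected for the transformation used to build the example.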


"So what kind of applications would become possible?"

In 'TV with a Joystick' the viewer could control the viewpoint displayed. Some viewers might choose to watch the hands of a golfer, others the ball, others the whole putting green. Granted, more complex scenes might have a high level of occlusion despite multiple sensors: golf might be fine, but basketball a bit complex!

For tele-immersion, two such views could be computed, one for each eye. This would create the perception of depth for the user. The scenes would arrive in real-time and would be naturally colored.

Another application is tele-medicine, where this sort of system could provide useful flexibility. For example, if a field technician positioned a range & color sensor over a patient’s wound, then a remote doctor could examine the injury. Furthermore, if the technician wore either an immersion display or a translucent overlay display, then the doctor’s viewpoint could be graphically presented to the sensor technician. This would allow the technician to anticipate the doctor’s viewing needs (in terms of standoff or location, for example) and would permit a much more efficient viewing experience for the doctor. In another approach, the technician could possibly be replaced by a robot, which would use the doctor’s viewpoint as a basis for path planning when positioning the sensor.



Disclaimer: Any opinions, findings and conclusions or recommendations expressed in this material are those of the author (Fred DePiero) and do not necessarily reflect the views of the Office of Naval Research.