SYNTHESIS OF VIRTUAL VIEWS OF A SCENE

Georges QUÉNOT,

Object: The problem addressed here is the synthesis of arbitrary views of a scene from a limited set of real ones (two or three). It has several important applications, including animation of stereograms (depth is rendered through motion instead of static stereo), 3D movies, 3D TV and 3D teleconferencing (by dynamically synthesizing on a screen the appropriate view for the viewer's head position). The virtual views are generated either by enhanced image interpolation with motion compensation (a kind of morphing) or by 3D surface reconstruction. Both methods rely on an optical flow computation technique that provides a dense and accurate matching between the images.

Description: Optical flow computation consists of extracting a velocity field from an image sequence, assuming that intensity (or color) is conserved during displacement. We chose an optical flow computation technique based on dynamic programming [1] because it has the following characteristics, which make it especially suitable for the targeted application. Once a dense matching has been obtained from a pair of images, a simple image interpolation (morphing) naturally renders views of the scene for intermediate viewpoints, whatever the relative camera positions and without any camera information. If the pair is a horizontal stereo pair, a motion in the vertical direction can be rendered by adding a variable multiple of the horizontal displacement to the vertical one. A motion in the direction of the scene can also be rendered using another simple geometric transformation of the displacement field. Combining these three motions with zoom, pan and rotation makes it possible to create any virtual view. If the camera parameters are known, a depth map, and then a textured 3D surface, can be obtained from the matching field. An alternative method for creating virtual views of the scene is then to build facet-based views of this surface. Both the interpolation and the 3D reconstruction methods can be used with three images; in this case, the images are matched in pairs and the information is fused.
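
The two displacement-field operations described above (interpolating toward an intermediate viewpoint, and adding a multiple of the horizontal displacement to the vertical one) can be sketched as follows. This is a minimal illustration, not the implementation used in [1]. It assumes grayscale images stored as lists of rows, a dense field flow[y][x] = (dx, dy) mapping each pixel of the first image to its match in the second, nearest-pixel forward warping, and no hole filling or occlusion handling. The function names and the parameters t and k are hypothetical.

```python
def add_vertical_parallax(flow, k):
    """Return a new field whose vertical component is dy + k * dx.

    For a horizontal stereo pair, this simulates a vertical viewpoint
    motion; k is a free parameter chosen by the user (assumption).
    """
    return [[(dx, dy + k * dx) for (dx, dy) in row] for row in flow]


def intermediate_view(first, second, flow, t):
    """Forward-warp `first` a fraction t (0..1) of the way along `flow`.

    Each pixel is moved to p + t * d and its intensity is blended with
    the intensity of its match in `second`. Target pixels that receive
    no source pixel are left as None (holes).
    """
    h, w = len(first), len(first[0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Position of the matched pixel in the second image.
            xm, ym = int(round(x + dx)), int(round(y + dy))
            # Position of this pixel in the virtual view.
            xt, yt = int(round(x + t * dx)), int(round(y + t * dy))
            if not (0 <= xt < w and 0 <= yt < h):
                continue  # the pixel leaves the virtual image
            a = first[y][x]
            # Fall back to the first image when the match is off-image.
            b = second[ym][xm] if (0 <= xm < w and 0 <= ym < h) else a
            # Later source pixels overwrite earlier ones: no occlusion test.
            out[yt][xt] = (1 - t) * a + t * b
    return out
```

With t = 0 the output reproduces the first image, and with t = 1 each pixel lands on its match in the second image. Pixels that remain None would have to be filled by a real renderer, for example from the second image or from neighbouring pixels.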

Results and Prospects: Results obtained with these methods were demonstrated at the CVPR'97 conference [2]. Some of them are available at ftp://ftp.limsi.fr/pub/quenot/demo on the LIMSI ftp server. The virtual views appear very realistic provided they are not too far from the original ones. Figure 1 shows three images extracted from the Heinrich Hertz Institute "anne" sequences: left (a), right (b) and top (c) views; a central view (d) synthesized using image interpolation; two arbitrary views (e, f) synthesized by 3D surface reconstruction and facet-based visualization; two side views (g), also synthesized by 3D surface reconstruction, for which the viewpoints are quite far from the original ones; and the corresponding 3D surfaces (h), without mapped texture, represented by slices in three orthogonal directions (spaced 1 cm apart in each direction). The same results can also be seen as MPEG animations: left original sequence, right original sequence, top original sequence, central synthesized sequence, synthesized 3D surface sequence, arbitrary synthesized sequence 1 and arbitrary synthesized sequence 2.
Current limitations are that the original viewpoints must be rather close to each other and that optical flow computation is slow. Further investigations are being carried out to increase the matching accuracy and to greatly reduce the computation time by exploiting the camera parameters (epipolar constraints).
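
To illustrate why epipolar constraints can reduce the matching cost, the sketch below assumes a rectified horizontal stereo pair, so that the match of a left-image pixel lies on the same row of the right image and the search becomes one-dimensional. It uses a plain sum-of-absolute-differences window search, chosen for brevity as a stand-in and not the dynamic programming technique of [1]. The function names, the window size and the out-of-image penalty are assumptions. The second function applies the standard parallel-camera relation Z = f * B / d to turn a disparity into a depth.

```python
def scanline_disparity(left, right, y, x, max_disp, half=1):
    """Find d in [0, max_disp] such that left[y][x] best matches right[y][x - d].

    Only row y is searched (rectified pair assumed), so the cost is
    linear in max_disp instead of quadratic for a 2D search window.
    """
    w = len(left[0])
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        cost = 0
        for k in range(-half, half + 1):
            xl, xr = x + k, x + k - d
            if 0 <= xl < w and 0 <= xr < w:
                cost += abs(left[y][xl] - right[y][xr])
            else:
                cost += 255  # penalize windows that leave the image
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(d, focal_px, baseline):
    """Depth Z = f * B / d for parallel cameras (f in pixels, B in metres)."""
    if d <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline / d
```

For a general (non-rectified) pair the same idea applies along the epipolar line of each pixel, which requires the camera parameters mentioned above.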

References:
[1] Georges M. Quénot, Computation of Optical Flow Using Dynamic Programming, IAPR Workshop on Machine Vision Applications, Tokyo, Japan, 12-14 November 1996.
[2] Georges M. Quénot, Computation of Optical Flow Using Dynamic Programming and Applications, Computer Vision and Pattern Recognition, Demo Program p. 5, San Juan, Puerto Rico, 17-19 June 1997.