Region-based coding of image sequences is currently under investigation as the principal contender for the next generation of coding techniques. Region-based algorithms segment an image into a set of regions according to a given model and estimate the parameters of each region (e.g., color, shape, and motion), which can then be encoded. In this dissertation, methods for image segmentation and motion parameter reduction in region-based video coding are presented.
First, a major difficulty in estimating general motion is that a good estimate requires a large area of support. Unfortunately, a large supporting area is very likely to contain multiple moving objects. Object-based segmentation and general motion estimation are therefore interdependent problems. To solve them jointly, we propose a multi-stage segmentation method that progressively groups the flow field into segments according to a hierarchy of motion models.
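The idea of a hierarchy of motion models can be illustrated with a minimal sketch. It assumes a two-level hierarchy (pure translation, then a six-parameter affine model) fitted by least squares to the flow vectors of one candidate segment. The function names, the two-level choice, and the tolerance `tol` are hypothetical and are not taken from the dissertation.

```python
import numpy as np

def rms_residual(flow, predicted):
    """Root-mean-square length of the flow prediction error."""
    return np.sqrt(np.mean(np.sum((flow - predicted) ** 2, axis=1)))

def fit_translation(flow):
    """Simplest model: one displacement vector for the whole segment."""
    return flow.mean(axis=0)

def design_matrix(points):
    """Rows [1, x, y] for the affine model u = a1 + a2*x + a3*y."""
    return np.column_stack([np.ones(len(points)), points[:, 0], points[:, 1]])

def fit_affine(points, flow):
    """Least-squares affine fit; returns a (3, 2) parameter array."""
    params, *_ = np.linalg.lstsq(design_matrix(points), flow, rcond=None)
    return params

def predict_affine(params, points):
    return design_matrix(points) @ params

def select_model(points, flow, tol=0.05):
    """Try models from simple to complex; keep the first that fits (assumed rule)."""
    t = fit_translation(flow)
    if rms_residual(flow, np.broadcast_to(t, flow.shape)) <= tol:
        return "translation", t
    params = fit_affine(points, flow)
    if rms_residual(flow, predict_affine(params, points)) <= tol:
        return "affine", params
    # Neither model explains the flow: the segment likely mixes several motions.
    return "unresolved", None

# Synthetic 8x8 segment whose flow is a translation plus a small rotation.
ys, xs = np.mgrid[0:8, 0:8]
points = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
flow = np.column_stack([1.0 - 0.1 * points[:, 1], 0.5 + 0.1 * points[:, 0]])
model_name, model_params = select_model(points, flow)
```

In this sketch a segment is only promoted to the more general model when the simpler one leaves a large residual, and a segment that no model explains is flagged as a candidate for further splitting.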
Second, we present a new morphological spatio-temporal segmentation algorithm. The algorithm incorporates luminance and motion information simultaneously and uses morphological tools such as morphological filters and a watershed algorithm. For spatio-temporal segmentation, we propose a simple joint marker extraction technique together with a new joint similarity measure for the boundary decision. By incorporating spatial and temporal information simultaneously into the segmentation procedure, we obtain visually meaningful segmentation results. Simulation results demonstrate the efficiency of the proposed method.
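The role of markers and a joint similarity measure in the boundary decision can be sketched as follows. This is a simplified priority-queue region growing in the style of a marker-based watershed, not the algorithm of the dissertation. The weighted-sum measure `joint_distance`, its weight `alpha`, and the use of fixed marker statistics are assumptions made only for illustration.

```python
import heapq
import numpy as np

def joint_distance(lum, motion, mean_lum, mean_motion, alpha=0.5):
    """Hypothetical joint measure: weighted luminance and motion difference."""
    return alpha * abs(lum - mean_lum) + (1 - alpha) * np.linalg.norm(motion - mean_motion)

def grow_from_markers(lum, motion, markers, alpha=0.5):
    """lum: (H, W) floats; motion: (H, W, 2); markers: (H, W) ints, 0 = undecided."""
    h, w = lum.shape
    labels = markers.copy()
    # Region statistics are taken from the marker pixels and kept fixed.
    stats = {}
    for lab in np.unique(markers):
        if lab != 0:
            mask = markers == lab
            stats[int(lab)] = (lum[mask].mean(), motion[mask].mean(axis=0))
    heap, counter = [], 0

    def push_neighbours(y, x, lab):
        nonlocal counter
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] == 0:
                d = joint_distance(lum[ny, nx], motion[ny, nx], *stats[lab], alpha)
                # counter breaks ties so heap entries never compare further fields
                heapq.heappush(heap, (float(d), counter, ny, nx, lab))
                counter += 1

    for y, x in zip(*np.nonzero(markers)):
        push_neighbours(y, x, int(markers[y, x]))
    # Undecided pixels are assigned in order of increasing joint distance.
    while heap:
        _, _, y, x, lab = heapq.heappop(heap)
        if labels[y, x] == 0:
            labels[y, x] = lab
            push_neighbours(y, x, lab)
    return labels

# Toy frame: left half dark and static, right half bright and moving right.
lum = np.zeros((4, 6)); lum[:, 3:] = 1.0
motion = np.zeros((4, 6, 2)); motion[:, 3:, 0] = 1.0
markers = np.zeros((4, 6), dtype=int)
markers[0, 0], markers[0, 5] = 1, 2
labels = grow_from_markers(lum, motion, markers)
```

Because each undecided pixel is claimed by the region to which it is most similar in both luminance and motion, the boundary in this toy example settles where both cues change.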
Third, a new scheme for coding the affine motion coefficients of the region-based video coder is presented. The core of this scheme consists of the calculation of the real translation motion and the line transformation of the scaling and rotation motions. The calculation of the real translation generates translation terms that have good statistical properties and strong temporal correlation. The line transformation converts the scaling and rotation motions into 2D vector elements that are well correlated.
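Why the raw affine coefficients may be poorly suited to coding can be seen in a small sketch. It assumes the common six-parameter model u = a1 + a2*x + a3*y, v = a4 + a5*x + a6*y with coordinates measured from the image origin, and it interprets a "real" translation as the displacement of the region centroid. That interpretation, the function names, and the small-motion scaling and rotation terms are assumptions for illustration and are not the definitions used in the dissertation.

```python
import numpy as np

def centroid_translation(params, centroid):
    """Displacement of the region centroid under the affine model (assumed reading)."""
    a1, a2, a3, a4, a5, a6 = params
    xc, yc = centroid
    return np.array([a1 + a2 * xc + a3 * yc, a4 + a5 * xc + a6 * yc])

def scale_and_rotation(params):
    """Small-motion scaling (half divergence) and rotation (half curl) terms."""
    _, a2, a3, _, a5, a6 = params
    return 0.5 * (a2 + a6), 0.5 * (a5 - a3)

# A region centred at (10, 20) rotating by a small angle about its own centroid:
# u = -omega * (y - yc), v = omega * (x - xc).
omega, xc, yc = 0.01, 10.0, 20.0
params = (omega * yc, 0.0, -omega, -omega * xc, omega, 0.0)

origin_terms = np.array([params[0], params[3]])   # a1, a4 as stored in the model
translation = centroid_translation(params, (xc, yc))
scale, rotation = scale_and_rotation(params)
```

In this example the origin-referenced terms a1 and a4 are non-zero even though the region does not translate at all, whereas the centroid-referenced translation vanishes and the rotation term carries the motion. Terms of the latter kind depend only on how the region itself moves, not on where it lies in the frame.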
Finally, the proposed methods are applied to region-based video coding. Simulation results show that the proposed region-based coding method achieves better coding performance than H.261. Knowledge of the shape of objects in a scene enables better image reconstruction, especially at object boundaries, and therefore increases coding efficiency. In addition, the segmentation masks form the basis for the content-based functionalities envisaged by MPEG-4.