This paper proposes a hierarchical segmentation method for tracking a semantic video object, based on mathematical morphology and the watershed algorithm. Each level of the hierarchy consists of three basic steps. First, markers are extracted from a simplified version of the current frame. Second, regions are grown from the markers by a modified watershed algorithm, which yields an over-segmented frame. Third, each segmented region is classified as inside, outside, or uncertain according to its region probability value, which is obtained from a probability map calculated from an estimated motion field. For the regions that remain uncertain, the three steps are repeated at lower levels of the hierarchy, using less simplified frames, until every region has been classified as either inside or outside. The proposed algorithm gives promising results on video sequences such as 'Miss America', 'Claire', 'Akiyo', and 'Mother and daughter'.
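The region classification step can be illustrated with a minimal sketch. The abstract does not state how the region probability is computed from the probability map, nor which decision thresholds are used, so the sketch assumes that the region probability is the mean of the per-pixel probabilities inside the region and uses two hypothetical thresholds, `t_in` and `t_out`. The function name `classify_regions` and the toy data are likewise illustrative only.

```python
from collections import defaultdict


def classify_regions(labels, prob_map, t_in=0.8, t_out=0.2):
    """Classify each labelled region as 'inside', 'outside' or 'uncertain'.

    labels   : 2-D list of integer region labels (e.g. a watershed output)
    prob_map : 2-D list of per-pixel object probabilities in [0, 1]
    t_in, t_out : hypothetical thresholds, not taken from the paper
    """
    sums = defaultdict(float)
    counts = defaultdict(int)

    # Accumulate the probability mass and pixel count of every region.
    for label_row, prob_row in zip(labels, prob_map):
        for label, p in zip(label_row, prob_row):
            sums[label] += p
            counts[label] += 1

    classes = {}
    for label in sums:
        # Assumption: region probability = mean pixel probability.
        region_prob = sums[label] / counts[label]
        if region_prob >= t_in:
            classes[label] = "inside"
        elif region_prob <= t_out:
            classes[label] = "outside"
        else:
            # Left for the next, less simplified, hierarchy level.
            classes[label] = "uncertain"
    return classes


# Toy 3x3 frame with three regions.
labels = [
    [1, 1, 2],
    [1, 3, 2],
    [3, 3, 2],
]
prob_map = [
    [0.9, 1.0, 0.1],
    [0.9, 0.5, 0.0],
    [0.4, 0.6, 0.1],
]
classes = classify_regions(labels, prob_map)
```

In this toy frame, region 1 is classified as inside, region 2 as outside, and region 3 as uncertain. In the hierarchical scheme, only the pixels of region 3 would be passed on to the next level for re-segmentation.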