This dissertation proposes indexing techniques using multidimensional file organizations for object-oriented databases. Since most conventional indexing techniques for object-oriented databases use a one-dimensional index structure such as the $B^{+}$-tree, they do not often handle complex queries involving both an attribute and the class hierarchy. Physical database design for multidimensional file organizations is defined as the problem of determining the region splitting strategy for constructing an optimal multidimensional file organization that minimizes the cost of processing a given query pattern. Recently, many multidimensional file organizations have been proposed in the literature. However, there has been no effort for their physical database design. In this dissertation, we first propose a physical database design methodology for the optimal configuration of multidimensional file organizations, and then propose a tunable two-dimensional class indexing technique for object-oriented databases applying this design methodology. We then propose a multidimensionally extended scheme of this indexing technique for indexing nested attributes.
First, we show that the performance of query processing is highly affected by the similarity between the shapes of query regions and page regions in the domain space, and then propose a physical database design method for finding the optimal configuration of the multidimensional file by controlling the interval ratio of the page regions along different axes to achieve the similarity. For performance evaluation, we run extensive experiments with the multilevel grid file, a multidimensional file organization, using various types of queries and object distributions. The results indicate that our proposed method builds optimal multilevel grid files regardless of the query types and object distributions. When the interval ratio of a two-dimensional query region is 1:1024, the performance of the proposed method is enhanced by as much as 7.5 times over that of the conventional method that has the interval ratio of 1:1 employing the cyclic splitting strategy. The performance is further enhanced for the query types having higher interval ratios.
Second, we propose a tunable two-dimensional class indexing technique applying the proposed physical database design methodology. We use two-dimensional file organizations as the index structure. This indexing technique deals with the problem of clustering objects in a two -dimensional domain space that consists of a key attribute domain and a class identifier domain. In conventional class indexing techniques using one- dimensional index structures such as the $B^{+}$-tree, the clustering property is exclusively owned by one attribute. Such one-dimensional structures can be classified into two categories by the way they choose the clustering domain. One is key clustering indexing that clusters objects according to their key values, and the other class clustering indexing that clusters objects according to their class identifiers. These indexing techniques do not handle efficiently the queries that address both the attribute keys and the class identifiers. Our proposed tunable two-dimensional class indexing technique enhances query performance by adjusting the degree of clustering between the key value domain and the class identifier domain based on precollected usage pattern about the queries. For performance evaluation, we first compare our tunable two-dimensional class indexing with the conventional key clustering indexing and the class clustering indexing as a cost model. We then verify the cost model through experiments using the multilevel grid file as the two-dimensional index, and show that our proposed method indeed builds optimal class index structures regardless of the query types and object distributions.
Third, we also propose an extended scheme of the tunable two-dimensional class indexing technique for indexing the nested attribute in object-oriented databases. In this scheme, we construct indexes using multidimensional file organizations that include one class identifier domain per class hierarchy on a path expression that defines the indexed nested attribute. This scheme supports efficiently queries that involve search conditions on the nested attribute represented by an extended path expression. An extended path expression is a one in which a class hierarchy can be substituted by an individual class or a subclass hierarchy in the hierarchy.