We propose new query optimization techniques for object-oriented database systems. Although new query features and new access methods are provided in object-oriented databases, most previous techniques did not consider the effects from these new features. To maximize the performance of object-oriented database systems, efficient query optimization techniques that take account of these new features are crucial. Our query optimization techniques take account of the new query features and new access methods provided in object-oriented databases. They include an intermediate result cardinality,(IRC) estimation technique, a selectivity estimation technique, a join algorithm called OID Join,, and a query optimization algorithm.
IRC estimation technique reflects the effects of partial participation. Partial participation means that only a proper subset of objects in a class are related to the objects of another class. We first show that an object-oriented query often involves partial participation of classes in a relationship and compare it with the case of relational queries. We then present a new technique for estimating the IRC in such a query. Partial participations have not been considered seriously in the previous methods.
Selectivity estimation technique reflects the effects of many-to-many relationships and partial participations. Unlike relational databases, many-to-many relationships are often represented directly via a multi-valued attribute and object sharing in object-oriented databases. Hence, selectivity estimation should take into account the effect of many-to-many relationships. Many-to-many relationships have not been considered seriously in the previous methods. We analyze the effect of many-to-many relationships and propose a new selectivity estimation technique. We finally present some application area of the proposed technique in database performance analysis.
We compare the accuracy of the proposed IRC and selectivity estimation techniques with those of conventional ones. The result shows that the proposed techniques provide significant gain in accuracy. We also present efficient methods for obtaining detailed statistics to be used for accommodating partial participation and many-to-many relationships.
The OID Join algorithm exploits multiple access methods such as pointers and multi-level indexes that are widely provided in object-oriented database systems. Earlier join algorithms in object-oriented database systems use only one access method for accessing a class, and thus might not able to utilize some profitable indexes in the process of the join. We also analyze the cost of our join algorithm and compare it with those of earlier join algorithms.
Query optimization algorithm chooses the optimal evaluation plan for a given query from alternative plans that may use multiple access methods for accessing a class. The algorithm reduces the search space by removing from the query graph the classes that are replaceable by using the multi-level indexes. We call this technique query graph reduction. The cost model in query optimization reflects the effects of both extended query features and the OID Join algorithm.
The most significant advantage of our IRC and selectivity estimation techniques is that the accuracy of the estimates are far enhanced with only a small overhead. Thus, the optimizer will be able to estimate the cost of an execution plan more accurately. Furthermore, multiple joins in a nested predicate can be efficiently processed by using the OID Join algorithm which replaces the joins with index-scans before the search is started.