This thesis proposes a new signature file method that provides multi-key access to large volumes of data. The proposed method, which is based on signature clustering and term discrimination, improves on the two-level signature file method proposed by Sacks-Davis et al. It achieves better retrieval performance by creating a separate, efficient access method for terms with high discriminatory power and by clustering similar signatures on the basis of these terms.
In addition, the proposed method is compared with the method of Sacks-Davis et al. in terms of retrieval cost, insertion cost, and storage requirements. The experimental results show that, although the proposed method incurs additional storage overhead, it outperforms the method of Sacks-Davis et al. with respect to retrieval cost.