The World-Wide Web (WWW) has become one of the most important information resources of this decade. Most information on the Web is in HTML format. So far, HTML documents have been stored in file systems, leading to inefficient document management. Because the Web is growing so fast, it is becoming more and more difficult to find the right information on it. This is contrary to the common perception that the Web is so easy to use that we can find anything we want on it. Thus, there is now a pressing need to devise a method for managing Web documents efficiently.
In this thesis, we design and implement an HTML document storage and retrieval system using an object-oriented DBMS. The system manages HTML documents in the database and supports both structure-based content retrieval and structure-based attribute retrieval. It differs from existing Internet search engines in that it supports structure-based retrieval of HTML documents. We present a data model that maps HTML documents to an object-oriented database schema. We also present a method for converting a query posed through the graphical user interface into an object-oriented database query. The system supports many new queries that have not been possible with existing Internet search engines and can also be used as a new search engine in an Intranet environment.
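
To illustrate the two kinds of structure-based retrieval, the following is a minimal sketch and not the schema or query language of the thesis. It assumes that each HTML element maps to one object holding its tag, attributes, and children, and it keeps these objects in memory rather than in an object-oriented DBMS. The names `Element`, `parse_html`, `find_by_content`, and `find_by_attribute` are hypothetical and introduced only for this illustration.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input"}


@dataclass
class Element:
    """One HTML element as an object (hypothetical mapping, for illustration)."""
    tag: str
    attrs: dict
    children: list = field(default_factory=list)
    text: str = ""  # text directly inside this element

    def full_text(self):
        # Own text followed by descendants' text; exact interleaving is not
        # preserved, which is sufficient for keyword containment tests.
        return self.text + "".join(c.full_text() for c in self.children)

    def walk(self):
        # Depth-first traversal of this element and all its descendants.
        yield self
        for child in self.children:
            yield from child.walk()


class TreeBuilder(HTMLParser):
    """Builds a tree of Element objects from HTML source."""

    def __init__(self):
        super().__init__()
        self.root = Element("document", {})
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Element(tag, dict(attrs))
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].text += data


def parse_html(source):
    builder = TreeBuilder()
    builder.feed(source)
    builder.close()
    return builder.root


def find_by_content(root, tag, keyword):
    """Structure-based content retrieval: elements of `tag` containing `keyword`."""
    kw = keyword.lower()
    return [e for e in root.walk() if e.tag == tag and kw in e.full_text().lower()]


def find_by_attribute(root, tag, name, value):
    """Structure-based attribute retrieval: elements of `tag` with attrs[name] == value."""
    return [e for e in root.walk() if e.tag == tag and e.attrs.get(name) == value]
```

Under these assumptions, `find_by_content(doc, "title", "database")` restricts a keyword search to document titles, and `find_by_attribute(doc, "a", "href", url)` finds anchors that link to a given URL. Both are examples of queries that a purely keyword-based search engine cannot express.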