XML has attracted considerable attention in the field of electronic data exchange. XML documents are already used in many kinds of applications and will be used in many more in the near future. However, hundreds of millions of pages on the web are written in HTML, and HTML is still used to create web documents. It would therefore be convenient if an application could uniformly manage information stored distributively in HTML and XML documents. It would be even more convenient if the application could understand the semantics of existing HTML documents, or execute queries on them, by transforming them into XML documents. Such operations require translating existing HTML documents into XML documents and storing the translated documents. If the translation could be performed fully and automatically, it would be easy for users to manage. However, fully automatic translation has limitations, because a program cannot understand the semantics of HTML documents. As a result, an XML document produced by fully automatic translation is unlikely to reflect the semantics of the original HTML document correctly, or to carry the semantics that users want. Thus, in this paper, we automatically extract the structural information of an HTML document and then construct an XML document from it. The user can then manipulate the constructed XML document through its DTD. This approach produces documents that closely match what users want while reducing the amount of user interaction required. It also supports more efficient query processing than an XML document built fully and automatically.
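
As a rough illustration of what automatic extraction of structural information from an HTML document might look like, the following minimal sketch infers nesting from heading levels and emits a corresponding XML tree. The heuristic itself (h1 to h6 open nested sections, p becomes a paragraph), the class name HeadingStructureExtractor, the function html_to_xml, and the element names document, section, title and para are all assumptions made for this example; they are not the extraction method or the DTD described in this paper.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Assumed heuristic: heading tags define the depth of a section.
HEADINGS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}


class HeadingStructureExtractor(HTMLParser):
    """Hypothetical extractor that nests <section> elements by heading level."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("document")
        # Stack of (heading level, XML element); the root acts as level 0.
        self.stack = [(0, self.root)]
        self.current = None  # XML element currently receiving text, if any

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            level = HEADINGS[tag]
            # Close every open section at the same or a deeper level.
            while self.stack[-1][0] >= level:
                self.stack.pop()
            section = ET.SubElement(self.stack[-1][1], "section")
            self.stack.append((level, section))
            self.current = ET.SubElement(section, "title")
            self.current.text = ""
        elif tag == "p":
            # Paragraphs attach to the innermost open section.
            self.current = ET.SubElement(self.stack[-1][1], "para")
            self.current.text = ""

    def handle_endtag(self, tag):
        if tag in HEADINGS or tag == "p":
            self.current = None

    def handle_data(self, data):
        # Text outside headings and paragraphs is ignored in this sketch.
        if self.current is not None:
            self.current.text += data


def html_to_xml(html):
    """Return an XML string whose nesting mirrors the HTML heading hierarchy."""
    parser = HeadingStructureExtractor()
    parser.feed(html)
    parser.close()
    return ET.tostring(parser.root, encoding="unicode")


if __name__ == "__main__":
    print(html_to_xml("<h1>Intro</h1><p>Text</p><h2>Detail</h2><p>More</p>"))
```

A heuristic of this kind captures only structure, not meaning, which is why a user-driven refinement step through a DTD remains useful after the automatic pass.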