Approximate Matching Between XML Documents and Schemas with Applications in XML Classification and Clustering
Classifying and clustering XML documents by their structural information is important for many document management tasks. In this chapter, we present a suite of algorithms that compute the cost of approximate matching between XML documents and schemas. Based on these document-to-schema distances, we then present a framework for classifying and clustering XML documents by structure. The backbone of the framework is a feature representation in which each document is described by a vector of its distances to the schemas. Experimental studies on various XML data sets suggest that the approach is an efficient and effective solution for structural classification and clustering of XML documents.
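As a rough illustration of the distance-vector idea only, the Python sketch below maps one document to a vector of distances against two schemas and picks the nearest one. Everything in it is an assumption made for illustration: the schemas are modeled as plain sets of allowed (parent, child) tag pairs, and `toy_distance` simply counts disallowed pairs. It is a stand-in, not the approximate matching cost computed by the chapter's algorithms.

```python
import xml.etree.ElementTree as ET


def edges(xml_text):
    """Return every (parent_tag, child_tag) pair found in the document."""
    root = ET.fromstring(xml_text)
    return [(parent.tag, child.tag) for parent in root.iter() for child in parent]


def toy_distance(xml_text, schema_edges):
    """Hypothetical stand-in distance: number of parent/child pairs
    in the document that the schema does not allow."""
    return sum(1 for edge in edges(xml_text) if edge not in schema_edges)


def distance_vector(xml_text, schemas):
    """Feature vector: one distance per schema, in schema order."""
    return [toy_distance(xml_text, schema) for schema in schemas]


def nearest_schema(xml_text, schemas):
    """Index of the schema with the smallest distance to the document."""
    vec = distance_vector(xml_text, schemas)
    return min(range(len(vec)), key=vec.__getitem__)


# Hypothetical schemas, modeled as sets of allowed (parent, child) pairs.
BOOK_SCHEMA = {("book", "title"), ("book", "author"), ("author", "name")}
ARTICLE_SCHEMA = {("article", "title"), ("article", "journal"), ("article", "author")}

doc = "<book><title>T</title><author><name>N</name></author><year>2011</year></book>"
vector = distance_vector(doc, [BOOK_SCHEMA, ARTICLE_SCHEMA])
print(vector)  # [1, 4]
print(nearest_schema(doc, [BOOK_SCHEMA, ARTICLE_SCHEMA]))  # 0
```

In this sketch the vector could be passed to any standard classifier or clustering routine; choosing the smallest entry is just the simplest possible use of it.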
CITATION: Xing, Guangming. Approximate Matching Between XML Documents and Schemas with Applications in XML Classification and Clustering. Edited by Tagarelli, Andrea. Hershey, PA: IGI Global, 2011. XML Data Mining. Available at: https://library.au.int/approximate-matching-between-xml-documents-and-schemas-applications-xml-classification-and