The Role of Schema and Document Matchings in XML Source Clustering
In recent years, the volume and heterogeneity of XML data sources have increased. Moreover, these information sources often comprise both schemas and instances of XML data. In this context, the need to group similar XML documents together has led to growing research on clustering algorithms for XML data. In this chapter, we present an overview of the most popular methods for clustering XML data sources, distinguishing between the intensional data level and the extensional data level, depending on whether the sources to cluster are DTDs and XML schemas, or XML documents; in the latter case, we focus on the structural information of the documents. We classify and describe techniques for computing similarities among XML data sources, and discuss methods for clustering DTDs/XML schemas and XML documents.
CITATION: Meo, Pasquale De. The Role of Schema and Document Matchings in XML Source Clustering, edited by Tagarelli, Andrea. Hershey, PA: IGI Global, 2011. XML Data Mining - Available at: https://library.au.int/role-schema-and-document-matchings-xml-source-clustering