Schemaless Semistructured Data Revisited: Reinventing Peter Buneman's Deterministic Semistructured Data Model
by Keishi Tajima
Abstract
This paper reviews the design of data models for semistructured data,
focusing in particular on their schemaless nature. Uniform treatment
of schema information and data, that is, of metadata and data, is
important in the design of such data models.
This paper discusses what data and metadata are, and argues that
attribute names, which are usually regarded as metadata, and key
values, which are usually regarded as data, play similar roles when we
organize large data sets. The paper revises one of the standard
semistructured data models in accordance with that argument, and
eventually reinvents the deterministic semistructured data model
proposed by Peter Buneman and his colleagues. The contribution of
this paper is an additional rationale for the design of that data
model, one based on the similarity between attribute names and
key values.