SPFE Documentation | Collections > Essays on SPFE > Namespaces in SPFE

Namespaces in SPFE

By Mark Baker

Summary

SPFE Open Toolkit uses namespaces extensively. Namespaces express the ownership of a document type. They do not guarantee that an element name is globally unique, or even unique within a namespace. Some XML vocabularies assign one namespace to a single document type. The SPFE Open Toolkit assigns one namespace to a set of related document types and used a chameleon schema model for building schemas from component pieces. However, you can use namespace however you like when creating your own schemas.

SPFE Open Toolkit makes extensive use of XML namespaces. It does not force your to use namespaces for your own content, but I do recommend that you do. You have some choices in how you use namespaces. The way they are used in EPPO-simple and the SPFE documentation is one choice, but there are others. The purpose of this essay is to explain the choices and why EPPO-simple and the SPFE documentation use namespaces the way they do.

Understanding namespaces

XML Namespaces are a source of considerable confusion. This is not because they are complex, but because they are simpler than we expect them to be, and because, as a late edition to the XML family, they are not treated entirely consistently across other XML technologies. They do less than we expect them to, and they do it in inconsistent ways. Nonetheless, they perform an important function.

The function of XML namespaces is to provide an expanded name for an XML element that can distinguish it from the same local name used in other XML vocabularies. Thus, for instance, the name element is used in both XML schema and in XSLT. In XML Schema is it used to define an element name in a schema. In XSLT it is used to create a new element in the output. Despite having the same name, they are different elements because the element in XML schema is in the http://www.w3.org/2001/XMLSchema namespace and the element in XSLT is in the http://www.w3.org/1999/XSL/Transform namespace. Similarly, many different XML vocabularies define elements with names like title, name, section, link, or table. By giving each of those vocabularies a different namespace, you can tell which title, name, section, link, or table you are looking at. Namespaces names are intended to be globally unique, so there should be no confusion about which vocabulary each element name belongs to.

This makes is sound like namespaces make element names unique in the world. But they don't, and this is one possible source of confusion. Namespace names themselves are supposed to be globally unique. This is why they are usually based on domain names, since domain names are guaranteed to be globally unique thanks to the global domain name registration system. (Note that there is no global registry for XML namespaces themselves.) But adding a globally unique namespace name to an XML element name does not make that element name globally unique, because XML element names are not unique within their own namespace.

In fact, the same element name can occur several times in an XML vocabulary, and can mean different things in each case. A common example is to have an element named name which is used to capture the name of several different things described by the vocabulary. In the SPFE configuration vocabulary, for example, there are several elements named name, and every one of them is part of the SPFE configuration namespace, http://spfeopentoolkit.org/ns/spfe-ot/config. What distinguishes them is their position relative to other elements in a configuration document. They include:

So, having a element name and namespace does not tell you what that element means, or even which element it is. For that, you have to know both the namespace and the context in which the element occurs.

Namespaces, therefore, are not about uniquely identifying an element. Namespaces are actually about ownership. A namespace name tells you who the element name belongs to. Whether or not the name is unique within the namespace is entirely up to the person who owns the namespace. (And yes, that does mean that, technically speaking, XML namespaces are not really namespaces.)

Typically, ownership of a namespace is expressed in terms of the organization and the project within that organization that created the vocabulary that the namespace is assigned to. Thus the namespace identifier, http://www.w3.org/2001/XMLSchema states that it belong to the W3C and to the XML Schema project within the W3C. The SPFE configuration namespace, http://spfeopentoolkit.org/ns/spfe-ot/config states that it belongs to the SPFE Open Toolkit, and to the configuration project within the SPFE Open Toolkit.

Document types and namespaces

You may have noticed that to this point I have not used the phrase “document type”. I have used “vocabulary” to describe a set of XML elements designed to be used together. We may think of schemas and DTDs as defining document types, and that is what we generally use them to do. But actually, neither schemas not DTDs define document types. Rather, documents declare what type of document they are, by choosing what root element to use. Schemas and DTDs simply give them elements to choose from to form their root element, and in many cases, they give them several choices.

With DTDs, any element defined in the DTD can be used by a document as a root element. Despite its name, a DTD is really defining a nested set of elements and a document is free to choose any element anywhere in that nest as its root element. XML Schemas are a bit more restrictive. You can only choose your root element from among those elements declared at the root of the schema. Elements declared inside type definitions cannot be used. This lets you write a schema with only one root element, which effectively confines it to defining a single document type. But one schema document can define multiple root elements. In fact, a schema or a DTD can define multiple root elements that share no child elements at all -- effectively defining multiple separate document types.

In short, schemas, and namespaces, are all about elements. A document, and a document type, is something that we construct out of the raw material of nested elements that schemas and namespaces present to us. A document types is, therefore, not something with a strict definition in XML that we have to adhere to. It is something we can construct for ourselves in ways that meet our own needs. XML just says: here are elements, and here are the rules for elements and their names; make what you want out of them.

This ability to define for ourselves what document types are is powerful, but potentially confusing. It also forces us to make decisions about what document types mean to us and what rules we are going to apply to them.

One of those decisions is about the relationship between namespaces and document types. It is very tempting to think that every document type has a unique namespace and that is the end of the matter. A number of public document types, such as XML schema, XSLT, and HTML work this way. But there is no law that says it has to be this way. These public document types have very broad use and very firmly established semantics. And they don't share much with other public standards. Having unique namespaces for each of them makes sense because of the way they are used.

But if you are setting up a structured writing system, particularly a topic-based structured writing system, you are going to have multiple topic types. But each of those topic types is going to have a great many things in common. You are not going to have a whole different way of doing tables or lists or procedures across a family of closely related topic types. In fact, those topic types are probably going to share a whole bunch of elements that mean exactly the same thing in all of them. Why then should they be in different namespaces?

There are three ways to handle this, which this essay has named the Homogeneous, Heterogeneous, and Chameleon designs.