7.5. Mixed Content Models

Although W3C XML Schema permits mixed content models and describes them better than XML DTDs do, it treats them as an add-on plugged on top of complex content models. The good news is that this allows control of child elements exactly as we've just seen for complex content. The bad news is that we abandon any control over the child text nodes, whose values cannot be constrained at all, and, of course, the descriptions of the child elements are subject to the same limitations as in the case of complex content models. The limitations on unordered content models are probably even more unfriendly for mixed content models, which are more "free style," than they are for complex content models.

7.5.1. Creating Mixed Content Models

This add-on is implemented through a mixed attribute in the xs:complexType (global definition) element, which is otherwise used exactly as we've seen for complex content models. When this attribute is set to "true", text nodes are allowed anywhere within the content model: before, between, and after the child elements. The location, the whitespace processing, and the datatype of these text nodes cannot be restricted in any way.

Let's go back to the definition of our title element and change it to accept a reduced version of XHTML, with an a element for links and an em element to highlight parts of its text. The definition, which was previously done by extending a simple type to create a simple content complex type, needs to be rewritten as a complex content definition with a mixed attribute set to "true". The full definition, including the definition of the a element, the definition of a markedText complex type, and its use to define the title element, could be:

<xs:element name="a">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="href" type="xs:anyURI"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
          
<xs:complexType name="markedText" mixed="true">
  <xs:choice minOccurs="0" maxOccurs="unbounded">
    <xs:element name="em" type="xs:token"/>
    <xs:element ref="a"/>
  </xs:choice>
  <xs:attribute ref="lang"/>
</xs:complexType>
          
<xs:element name="title" type="markedText"/>

This definition matches elements such as:

<title lang="en">
  Being a
  <a href="http://dmoz.org/Shopping/Pets/Dogs/">
    Dog
  </a>
  Is a
  <em>
    Full-Time
  </em>
  Job
</title>

Note that the length of the title can no longer be restricted.
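
As a rough illustration of what the markedText definition above accepts and rejects, consider the following instances. They are made up for this purpose and assume that the lang attribute accepts the value en. The first two should be valid, since text may appear anywhere and the choice may repeat in any order or not at all; the third should be rejected because strong is not declared in the content model:

<!-- valid: text only, no child elements -->
<title lang="en">Being a Dog Is a Full-Time Job</title>

<!-- valid: em before a, with text in between -->
<title lang="en">A <em>Full-Time</em> Job for a <a href="http://dmoz.org/Shopping/Pets/Dogs/">Dog</a></title>

<!-- invalid: strong is not part of markedText -->
<title lang="en">Being a <strong>Dog</strong></title>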

7.5.2. Derivation of Mixed Content Models

Mixed content models are derived exactly like the complex content models on which they have been plugged. The semantics of both derivation methods stay exactly the same.

7.5.2.1. Derivation by extension

Mixed content complex types can be derived by extension from other mixed content complex types, and the meaning is the same as for complex content. If I want to add a strong element to my markedText mixed content type, I can define the following content model:

<xs:element name="title">
  <xs:complexType mixed="true">
    <xs:complexContent mixed="true">
      <xs:extension base="markedText">
        <xs:choice minOccurs="0" maxOccurs="unbounded">
          <xs:element name="strong" type="xs:string"/>
        </xs:choice>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:element>

One must note, though, that this extension is equivalent to:

<xs:complexType name="resultingType" mixed="true">
  <xs:sequence>
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="em" type="xs:token"/>
      <xs:element ref="a"/>
    </xs:choice>
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="strong" type="xs:string"/>
    </xs:choice>
  </xs:sequence>
  <xs:attribute ref="lang"/>
</xs:complexType>

This is probably not what we would like to see in practice, since this content model expects to see all the occurrences of a and em before any instance of strong. We will see later, in Chapter 12, "Creating More Building Blocks Using Object-Oriented Features", that this specific issue can be solved using a feature named "substitution groups" instead of using xs:choice.
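
For comparison, here is a minimal sketch of a content model that allows the three elements in any order, written directly rather than derived by extension. The type name freeMarkedText is hypothetical and is introduced only for this illustration; because the type is not derived from markedText, it cannot be used where a derivation of markedText is required:

<xs:complexType name="freeMarkedText" mixed="true">
  <!-- one repeatable choice: em, a, and strong may appear in any order -->
  <xs:choice minOccurs="0" maxOccurs="unbounded">
    <xs:element name="em" type="xs:token"/>
    <xs:element ref="a"/>
    <xs:element name="strong" type="xs:string"/>
  </xs:choice>
  <xs:attribute ref="lang"/>
</xs:complexType>

With this type, a title such as <title lang="en">A <strong>big</strong> <em>dog</em></title> would be accepted, whereas the type obtained by extension rejects it because em follows strong.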

7.5.2.2. Derivation by restriction

The derivation of mixed content models by restriction is also done using the method defined for complex content models, with the same constraint that each particle must be an explicit derivation of the corresponding particle of the base type. To illustrate the consequences of this constraint, let's look again at the definition and the use of our markedText:

<xs:element name="a">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="href" type="xs:anyURI"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
             
<xs:complexType name="markedText" mixed="true">
  <xs:choice minOccurs="0" maxOccurs="unbounded">
    <xs:element name="em" type="xs:token"/>
    <xs:element ref="a"/>
  </xs:choice>
  <xs:attribute ref="lang"/>
</xs:complexType>
             
<xs:element name="title" type="markedText"/>

If we want to forbid em elements in our title, force the href to be an http absolute URI, and require the lang attribute to be either en or es, we need to do some refactoring to show that the a element included in our title is an explicit derivation of the general definition of a. We also need to use a global complex type definition for a instead of the previous anonymous definition:

<xs:element name="a" type="link"/>
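
The link type referenced by this declaration is not shown here. A plausible sketch, assuming it simply gives a name to the anonymous definition previously used for a, would be:

<!-- assumed definition: the earlier anonymous type of a, made global -->
<xs:complexType name="link">
  <xs:simpleContent>
    <xs:extension base="xs:string">
      <xs:attribute name="href" type="xs:anyURI"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>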

We can now either derive a new global complex type from the new link complex type or embed its derivation in the definition of our title element:

<xs:element name="title">
  <xs:complexType mixed="true">
    <xs:complexContent mixed="true">
      <xs:restriction base="markedText">
        <xs:choice minOccurs="0" maxOccurs="unbounded">
          <xs:element name="a">
            <xs:complexType>
              <xs:simpleContent>
                <xs:restriction base="link">
                  <xs:attribute name="href">
                    <xs:simpleType>
                      <xs:restriction base="xs:anyURI">
                        <xs:pattern value="http://.*"/>
                      </xs:restriction>
                    </xs:simpleType>
                  </xs:attribute>
                </xs:restriction>
              </xs:simpleContent>
            </xs:complexType>
          </xs:element>
        </xs:choice>
        <xs:attribute name="lang">
          <xs:simpleType>
            <xs:restriction base="xs:language">
              <xs:enumeration value="en"/>
              <xs:enumeration value="es"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:attribute>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
</xs:element>

This example is a caricature. In practice it would be more readable to create an intermediate global type definition to avoid embedding several derivations, but it provides an overview of this derivation process.
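
One way such an intermediate definition might look is sketched below. The name httpLink is hypothetical; the type moves the restriction of link out of the title definition so that it can be read, and reused, on its own:

<!-- hypothetical intermediate type: a link whose href must start with http:// -->
<xs:complexType name="httpLink">
  <xs:simpleContent>
    <xs:restriction base="link">
      <xs:attribute name="href">
        <xs:simpleType>
          <xs:restriction base="xs:anyURI">
            <xs:pattern value="http://.*"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:restriction>
  </xs:simpleContent>
</xs:complexType>

Under this assumption, the local declaration of a inside the restricted title shrinks to <xs:element name="a" type="httpLink"/>, and the rest of the definition stays as shown above.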

7.5.2.3. Derivation between complex and mixed content models

Since complex and mixed content models are built using the same mechanism, one may wonder what the possibilities are for deriving complex content from mixed content and vice versa. The answer to this question lies in the semantics of the two derivation methods.

Derivation by extension appends new content after the content of the base type, and the structure of the base type is kept unchanged. It is therefore not possible to extend a complex content model into a mixed content model: when a content model is mixed, the position of the text nodes cannot be constrained, so the derived type would permit text nodes at any location within the content inherited from the base type. For the same reason, it is impossible to extend a mixed content model into a complex content model because the text nodes that are allowed in the base type would become forbidden.

Derivation by restriction defines a subset of the base type. It is forbidden to derive a mixed content model from a complex content model: the resulting type would allow text nodes that are forbidden in the base type and would expand rather than restrict the content model. The remaining combination is the only possible one: a mixed content model can be restricted into a complex content model. Forbidding the text nodes of a mixed content model is a valid restriction and can be done by setting the mixed attribute to "false" in the xs:complexType definition. It is even possible to derive a simple content model from a mixed content model, since this is, in fact, a restriction that removes the child elements and keeps the text nodes. This assumes, of course, that the child elements are optional; i.e., they have a minOccurs attribute equal to 0.
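
The two restrictions of a mixed content model described above could be sketched as follows, using the markedText type defined earlier as the base. The type names elementsOnlyText and plainText and the maxLength value are hypothetical, and support for the second form may vary between schema processors:

<!-- mixed restricted into complex content: same child elements, no text nodes -->
<xs:complexType name="elementsOnlyText" mixed="false">
  <xs:complexContent mixed="false">
    <xs:restriction base="markedText">
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element name="em" type="xs:token"/>
        <xs:element ref="a"/>
      </xs:choice>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>

<!-- mixed restricted into simple content: child elements removed, text kept -->
<xs:complexType name="plainText">
  <xs:simpleContent>
    <xs:restriction base="markedText">
      <!-- the embedded simple type says what the remaining text must look like -->
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:maxLength value="255"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:restriction>
  </xs:simpleContent>
</xs:complexType>

The second form works here only because the choice in markedText has minOccurs equal to 0, so the child elements can be dropped entirely. The lang attribute is inherited from the base type in both cases.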


Copyright © 2002 O'Reilly & Associates. All rights reserved.