Earlier, the xs:simpleContent element was used to declare an element that could only contain simple content:
<xs:element name="fullName">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="language" type="xs:language"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
The base type for the extension in this case was the built-in xs:string data type. But simple types are not limited to the predefined types. The xs:simpleType element can define new simple data types, which can be referenced by element and attribute declarations within the schema.
To show how new simple types can be defined, let's extend the phone element from the example application to support a new attribute called location, which differentiates between work and home phone numbers. The attribute will have a new simple type called locationType, which is referenced from the contactsType definition:
<xs:complexType name="contactsType">
  <xs:sequence>
    <xs:element name="phone" minOccurs="0">
      <xs:complexType>
        <xs:attribute name="number" type="xs:string"/>
        <xs:attribute name="location" type="addr:locationType"/>
      </xs:complexType>
    </xs:element>
  </xs:sequence>
</xs:complexType>

<xs:simpleType name="locationType">
  <xs:restriction base="xs:string"/>
</xs:simpleType>
Of course, a location type that just maps to the built-in xs:string type isn't particularly useful. Fortunately, schemas can strictly control the possible values of simple types through a mechanism called facets.
In schema-speak, a facet is an aspect of a possible value for a simple data type. Depending on the base type, some facets make more sense than others. For example, a numeric data type can be restricted by the minimum and maximum possible values it could contain. But these types of restrictions wouldn't make sense for a boolean value. The following list covers the different facet types that are supported by a schema processor:
length (or minLength and maxLength)
pattern
enumeration
whiteSpace
maxInclusive and maxExclusive
minInclusive and minExclusive
totalDigits
fractionDigits
Facets are applied to simple types using the xs:restriction element. Each facet is expressed as a distinct element within the restriction block, and multiple facets can be combined to further restrict potential values of the simple type.
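The whiteSpace facet controls how the schema processor will deal with any whitespace within the target data. Whitespace normalization takes place before any of the other facets are processed. There are three possible values for the whiteSpace facet: preserve, which leaves the value untouched; replace, which converts each tab, line feed, and carriage return to a space; and collapse, which performs the replace step, then reduces each run of spaces to a single space and strips leading and trailing spaces.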
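As a minimal sketch of the facet in use, the following type applies the collapse setting together with a length limit. The type name trimmedString is hypothetical and is not part of the example application.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical type: trimmedString does not appear in the example schema -->
  <xs:simpleType name="trimmedString">
    <xs:restriction base="xs:string">
      <!-- Tabs and line breaks become spaces, runs of spaces shrink to one,
           and leading and trailing spaces are removed -->
      <xs:whiteSpace value="collapse"/>
      <!-- Checked against the normalized value, so "  Al  " counts as 2 characters -->
      <xs:maxLength value="10"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```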
The length-restriction facets are fairly easy to understand. The length facet forces a value to be exactly the length given. The minLength and maxLength facets can be used to set a definite range for the lengths of values of the type given. For example, take the nameComponent type from the schema. What if a name component could not exceed 50 characters (because of a database limitation, for instance)? This rule can be enforced by using the maxLength facet. Incorporating this facet requires a new simple type to reference from within the nameComponent complex type definition:
<xs:complexType name="nameComponent">
  <xs:simpleContent>
    <xs:extension base="addr:nameString"/>
  </xs:simpleContent>
</xs:complexType>

<xs:simpleType name="nameString">
  <xs:restriction base="xs:string">
    <xs:maxLength value="50"/>
  </xs:restriction>
</xs:simpleType>
The new nameString simple type is derived from the built-in xs:string type, but can contain no more than 50 characters (the default is unlimited). The same approach can be used with the length and minLength facets.
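The other two length facets might be sketched as follows. Both type names (stateCode and requiredNameString) are hypothetical and serve only to illustrate the facets.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical: exactly two characters, such as a state abbreviation -->
  <xs:simpleType name="stateCode">
    <xs:restriction base="xs:string">
      <xs:length value="2"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Hypothetical: between 1 and 50 characters, so an empty value is rejected -->
  <xs:simpleType name="requiredNameString">
    <xs:restriction base="xs:string">
      <xs:minLength value="1"/>
      <xs:maxLength value="50"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```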
One of the more useful types of restriction is the simple enumeration. In many cases, it is sufficient to restrict possible values for an element or attribute to a member of a predefined list. For example, values of the new locationType simple type defined earlier could be restricted to a list of valid options like so:
<xs:simpleType name="locationType">
  <xs:restriction base="xs:string">
    <xs:enumeration value="work"/>
    <xs:enumeration value="home"/>
    <xs:enumeration value="mobile"/>
  </xs:restriction>
</xs:simpleType>
Then, if the location attribute in any instance document contained a value not found in the list of enumeration values, the schema processor would generate a validity error.
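To make this concrete, here are a few instance fragments checked against the enumerated locationType. The phone numbers are made up, and the fragments assume the phone element appears without a namespace prefix, which depends on how the rest of the schema is set up.

```xml
<!-- Alternative fragments, not a single document -->

<!-- Valid: "home" is one of the enumerated values -->
<phone number="555-0100" location="home"/>

<!-- Invalid: "office" is not in the enumeration -->
<phone number="555-0101" location="office"/>

<!-- Invalid: enumeration values are case-sensitive, so "Work" does not match "work" -->
<phone number="555-0102" location="Work"/>
```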
Almost half of the built-in data types defined by the schema specification represent numeric data of one type or another. More might be called numeric, since the date/time and duration types are considered to be scalar quantities as well. The following two sections cover all of the numeric facets available, but for a comprehensive list of which of these facets are applicable to which data types, see Chapter 21.
Four facets control the minimum and maximum values of items: minInclusive, minExclusive, maxInclusive, and maxExclusive.
The primary difference between the inclusive and exclusive flavors of the min and max facets is whether the value given is considered part of the set of allowable values. For example, when applied to an integer type, the following two facet declarations are equivalent, since both allow 0 as the largest value:
<xs:maxInclusive value="0"/>
<xs:maxExclusive value="1"/>
The difference between inclusive and exclusive becomes more significant when dealing with decimal or floating point values. For example, if maxExclusive were set to 5.0, the equivalent maxInclusive value would require an infinite number of nines to the right of the decimal point (4.99999...). These facets can also be applied to date and time values.
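A short sketch shows the range facets on both a decimal and a date type. The type names percentType and year2002Date are hypothetical. Note how maxExclusive expresses "before January 1, 2003" without having to name the last allowable value.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical: any decimal from 0 up to and including 100 -->
  <xs:simpleType name="percentType">
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Hypothetical: any date on or after 2002-01-01 and before 2003-01-01 -->
  <xs:simpleType name="year2002Date">
    <xs:restriction base="xs:date">
      <xs:minInclusive value="2002-01-01"/>
      <xs:maxExclusive value="2003-01-01"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```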
Two facets control the length and precision of decimal numeric values: totalDigits and fractionDigits. The totalDigits facet sets the maximum number of digits allowed in a complete number (only digits are counted, not signs or decimal points). The fractionDigits facet sets the maximum number of those digits that may appear to the right of the decimal point.
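As an illustration, the following sketch combines both facets on a hypothetical priceType (the name is an assumption, not part of the example application). The comments list a few values and how a schema processor should treat them.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical type: at most six digits in all, at most two after the point -->
  <xs:simpleType name="priceType">
    <xs:restriction base="xs:decimal">
      <xs:totalDigits value="6"/>
      <xs:fractionDigits value="2"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Accepted: 1234.56 (six digits), 99.5 (one fraction digit), 42 (no fraction) -->
  <!-- Rejected: 1234567 (seven digits), 1.234 (three fraction digits) -->
</xs:schema>
```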
The xs:pattern facet can place very sophisticated restrictions on the format of string values. The schema processor compares the value in question against a regular expression, and if the value doesn't conform to the expression, the processor generates a validation error. For example, this xs:simpleType element declares a social security number simple type using the pattern facet:
<xs:simpleType name="ssn">
  <xs:restriction base="xs:string">
    <xs:pattern value="\d\d\d-\d\d-\d\d\d\d"/>
  </xs:restriction>
</xs:simpleType>
This new simple type enforces the rule that a social security number consists of three digits, a dash, two digits, another dash, and finally four more digits. The actual regular-expression language is very similar to that of the Perl programming language, but it also supports a wide range of Unicode characters. See Chapter 21 for more information on the full pattern-matching language.
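The same rule can be written more compactly with quantifiers, as in this sketch. Unlike a Perl expression, a schema pattern must match the entire value, so no ^ or $ anchors are needed.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="ssn">
    <xs:restriction base="xs:string">
      <!-- {n} repeats the preceding item exactly n times -->
      <xs:pattern value="\d{3}-\d{2}-\d{4}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```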
XML 1.0 provided a few very simple list types that could be declared as possible attribute values: IDREFS, ENTITIES, and NMTOKENS. Schemas have generalized the concept of lists and provide the ability to declare lists of arbitrary types.
These list types are themselves simple types and may be used in the same places other simple types are used. For example, if the fullName element were to be expanded to accommodate multiple middle names, one approach would be to declare the middle element to contain a list of nameString values:
<xs:element name="middle" type="addr:nameList" minOccurs="0"/>
. . .
<xs:complexType name="nameList">
  <xs:simpleContent>
    <xs:extension base="addr:nameListType"/>
  </xs:simpleContent>
</xs:complexType>

<xs:simpleType name="nameListType">
  <xs:list itemType="addr:nameString"/>
</xs:simpleType>
After this change has been made, the middle element of an instance document can contain a whitespace-separated list of any number of names, each of which can contain up to 50 characters. Because whitespace separates the items, an individual name cannot itself contain whitespace. The use of xs:complexType here will greatly simplify adding attributes later.
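A few instance fragments illustrate how a schema processor would split the content into list items. The names are made up, and the fragments assume the middle element appears without a namespace prefix, which depends on how the rest of the schema is set up.

```xml
<!-- Alternative fragments, not a single document -->

<!-- One item -->
<middle>Fitzgerald</middle>

<!-- Three items, separated by any mix of spaces, tabs, or line breaks -->
<middle>Maria  Teresa
  Guadalupe</middle>

<!-- Two items, not one: "Mary Ann" is read as "Mary" and "Ann" -->
<middle>Mary Ann</middle>
```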
In some cases, it is useful to allow potential values for elements and attributes to have any of several types. The xs:union element allows a type to be declared that can draw from multiple type spaces. For example, it might be useful to allow users to enter their own one-word descriptions into the location attribute of the phone element, as well as to choose from a list. The location attribute declaration could be modified to include a union that incorporates the locationType type and the xs:NMTOKEN type:
<xs:attribute name="location">
  <xs:simpleType>
    <xs:union memberTypes="addr:locationType xs:NMTOKEN"/>
  </xs:simpleType>
</xs:attribute>
Now the location attribute can contain either addr:locationType or xs:NMTOKEN content.
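The following instance fragments sketch how values would be checked against the union, assuming locationType enumerates work, home, and mobile. The phone numbers and the descriptions are made up, and the fragments assume the phone element appears without a namespace prefix.

```xml
<!-- Alternative fragments, not a single document -->

<!-- Valid: matches the locationType enumeration -->
<phone number="555-0100" location="work"/>

<!-- Valid: not enumerated, but a legal xs:NMTOKEN -->
<phone number="555-0101" location="cottage"/>

<!-- Invalid: the embedded space rules out both member types -->
<phone number="555-0102" location="summer cottage"/>
```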
Copyright © 2002 O'Reilly & Associates. All rights reserved.