
7.2. DOM Class Interface Reference

Since DOM is becoming the interface of choice in the Perl-XML world, it deserves more elaboration. The following sections describe class interfaces individually, listing their properties, methods, and intended purposes.

WARNING: The DOM specification calls for UTF-16 as the standard encoding. However, most Perl implementations assume a UTF-8 encoding. Due to limitations in Perl, working with characters of lengths other than 8 bits is difficult. This will change in a future version, and encodings like UTF-16 will be supported more readily.

7.2.1. Document

The Document class controls the overall document, creating new objects when requested and maintaining high-level information such as references to the document type declaration and the root element.

7.2.1.1. Properties

doctype

The document type declaration, represented as a DocumentType node.

documentElement

The root element of the document.

7.2.1.2. Methods

createElement, createTextNode, createComment, createCDATASection, createProcessingInstruction, createAttribute, createEntityReference

Generates a new node object.

createElementNS, createAttributeNS (DOM2 only)

Generates a new element or attribute node object with a specified namespace qualifier.

createDocumentFragment

Creates a container object for a document's subtree.

getElementsByTagName

Returns a NodeList of all elements having a given tag name at any level of the document.

getElementsByTagNameNS (DOM2 only)

Returns a NodeList of all elements having a given namespace URI and local name. The asterisk character (*) matches any local name or any namespace, allowing you to find all elements in a given namespace.

getElementById (DOM2 only)

Returns a reference to the node that has a specified ID attribute.

importNode (DOM2 only)

Creates a new node that is the copy of a node from another document. Acts like a "copy to the clipboard" operation for importing markup.
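The Document methods above can be sketched in a few lines. This example uses Python's standard xml.dom.minidom as a stand-in DOM implementation (an assumption of this sketch, chosen because it is self-contained and uses the same DOM method names; it is not one of the Perl modules). The sample markup and variable names are invented for illustration.

```python
from xml.dom.minidom import parseString

doc = parseString("<inventory><item id='a1'>bolt</item></inventory>")
root = doc.documentElement            # the root element, <inventory>

# Factory methods create nodes owned by doc; they are not in the tree
# until you attach them explicitly.
item = doc.createElement("item")
item.appendChild(doc.createTextNode("nut"))
root.appendChild(item)
root.appendChild(doc.createComment("end of list"))

# Collect the text of every <item> at any depth.
item_texts = [node.firstChild.data for node in doc.getElementsByTagName("item")]

# importNode copies a node from another document; the second argument
# asks for a deep copy (descendants included).
other = parseString("<extra><item>washer</item></extra>")
copied = doc.importNode(other.documentElement.firstChild, True)
root.appendChild(copied)
item_count = len(doc.getElementsByTagName("item"))
```

Note that the imported copy belongs to the receiving document, while the original node stays untouched in its own document.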

7.2.2. DocumentFragment

The DocumentFragment class is used to contain a document fragment. Its children are (zero or more) nodes representing the tops of XML trees. This class contrasts with Document, which has at most one child element, the document root, plus metadata like the document type. In this respect, DocumentFragment's content is not well-formed, though it must obey the XML well-formedness rules in all other respects (no illegal characters in text, etc.).

No specific methods or properties are defined; use the generic node methods to access data.
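As a rough illustration of how a fragment acts as a temporary container, the sketch below builds two sibling elements in a DocumentFragment and then attaches them in one step. It uses Python's standard xml.dom.minidom as a stand-in DOM implementation (an assumption of this sketch, not a Perl module); the element names are invented.

```python
from xml.dom.minidom import parseString

doc = parseString("<list><entry>one</entry></list>")

# Build several top-level nodes inside a fragment: no single root is required.
frag = doc.createDocumentFragment()
for word in ("two", "three"):
    entry = doc.createElement("entry")
    entry.appendChild(doc.createTextNode(word))
    frag.appendChild(entry)
children_in_fragment = len(frag.childNodes)

# Appending the fragment moves its children into the tree,
# leaving the fragment itself empty.
doc.documentElement.appendChild(frag)
children_left = len(frag.childNodes)
entries = [e.firstChild.data for e in doc.getElementsByTagName("entry")]
```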

7.2.3. DocumentType

This class contains all the information contained in the document type declaration at the beginning of the document, except the specifics about an external DTD. Thus, it names the root element and any declared entities or notations in the internal subset.

No specific methods are defined for this class. Its properties are public but read-only.

7.2.3.1. Properties

name

The name of the root element.

entities

A NamedNodeMap of entity declarations.

notations

A NamedNodeMap of notation declarations.

internalSubset (DOM2 only)

The internal subset of the DTD represented as a string.

publicId (DOM2 only)

The public identifier of the DTD's external subset.

systemId (DOM2 only)

The system identifier of the DTD's external subset.
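The DocumentType properties can be read as in the following sketch, which uses Python's standard xml.dom.minidom as a stand-in DOM implementation (an assumption of this sketch, not a Perl module). The public identifier, file name, and entity are made up for the example, and the external DTD is never fetched.

```python
from xml.dom.minidom import parseString

doc = parseString(
    '<!DOCTYPE note PUBLIC "-//Example//DTD Note//EN" "note.dtd" '
    '[<!ENTITY co "Example Corp">]>'
    "<note/>"
)
doctype = doc.doctype

root_name = doctype.name            # name of the root element
public_id = doctype.publicId        # identifiers of the external subset
system_id = doctype.systemId

# entities is a NamedNodeMap keyed by entity name.
entity_count = doctype.entities.length
entity = doctype.entities.getNamedItem("co")
```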

7.2.4. Node

All node types inherit from the class Node. Any properties or methods common to all node types can be accessed through this class. A few properties, such as the value of the node, are undefined for some node types, like Element. The generic methods of this class are useful in some programming contexts, such as when writing code that processes nodes of different types. At other times, you'll know in advance what type you're working with, and you should use the specific class's methods instead.

All properties but nodeValue and prefix are read-only.

7.2.4.1. Properties

nodeName

A property that is defined for elements, attributes, and entities. In the context of elements this property would be the tag's name.

nodeValue

A property defined for attributes, text nodes, CDATA nodes, PIs, and comments.

nodeType

One of the following types of nodes: Element, Attr, Text, CDATASection, EntityReference, Entity, ProcessingInstruction, Comment, Document, DocumentType, DocumentFragment, or Notation.

parentNode

A reference to the parent of this node.

childNodes

An ordered list of references to children of this node (if any).

firstChild, lastChild

References to the first and last of the node's children (if any).

previousSibling, nextSibling

The node immediately preceding or following this one, respectively.

attributes

An unordered list (NamedNodeMap) of nodes that are attributes of this one (if any).

ownerDocument

A reference to the object containing the whole document -- useful when you need to generate a new node.

namespaceURI (DOM2 only)

A namespace URI if this node has a namespace prefix; otherwise it is null.

prefix (DOM2 only)

The namespace prefix associated with this node.

7.2.4.2. Methods

insertBefore

Inserts a node before a reference child node.

replaceChild

Swaps a child node with a new one you supply, giving you the old one in return.

appendChild

Adds a new node to the end of this node's list of children.

hasChildNodes

True if there are children of this node; otherwise, it is false.

cloneNode

Returns a duplicate of this node. It provides an alternate way to generate nodes. All properties will be identical except for parentNode, which will be undefined, and, for a shallow copy, childNodes, which will be empty. Cloned elements have the same attributes as the original. If the argument deep is set to true, then the node and all its descendants will be copied.

hasAttributes (DOM2 only)

Returns true if this node has defined attributes.

isSupported (DOM2 only)

Returns true if this implementation supports a specific feature.
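The generic Node interface is easiest to see in use. The following is a minimal sketch using Python's standard xml.dom.minidom as a stand-in DOM implementation (an assumption of this sketch, not a Perl module); the recipe markup is invented.

```python
from xml.dom.minidom import parseString, Node

doc = parseString("<recipe><title>Tea</title><step>Boil water</step></recipe>")
recipe = doc.documentElement
title, step = recipe.childNodes

# Navigation properties work the same way on every node type.
nav = (title.parentNode is recipe,
       title.nextSibling is step,
       recipe.lastChild is step)

# nodeType tells the classes apart; nodeValue is undefined (None) for elements.
is_element = recipe.nodeType == Node.ELEMENT_NODE
is_text = title.firstChild.nodeType == Node.TEXT_NODE
element_value = recipe.nodeValue

# Tree surgery with the generic methods.
note = doc.createElement("note")
recipe.insertBefore(note, step)          # note now precedes step
shallow = step.cloneNode(False)          # same tag, no children, no parent
deep = step.cloneNode(True)              # descendants copied as well
old = recipe.replaceChild(deep, step)    # returns the node that was replaced
order = [child.nodeName for child in recipe.childNodes]
```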

7.2.5. NodeList

This class is a container for an ordered list of nodes. It is "live," meaning that any changes to the document tree are reflected in the list immediately.

7.2.5.1. Properties

length

Returns an integer indicating the number of nodes in the list.

7.2.5.2. Methods

item

Given an integer value n, returns a reference to the nth node in the list, starting at zero.
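A short sketch of length and item, again using Python's standard xml.dom.minidom as a stand-in DOM implementation (an assumption of this sketch, not a Perl module). It uses childNodes to show the "live" behavior; in minidom specifically, the lists returned by getElementsByTagName appear to be snapshots instead, so liveness can vary between implementations.

```python
from xml.dom.minidom import parseString

doc = parseString("<ul><li>a</li><li>b</li></ul>")
ul = doc.documentElement
kids = ul.childNodes                 # a NodeList

before = kids.length
first = kids.item(0)                 # indexing starts at zero
missing = kids.item(5)               # out of range yields None, not an error

# The same list object tracks later edits to the tree.
ul.appendChild(doc.createElement("li"))
after = kids.length
```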

7.2.6. NamedNodeMap

This unordered set of nodes is designed to allow access to nodes by name. An alternate access by index is also provided for enumerations, but no order is implied.

7.2.6.1. Properties

length

Returns an integer indicating the number of nodes in the map.

7.2.6.2. Methods

getNamedItem, setNamedItem

Retrieves or adds a node using the node's nodeName property as the key.

removeNamedItem

Takes a node with the specified name out of the set and returns it.

item

Given an integer value n, returns a reference to the nth node in the set. Note that this method does not imply any order and is provided only for unique enumeration.

getNamedItemNS (DOM2 only)

Retrieves a node based on a namespace-qualified name (a namespace URI and local name).

removeNamedItemNS (DOM2 only)

Takes an item out of the list and returns it, based on its namespace-qualified name.

setNamedItemNS (DOM2 only)

Adds a node to the list using its namespace-qualified name.
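The most common NamedNodeMap is an element's attributes property. The sketch below exercises the basic methods using Python's standard xml.dom.minidom as a stand-in DOM implementation (an assumption of this sketch, not a Perl module); the attribute names are invented.

```python
from xml.dom.minidom import parseString

doc = parseString('<img src="cat.png" alt="A cat"/>')
attrs = doc.documentElement.attributes        # a NamedNodeMap

count = attrs.length
src_value = attrs.getNamedItem("src").value   # lookup by nodeName
absent = attrs.getNamedItem("width")          # no such node: None

# item(n) is only for enumeration; sort the names because no order is implied.
names = sorted(attrs.item(i).name for i in range(attrs.length))

# setNamedItem adds a node under its own name; removeNamedItem hands it back.
width = doc.createAttribute("width")
width.value = "64"
attrs.setNamedItem(width)
removed = attrs.removeNamedItem("alt")
remaining = sorted(attrs.item(i).name for i in range(attrs.length))
```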

7.2.7. CharacterData

This class extends Node to facilitate access to certain types of nodes that contain character data, such as Text, CDATASection, and Comment. Specific classes like Text inherit from this class. ProcessingInstruction also has a data property, but it inherits directly from Node.

7.2.7.1. Properties

data

The character data itself.

length

The number of characters in the data.

7.2.7.2. Methods

appendData

Appends a string of character data to the end of the data property.

substringData

Extracts and returns a segment of the data property from offset to offset + count.

insertData

Inserts a string inside the data property at the location given by offset.

deleteData

Removes count characters from the data property, starting at the location given by offset.

replaceData

Replaces count characters of the data property, starting at offset, with a new string that you provide.
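The offset and count arguments are easier to follow with a worked example. This sketch uses Python's standard xml.dom.minidom as a stand-in DOM implementation (an assumption of this sketch, not a Perl module); the comments show the value of data after each call.

```python
from xml.dom.minidom import parseString

doc = parseString("<p>Hello world</p>")
text = doc.documentElement.firstChild     # a Text node (inherits CharacterData)

length_before = text.length               # number of characters in "Hello world"
piece = text.substringData(6, 5)          # 5 characters starting at offset 6

text.appendData("!")                      # "Hello world!"
text.insertData(5, ",")                   # "Hello, world!"
text.replaceData(7, 5, "there")           # "Hello, there!"
text.deleteData(0, 7)                     # "there!"
result = text.data
```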

7.2.8. Element

This is the most common type of node you will encounter. An element can contain other nodes and has attribute nodes.

7.2.8.1. Properties

tagName

The name of the element.

7.2.8.2. Methods

getAttribute, getAttributeNode

Returns the value of an attribute, or a reference to the attribute node, with a given name.

setAttribute, setAttributeNode

Adds a new attribute to the element's list or replaces an existing attribute of the same name.

removeAttribute, removeAttributeNode

Removes an attribute from the element's list. removeAttribute takes the attribute's name; removeAttributeNode takes the attribute node and returns it.

getElementsByTagName

Returns a NodeList of descendant elements that match a name.

normalize

Collapses adjacent text nodes. You should use this method whenever you add new text nodes to ensure that the structure of the document remains the same, without erroneous extra children.

getAttributeNS (DOM2 only)

Retrieves an attribute value based on its namespace-qualified name (the namespace URI plus the local name).

getAttributeNodeNS (DOM2 only)

Gets an attribute's node by using its namespace URI and local name.

getElementsByTagNameNS (DOM2 only)

Returns a NodeList of elements among this element's descendants that match a namespace URI and local name.

hasAttribute (DOM2 only)

Returns true if this element has an attribute with a given name.

hasAttributeNS (DOM2 only)

Returns true if this element has an attribute with a given namespace URI and local name.

removeAttributeNS (DOM2 only)

Removes an attribute from this element's list, based on its namespace URI and local name.

setAttributeNS (DOM2 only)

Adds a new attribute to the element's list, given a namespace-qualified name and a value.

setAttributeNodeNS (DOM2 only)

Adds a new attribute node to the element's list with a namespace-qualified name.
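The sketch below contrasts the plain attribute methods with their namespace-aware counterparts. It uses Python's standard xml.dom.minidom as a stand-in DOM implementation (an assumption of this sketch, not a Perl module); the markup and the xl prefix are invented. Note that the NS variants look attributes up by namespace URI and local name, so the prefix used in the document does not matter for retrieval.

```python
from xml.dom.minidom import parseString

XLINK = "http://www.w3.org/1999/xlink"
doc = parseString(
    '<a xmlns:xl="%s" href="index.html" xl:href="other.html">Home</a>' % XLINK
)
link = doc.documentElement

tag = link.tagName
href = link.getAttribute("href")
link.setAttribute("title", "Start page")     # adds or replaces by name
link.removeAttribute("href")
missing = link.getAttribute("href")          # minidom returns "" when absent

# Namespace-aware lookups: URI plus local name, never the prefix alone.
xl_href = link.getAttributeNS(XLINK, "href")
has_xl = link.hasAttributeNS(XLINK, "href")

# setAttributeNS takes the URI, a qualified name (prefix:local), and a value.
link.setAttributeNS(XLINK, "xl:title", "Elsewhere")
xl_title = link.getAttributeNodeNS(XLINK, "title").value
```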

7.2.9. Attr

7.2.9.1. Properties

name

The attribute's name.

specified

If the program or the document explicitly set the attribute, this property is true. If it was set in the DTD as a default and not reset anywhere else, then it will be false.

value

The attribute's value as a string. In the tree, the value is held in child nodes of the attribute (text nodes and, possibly, entity references).

ownerElement (DOM2 only)

The element to which this attribute belongs.

7.2.10. Text

7.2.10.1. Methods

splitText

Breaks the text node into two adjacent text nodes, each with part of the original text content. Content in the first node is from the beginning of the original up to, but not including, a character whose position is given by offset. The second node has the rest of the original node's content. This method is useful for inserting a new element inside a span of text.
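Here is one way splitText might be used to wrap a word in a new element, sketched with Python's standard xml.dom.minidom as a stand-in DOM implementation (an assumption of this sketch, not a Perl module). The sentence and the em element are invented, and toxml is a minidom convenience for serialization rather than part of the DOM interface.

```python
from xml.dom.minidom import parseString

doc = parseString("<p>Read the manual first.</p>")
p = doc.documentElement
text = p.firstChild

# Split twice to isolate the word "manual" in its own text node.
tail = text.splitText(9)     # text: "Read the ", tail: "manual first."
rest = tail.splitText(6)     # tail: "manual",    rest: " first."

# Put an <em> where the middle node was, then move that node inside it.
em = doc.createElement("em")
p.replaceChild(em, tail)
em.appendChild(tail)

children = [node.nodeName for node in p.childNodes]
markup = p.toxml()
```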

7.2.11. CDATASection

A CDATASection is like a text node, but its contents are protected from being parsed as markup. It may contain markup characters (<, &) that would be illegal in text nodes. Use generic Node methods to access data.
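The difference from an ordinary text node shows up on output. This sketch uses Python's standard xml.dom.minidom as a stand-in DOM implementation (an assumption of this sketch, not a Perl module); toxml is a minidom convenience for serialization, and the element names are invented.

```python
from xml.dom.minidom import parseString, Node

doc = parseString("<code/>")

# A CDATA section carries markup characters verbatim.
cdata = doc.createCDATASection("if (a < b && b < c) { }")
doc.documentElement.appendChild(cdata)
is_cdata = cdata.nodeType == Node.CDATA_SECTION_NODE
cdata_xml = doc.documentElement.toxml()

# The same kind of content in a plain text node must be escaped on output.
plain = doc.createElement("plain")
plain.appendChild(doc.createTextNode("a < b"))
text_xml = plain.toxml()
```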

7.2.12. ProcessingInstruction

7.2.12.1. Properties

target

The target value for the node.

data

The data value for the node.
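For a processing instruction such as a stylesheet reference, target is the first word and data is everything after it. A minimal sketch, using Python's standard xml.dom.minidom as a stand-in DOM implementation (an assumption of this sketch, not a Perl module):

```python
from xml.dom.minidom import parseString, Node

doc = parseString('<?xml-stylesheet href="style.css" type="text/css"?><doc/>')

# A PI that precedes the root element is a child of the Document node.
pi = doc.firstChild
is_pi = pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE
target = pi.target
data = pi.data
```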

7.2.13. Comment

This is a class representing comment nodes. Use the generic Node methods to access the data.

7.2.14. EntityReference

This is a reference to an entity defined by an Entity node. Sometimes the parser will be configured to resolve all entity references into their values for you. If that option is disabled, the parser should create this node. No explicit methods force resolution, but some operations on the node may have that side effect.

7.2.15. Entity

This class provides access to an entity in the document, based on information in an entity declaration in the DTD.

7.2.15.1. Properties

publicId

A public identifier for the resource (if the entity is external to the document).

systemId

A system identifier for the resource (if the entity is external to the document).

notationName

If the entity is unparsed, its notation reference is listed here.

7.2.16. Notation

Notation represents a notation declaration appearing in the DTD.

7.2.16.1. Properties

publicId

A public identifier for the notation.

systemId

A system identifier for the notation.
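Entity and Notation nodes are reached through the DocumentType node's entities and notations maps. The sketch below declares an unparsed entity and its notation in an internal subset, using Python's standard xml.dom.minidom as a stand-in DOM implementation (an assumption of this sketch, not a Perl module). The declarations are invented, and the referenced file is never read.

```python
from xml.dom.minidom import parseString

doc = parseString(
    "<!DOCTYPE doc [\n"
    '  <!NOTATION gif SYSTEM "image/gif">\n'
    '  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>\n'
    "]>\n"
    "<doc/>"
)

logo = doc.doctype.entities.getNamedItem("logo")    # an Entity node
gif = doc.doctype.notations.getNamedItem("gif")     # a Notation node

entity_system_id = logo.systemId
entity_public_id = logo.publicId        # None: no public identifier declared
entity_notation = logo.notationName     # set because the entity is unparsed
notation_system_id = gif.systemId
```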




Copyright © 2002 O'Reilly & Associates. All rights reserved.