Web Design in a Nutshell

9.2. Setting Up an HTML Document

The standard skeletal structure of an HTML document according to the HTML 4.01 specification is as follows:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
  <HEAD>
    <TITLE>Document Title</TITLE>
  </HEAD>
  <BODY>
    Contents of Document...
  </BODY>
</HTML>

This document has three components: a document type declaration (<!DOCTYPE>), the header section (<head>), and the body of the document (<body>).

The entire document is contained within the <html> element. HTML 4.01 makes the <html> start and end tags themselves optional, so browsers can properly display the contents of the document even if these tags are omitted. HTML documents are made up of two main structures, the head (also called the "header") and the body. The one exception is a document that contains a frameset in place of the body. For more information about framesets, see Chapter 14, "Frames".
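
As a rough illustration of the frameset exception, such a document might be laid out as follows. This is only a sketch: the filenames nav.html and main.html, the frame names, and the column widths are invented for the example, and the frame tags themselves are covered in Chapter 14.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
   "http://www.w3.org/TR/html4/frameset.dtd">
<HTML>
  <HEAD>
    <TITLE>Document Title</TITLE>
  </HEAD>
  <FRAMESET COLS="150,*">
    <FRAME SRC="nav.html" NAME="nav">
    <FRAME SRC="main.html" NAME="main">
    <NOFRAMES>
      <BODY>
        Content for browsers that do not display frames...
      </BODY>
    </NOFRAMES>
  </FRAMESET>
</HTML>

Here the <frameset> element takes the place of the <body>, and a body appears only inside <noframes> as fallback content.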

9.2.1. The Document Type Declaration

In order to be valid (i.e., to conform precisely to the HTML standard), an HTML document needs to begin with a document type declaration that identifies the version of HTML that is used in the document. There are three distinct versions of HTML 4.01 (Strict, Transitional, and Frameset), each defined by a distinct document type definition (DTD). The DTD documents live on the W3C server at a stable URL.

The document's DTD is specified at the beginning of the document using the SGML declaration <!DOCTYPE> (document type). After the name of the root element (HTML), the declaration gives two pointers to the DTD: a public identifier that names a publicly recognized document, and a specific URL (the system identifier) to fall back on if the browsing device does not recognize the public identifier.

Strict

If you are following the Strict version of HTML 4.01 (the version that omits all deprecated and browser-specific tags), use this document type declaration:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">

Transitional

If your document includes deprecated tags, point to the Transitional DTD using this document type declaration:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
   "http://www.w3.org/TR/html4/loose.dtd">

Frameset

If your document uses frames, then identify the Frameset DTD. The Frameset DTD is the same as the Transitional version (it includes deprecated yet supported tags), with the addition of frame-specific tags.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
   "http://www.w3.org/TR/html4/frameset.dtd">

DOCTYPE and Standards-Compliant Browsers

Until recently, it was recommended that HTML documents begin with a DOCTYPE declaration, but browsers made little practical use of it. That has changed, and now you can use DOCTYPE to make the latest browser versions live up to their full potential.

Netscape 6, Internet Explorer 6 (Windows), and Internet Explorer 5 (Mac) switch into a strict, standards-compliant mode when they detect a DOCTYPE specifying the Strict HTML 4.01 DTD. By placing this declaration at the beginning of your document, you can write your documents and style sheets according to the standards and have confidence that they will work the way they should in these latest browsers. This is a great way to get started using standards-compliant code right away.

If the DOCTYPE declaration is missing, or names the Transitional DTD without including its URL, these browsers revert to their legacy behavior (often called "quirks" mode), tolerating the nonstandard code, intricate hacks, and workarounds that are common in current web authoring practice. This allows new browsers to display existing documents properly.
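
One way to see which mode a page has triggered is a small test page like the sketch below. It relies on the document.compatMode scripting property, which is not part of HTML 4.01 and is not mentioned in the text above. Internet Explorer 6 exposes it, but support in the other browsers named here is an assumption to verify, so the script falls back to a generic message. The function name reportMode and the page content are invented for the example.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
  <HEAD>
    <TITLE>Rendering Mode Test</TITLE>
    <SCRIPT TYPE="text/javascript">
      // "CSS1Compat" indicates standards mode; "BackCompat" indicates
      // legacy mode. Browsers without the property get a fallback message.
      function reportMode() {
        var mode = document.compatMode ? document.compatMode : "not reported";
        alert("Rendering mode: " + mode);
      }
    </SCRIPT>
  </HEAD>
  <BODY ONLOAD="reportMode()">
    <P>Remove the DOCTYPE declaration and reload to compare.</P>
  </BODY>
</HTML>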

9.2.2. The Document Header

The header, delimited by <head> tags, contains information that describes the HTML document. The <head> tag is rarely given attributes of its own (HTML 4.01 defines only profile, lang, and dir for it); it mainly serves as a container for other tags that help define and manage the document's contents.

9.2.2.1. Titles

The most commonly used element within the header is the document title (within <title> tags, as shown in the example above), which provides a description of the page's contents. In HTML 4.01, this is a required element, which means that every HTML document must have a meaningful title in its header. The title is typically displayed in the top bar of the browser, outside the regular content window.

Titles should contain only ASCII characters (letters, numbers, and basic punctuation). Special characters (such as &) should be referred to by their character entities within the title, for example:

<TITLE>The Adventures of Peto &amp; Fleck</TITLE>

TIP

The title is what's displayed in a user's bookmarks or "hot list." Search engines rely heavily on document titles as well. For these reasons, it's important to provide thoughtful and descriptive titles for all your documents and avoid vague titles like "Welcome" or "My Page."

9.2.2.2. Other header elements

Other useful HTML elements are also placed within <head> tags of a document:

<base>

This tag establishes the document's base location, which serves as a reference for all pathnames and links in the document. For more information, see Chapter 11, "Creating Links".

<isindex>

Deprecated. This tag was once used to add a simple search function to a page. HTML 4.01 deprecates it in favor of form inputs.

<link>

This tag defines the relationship between the current document and another document. Although it can signify relationships such as index, next, and previous, it is most often used today to link a document to an external style sheet (see Chapter 17, "Cascading Style Sheets").

<meta>

"Meta" tags are used to provide information about a document, such as keywords or descriptions to aid search engines. They may also be used for client-pull functions. The <meta> tag is discussed later in this chapter.

<script>

JavaScript and VBScript code may be added to the document within its header using this tag.

<style>

Embedded style sheets must be added to the document header by placing the <style> element within the <head> container. For more information, see Chapter 17, "Cascading Style Sheets".
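
Put together, a header that uses several of these elements might look like the following sketch. The URL, the filenames styles.css and scripts.js, and the description and keyword text are placeholders invented for the example.

<HEAD>
  <TITLE>The Adventures of Peto &amp; Fleck</TITLE>
  <BASE HREF="http://www.example.com/stories/">
  <META NAME="description" CONTENT="Illustrated stories for children">
  <META NAME="keywords" CONTENT="stories, children, illustration">
  <LINK REL="stylesheet" TYPE="text/css" HREF="styles.css">
  <STYLE TYPE="text/css">
    H1 { color: green; }
  </STYLE>
  <SCRIPT TYPE="text/javascript" SRC="scripts.js"></SCRIPT>
</HEAD>

Because the <base> tag appears before the elements that refer to external files, the relative names styles.css and scripts.js resolve against http://www.example.com/stories/.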

9.2.3. The Document Body

The document body, delimited by <body> tags, contains the contents of the document -- the part that displays in the browser window.

The body of an HTML document might consist of just a few paragraphs of text, a single image, or a complex combination of text, images, tables, and multimedia objects. What you put on the page is up to you.




Copyright © 2002 O'Reilly & Associates. All rights reserved.