Defaulting Attribute Values
XML 1.0 also provides a set of tools for specifying what happens if an attribute isn't specified on an element. Four possibilities exist: "the attribute just isn't there" (#IMPLIED); "the attribute must be there, period" (#REQUIRED); "the attribute has this value, period" (#FIXED); and a plain default value that applies when the attribute is omitted. You have already seen a few of these choices in the preceding declarations. In the img element, for instance, the src and alt attributes are required (#REQUIRED), while most of its other attributes are optional (#IMPLIED):
<!--These are compatible with the XHTML DTDs but do not represent the complete declarations from the XHTML DTD-->
<!ATTLIST img
  src    CDATA #REQUIRED
  alt    CDATA #REQUIRED
  height CDATA #IMPLIED
  width  CDATA #IMPLIED
  id     ID    #IMPLIED
>
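As a rough illustration of how these keywords play out in a document, the fragments below show how a validating parser working from the img declaration above would be expected to treat three instances. The file name and attribute values are invented for the example.

```xml
<!-- Valid: both #REQUIRED attributes are present; the #IMPLIED ones are omitted -->
<img src="logo.png" alt="Company logo" />

<!-- Valid: #IMPLIED attributes may be supplied when they are useful -->
<img src="logo.png" alt="Company logo" width="120" height="40" id="logo" />

<!-- Invalid: alt is #REQUIRED, so a validating parser reports an error here -->
<img src="logo.png" />
```

Note that a non-validating parser checks only well-formedness, so it would accept all three.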
The XHTML 1.0 DTDs use fixed attributes (#FIXED) in only a very few cases, notably on the html element for its namespace declaration:
<!--This is compatible with the XHTML DTDs but does not represent the complete declarations from the XHTML DTD-->
<!ATTLIST html
  xmlns CDATA #FIXED 'http://www.w3.org/1999/xhtml'
>
This, combined with XHTML 1.0's exhortation to always include the xmlns attribute on the html element of XHTML documents, means that only:
<html xmlns='http://www.w3.org/1999/xhtml'>...</html>
is legal, and not:
<html xmlns='http://www.example.com/1999/xhtml'>...</html>
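A #FIXED declaration also acts as a default. The sketch below is a deliberately simplified, self-contained document (the real html element has a much richer content model) in which xmlns is omitted. A parser that reads the declarations should report the element as though the fixed value had been written out.

```xml
<?xml version="1.0"?>
<!DOCTYPE html [
  <!-- Simplified for illustration; not the real XHTML content model -->
  <!ELEMENT html (#PCDATA)>
  <!ATTLIST html xmlns CDATA #FIXED 'http://www.w3.org/1999/xhtml'>
]>
<html>The xmlns attribute is supplied from the DTD.</html>
```

Because a non-validating parser is not obliged to read an external DTD, writing the attribute explicitly remains the safer habit.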
The last option, a simple default value in quotes, appears in a few cases in which defaults are supplied easily. For example, the form element needs a method and enctype (encoding type) value and these have commonly used values.
<!--This is compatible with the XHTML DTDs but does not represent the complete declarations from the XHTML DTD-->
<!ATTLIST form
  action  CDATA      #REQUIRED
  method  (get|post) "get"
  enctype CDATA      "application/x-www-form-urlencoded"
>
The form element is useless without a place to send the information, so the action attribute is required. No default is possible because it is different for every form. On the other hand, you can default to the get HTTP method, which sends all data encoded as the content type application/x-www-form-urlencoded, making both of these good candidates for defaulting.
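To see the defaults in action, here is a minimal sketch of a self-contained document. The EMPTY content model and the /search URL are simplifications chosen for the example, not part of the XHTML DTDs.

```xml
<?xml version="1.0"?>
<!DOCTYPE form [
  <!-- Simplified: the real form element is not EMPTY -->
  <!ELEMENT form EMPTY>
  <!ATTLIST form
    action  CDATA      #REQUIRED
    method  (get|post) "get"
    enctype CDATA      "application/x-www-form-urlencoded">
]>
<!-- Only action is written out; method and enctype come from the DTD -->
<form action="/search"/>
```

A parser that reads these declarations should hand the application the same attributes it would see for `<form action="/search" method="get" enctype="application/x-www-form-urlencoded"/>`.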
Parameter Entity Declarations
Sorting out parameter entities is critical to being able to read the XHTML 1.0 and 1.1 DTDs. Parameter entities enable DTD creators to define information within a DTD that can be reused repeatedly by reference to their names. The W3C does this for several reasons: sometimes to describe the content of an attribute more precisely than XML 1.0 allows and sometimes to avoid making the same declarations over and over. This second strategy reduces the size of the DTD and makes it more manageable, while still keeping the same content.
The third reason for using parameter entities is modularization. External parameter entities enable DTD creators to reference content in other files for inclusion in the DTD. In XHTML 1.0, this is used only to include the three sets of entity descriptions that are stored outside the core DTDs; but it becomes a major part of XHTML's strategy for modularizing XHTML.
First, let's explore internal parameter entities. They have this general syntax:
<!ENTITY % entityName "entityContent">
Entity names follow the same rules as element and attribute names: they must begin with letters, underscores, or colons and may contain letters, underscores, colons, digits, hyphens, and periods. Entity names beginning with xml (or any case variation on that, such as XMl or XML) are reserved for the use of the W3C. The Namespaces Recommendation discourages the use of colons.
The content of an internal parameter entity is usually a fragment of a declaration, intended for use within other declarations. This content can also consist of complete declarations, but fragments that start in one declaration and end in another are prohibited. All of the internal parameter entities used in the XHTML 1.0 DTDs are fragments of declarations. The simplest ones just clarify the kind of content a particular CDATA-type attribute should include:
<!ENTITY % Number "CDATA">
    <!-- one or more digits -->
<!ENTITY % URI "CDATA">
    <!-- a Uniform Resource Identifier, see [RFC2396] -->
When used in an attribute declaration, these entities provide some additional description to help developers figure out how to use an attribute:
<!ATTLIST pre width %Number; #IMPLIED >
Parameter entities are included by prefixing their name with a percent sign (%) and following them with a semicolon, as shown in the preceding example. In this case, a parser interprets the %Number; parameter entity to produce this declaration:
<!ATTLIST pre width CDATA #IMPLIED >
Developers reading the DTD, however, can figure out that width should be specified as a number (of characters) rather than in a string like "2 and 1/4 inches". The URI parameter entity is used similarly throughout the specification:
<!ATTLIST img
  src      %URI; #REQUIRED
  longdesc %URI; #IMPLIED
  usemap   %URI; #IMPLIED
>
All of these attributes should include URIs pointing to appropriate resources. This information is intended for human consumption. The parser converts all this to:
<!ATTLIST img
  src      CDATA #REQUIRED
  longdesc CDATA #IMPLIED
  usemap   CDATA #IMPLIED
>
This also may enable the W3C to update these types more easily in future versions of XML that support more data types. But for now it just documents usage. The XHTML DTD uses a similar strategy to name frequently used enumerations, such as those for shapes:
<!ENTITY % Shape "(rect|circle|poly|default)">
Instead of repeating this list of shapes, using entities allows the XHTML DTD to include more readable things like this:
<!ATTLIST area shape %Shape; "rect">
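The same two techniques carry over to a DTD of your own. The following is a minimal sketch of an external DTD fragment; the entity names IsoDate and Status and the event and report elements are hypothetical and do not appear in the XHTML DTDs. Because the parameter entity references sit inside declarations, this has to live in an external DTD file rather than in a document's internal subset.

```xml
<!-- Hypothetical fragment of an external DTD -->
<!ENTITY % IsoDate "CDATA">          <!-- a date such as 2000-06-15 -->
<!ENTITY % Status  "(draft|final)">  <!-- enumeration shared by several elements -->

<!ELEMENT event (#PCDATA)>
<!ATTLIST event
  starts %IsoDate; #REQUIRED
  status %Status;  "draft">

<!ELEMENT report (#PCDATA)>
<!ATTLIST report
  issued %IsoDate; #IMPLIED
  status %Status;  "final">
```

The parser sees only CDATA and the expanded enumeration; the entity names serve readers of the DTD, and a change to the Status list needs to be made in just one place.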
The XHTML DTDs include some parameter entities describing sets of attributes that are applied commonly. For instance, the i18n (for internationalization, which has 18 letters between the 'i' and the 'n') parameter entity is used repeatedly, assigning language and text-direction values.
<!ENTITY % LanguageCode "NMTOKEN">
    <!-- a language code, as per [RFC1766] -->
<!ENTITY % i18n
  "lang      %LanguageCode; #IMPLIED
   xml:lang  %LanguageCode; #IMPLIED
   dir       (ltr|rtl)      #IMPLIED"
>
The i18n entity includes declarations for the lang, xml:lang, and dir attributes, which are ready for use within any attribute list declaration. Note that nesting parameter entities within parameter entities is perfectly acceptable – %LanguageCode; is replaced with NMTOKEN during the parsing of the DTD. The i18n entity is used like this:
<!ELEMENT title (#PCDATA)>
<!ATTLIST title %i18n;>
The parser expands the %i18n; to:
<!ELEMENT title (#PCDATA)>
<!ATTLIST title
  lang      %LanguageCode; #IMPLIED
  xml:lang  %LanguageCode; #IMPLIED
  dir       (ltr|rtl)      #IMPLIED
>
and then to:
<!ELEMENT title (#PCDATA)>
<!ATTLIST title
  lang      NMTOKEN   #IMPLIED
  xml:lang  NMTOKEN   #IMPLIED
  dir       (ltr|rtl) #IMPLIED
>
This produces an attribute list declaration for the title element that supports the lang, xml:lang, and dir attributes for internationalization.
The W3C takes a similar approach to element content models, bundling many of them into entities for easy reference. For example, the heading elements (h1-h6) can all appear in the same places within a document, so the W3C creates a heading entity that enables you to choose among any of these elements:
<!ENTITY % heading "h1|h2|h3|h4|h5|h6">
If an element only contains headings and text, you can create a declaration like this one:
<!ELEMENT myMixedHeadlinesElement (#PCDATA | %heading;)*>
The parser then expands this declaration to:
<!ELEMENT myMixedHeadlinesElement (#PCDATA | h1|h2|h3|h4|h5|h6)*>
This declaration enables you to mix text and heading elements. The XHTML DTD doesn't use this approach because headings are only one kind of block element and other types may appear in the same places. Instead, the heading entity is aggregated with other entities for other kinds of block elements:
<!ENTITY % block
  "p | %heading; | div | %lists; | %blocktext; | fieldset | table">
Then this is aggregated with even more options for different use cases:
<!ENTITY % Block "(%block; | form | %misc;)*">
<!ENTITY % Flow "(#PCDATA | %block; | form | %inline; | %misc;)*">
<!ENTITY % form.content "(%block; | %misc;)*">
You then may use these content models within element declarations:
<!ELEMENT div %Flow;>
which expands to:
<!ELEMENT div (#PCDATA | %block; | form | %inline; | %misc;)*>
which then expands to a much larger declaration once all of the nested parameter entities are resolved, because the div element can contain many different element types.
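To pull the attribute-set and content-model techniques together, here is a small sketch of an external DTD fragment. Every name in it (common, emphasis, Inline, para, and the note attribute) is hypothetical and chosen only for illustration; none of this is taken from the XHTML DTDs.

```xml
<!-- Hypothetical external DTD fragment -->
<!ENTITY % common
  "id   ID    #IMPLIED
   note CDATA #IMPLIED">
<!ENTITY % emphasis "em | strong">
<!ENTITY % Inline   "(#PCDATA | %emphasis; | code)*">

<!ELEMENT para   %Inline;>
<!ATTLIST para   %common;>
<!ELEMENT em     %Inline;>
<!ATTLIST em     %common;>
<!ELEMENT strong %Inline;>
<!ATTLIST strong %common;>
<!ELEMENT code   (#PCDATA)>
<!ATTLIST code   %common;>
```

After expansion, para is declared as (#PCDATA | em | strong | code)* with optional id and note attributes, so an instance such as `<para id="p1">Use <code>xml:lang</code> with <em>care</em>.</para>` should validate. Each entity is declared before it is referenced, which the parser requires.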
General Entity Declarations
XHTML supports the same set of general entities that HTML 4.0 supports. Unlike parameter entities, general entities are meant for use within XHTML documents instead of the XHTML DTD. The mechanism used to create those entities works much like the parameter entity mechanism, using similar syntax; only the percent sign is missing:
<!ENTITY entityName "entityContent">
Again, entity names follow the same rules as element and attribute names: they must begin with letters, underscores, or colons and may contain letters, underscores, colons, digits, hyphens, and periods. Entity names beginning with xml (or any case variation on that, such as XMl or XML) are reserved for the use of the W3C. The Namespaces Recommendation discourages the use of colons. General and parameter entities may have the same names within a single DTD without conflict, but an entity declared as a general entity cannot be referenced as a parameter entity and vice-versa. The entity declarations used by the XHTML DTDs reference decimal values for Unicode characters, with documentation describing each entity. For example:
<!ENTITY nbsp "&#160;"> <!-- no-break space = non-breaking space, U+00A0 ISOnum -->
The W3C provides three sets of these declarations for the Latin-1 character set, symbols, and special characters. To reference any of these entities within an XHTML document, just prefix the name of the entity with an ampersand (&) and follow it with a semicolon (;). This is the same way HTML always handles entities. For example:
These&nbsp;words&nbsp;will&nbsp;stay&nbsp;on&nbsp;the&nbsp;same&nbsp;line.
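For a self-contained view of declaration and reference together, the sketch below declares two general entities in a document's internal subset and then uses them. The corp entity and the p-only document are invented for the example, and the approach assumes an XML processor rather than an HTML browser.

```xml
<?xml version="1.0"?>
<!DOCTYPE p [
  <!ELEMENT p (#PCDATA)>
  <!ENTITY nbsp "&#160;">          <!-- same definition as in the XHTML Latin-1 set -->
  <!ENTITY corp "Example Corp.">   <!-- hypothetical entity for this sketch -->
]>
<p>Contact&nbsp;&corp; for details.</p>
```

An XML parser replaces &nbsp; with the character U+00A0 and &corp; with the text Example Corp. before the application sees the content.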
Tip To see a complete list of the characters available in Unicode, see The Unicode Standard from the Unicode Consortium (published by Addison-Wesley). While the XML 1.0 specification references Unicode 2.0, the Unicode 3.0 specification is on the horizon and probably will replace Unicode 2.0 eventually. For a friendlier introduction to Unicode, see Unicode: A Primer, by Tony Graham (IDG Books, 2000).
While XML 1.0 supports external parameter entities and enables you to create your own internal entity sets, HTML browsers do not support this usage. Probably only those XHTML processors that are built on validating XML processors will support these entities. For more details, see your favorite XML reference.
Tip If you build your own XML DTDs, you can include the XHTML entity sets easily. Just include lines like these in your DTD, declaring the external parameter entity and then referencing it:
<!ENTITY % HTMLlat1 PUBLIC
  "-//W3C//ENTITIES Latin 1 for XHTML//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
%HTMLlat1;
Each set of entities has its own declaration. Not all XML parsers retrieve external resources, so make sure you use a validating parser if you employ this approach.
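As a sketch of what a complete custom DTD might look like, the fragment below pulls in all three entity sets. The memo element is hypothetical, and the public identifiers and URLs for the symbol and special-character sets are given as they are believed to appear in the XHTML 1.0 DTDs; check them against the W3C files before relying on them.

```xml
<!-- Hypothetical DTD that reuses the three XHTML 1.0 entity sets -->
<!ENTITY % HTMLlat1 PUBLIC
  "-//W3C//ENTITIES Latin 1 for XHTML//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
%HTMLlat1;

<!ENTITY % HTMLsymbol PUBLIC
  "-//W3C//ENTITIES Symbols for XHTML//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent">
%HTMLsymbol;

<!ENTITY % HTMLspecial PUBLIC
  "-//W3C//ENTITIES Special for XHTML//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent">
%HTMLspecial;

<!ELEMENT memo (#PCDATA)>
```

Each declaration only names the external file; it is the reference on the following line (for example %HTMLlat1;) that actually includes its contents. A memo document validated against this DTD could then use references such as &eacute; or &copy;.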
Comments
You can use comments in DTDs pretty much as you use them in documents. Just as comments can't appear within tags in a document, they also can't appear inside declarations in the DTD. Comments are typically placed before, or sometimes to the side of, the declarations they describe. Anything that appears between <!-- and --> is a comment, meant for human consumption only. Often, comments are your guides in the XHTML DTD for the "whys" of particular constructions, especially for some of the odder parts.
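A brief sketch of where comments may and may not go, reusing the Number entity shown earlier:

```xml
<!-- Legal: a comment on its own line, before the declaration it describes -->
<!ENTITY % Number "CDATA">  <!-- legal: a comment to the side -->

<!-- Not legal: a comment placed inside the ATTLIST declaration below,
     anywhere between its opening and closing angle brackets -->
<!ATTLIST pre width %Number; #IMPLIED>
```

Remember that the text of a comment cannot itself contain two consecutive hyphens, so avoid them when commenting out old declarations.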
