XML DTD Modules Part One

an article added by: Albert Lichtblau at 06022007




XML DTD Modules

DTD modules are better defined than abstract modules, although not quite as flexible. Parameterization is extremely powerful, but it does take some getting used to.

Tip XHTML 1.1 DTD modules are a lot harder to read than many XML DTDs. If you can't penetrate the formal description of a given module, the abstract module should help you. If you write your own modules, it is critical that you include abstract modules.

XREF The rules for creating XHTML 1.1's XML DTD modules are presented in Section 5 of the Building XHTML Modules draft and demonstrated in Section 6. There are a few additional conventions used in Modularization of XHTML that Building XHTML Modules doesn't describe, which I cover here as well. They appear useful and help explain some of the syntactical shortcuts (such as the Common attributes) used in abstract modules.

Parameterization just means putting all the contents of declarations into parameter entities. This makes the declarations easier to manage and much easier to modify. You can override attribute declarations and parameter entities by declaring them again (an XML processor honors the first declaration it reads, so the override must come before the original), but XML prohibits multiple declarations for element types. By putting the contents of those declarations into parameter entities, the creators of XHTML modules can provide a lot more flexibility. Parameter entities in the XHTML 1.1 DTDs are named with suffixes that indicate their role: .datatype, .attrib, .attlist, .content, .class, and .extra. Let's look at examples of each of these suffixes taken from the W3C draft DTD, building from the smallest atomic pieces to the largest.

.datatype

The data types in XHTML 1.1 are direct descendants of those in XHTML 1.0, and they are declared in Section B.2.1. Most of the data types are simply more precise names for CDATA, textual content:

   <!-- a Uniform Resource Identifier, see [URI] -->
   <!ENTITY % URI.datatype "CDATA" >

These data types are then used in attribute declarations:

   <!ATTLIST a
   %Common.attrib;
   href %URI.datatype; #IMPLIED
   charset %Charset.datatype; #IMPLIED
   type %ContentType.datatype; #IMPLIED
   hreflang %LanguageCode.datatype;  #IMPLIED
   rel %LinkTypes.datatype; #IMPLIED
   rev %LinkTypes.datatype; #IMPLIED
   accesskey %Character.datatype;  #IMPLIED
   tabindex %Number.datatype; #IMPLIED
   > 

Most of these data type declarations resolve to CDATA when an XML processor reads the DTD (a few resolve to other XML 1.0 types, such as NMTOKEN for LanguageCode.datatype), but they make the content that should be stored in these attributes much more identifiable.
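The same convention can be applied to a module of your own. The following is only a minimal sketch: the price element type and the Currency.datatype entity are hypothetical and are not part of XHTML. It assumes the declarations live in an external DTD, because XML 1.0 does not allow parameter entity references inside declarations in the internal subset.

```xml
<!-- Hypothetical data type: a currency code such as "EUR".
     To an XML 1.0 processor this is still just NMTOKEN. -->
<!ENTITY % Currency.datatype "NMTOKEN" >

<!-- Hypothetical element type that uses the data type -->
<!ELEMENT price ( #PCDATA ) >
<!ATTLIST price
   currency %Currency.datatype; #IMPLIED
>
```

The name documents the intent for human readers; the processor sees only `currency NMTOKEN #IMPLIED`.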

Tip

While XML 1.0 processors can't do much to enforce data typing today, schema processors should be capable of accomplishing more with this information in the future. Think of this approach as adding information to the DTD so it's ready for the next version. These data type names are used in the abstract modules for XHTML 1.1 as well, supplementing the core XML 1.0 set of types.

.attrib

The .attrib suffix is used on parameter entities that represent one or more attribute specifications – the part of an attribute list declaration that defines individual attributes, their types, defaults, and possible values. These entities sometimes describe only one attribute, like this one for the id attribute:

   <!ENTITY % Id.attrib
   "id ID
   #IMPLIED"
   >

They may specify multiple attributes, like this one for xml:lang and dir:

   <!ENTITY % I18n.attrib
   "xml:lang %LanguageCode.datatype; #IMPLIED
   dir ( ltr | rtl ) #IMPLIED"
   >

These entities also may include other entities with the .attrib suffix, as in the ubiquitous Common.attrib entity:

   <!ENTITY % Common.attrib
   "%Core.attrib;
   %I18n.attrib;
   %Events.attrib;"
   > 

This just includes all of the attribute specifications declared in the Core.attrib, I18n.attrib, and Events.attrib entities, building a large list of common components. The quotes need to be used even though all of the contents of the entity are contained in parameter entities.
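As an illustration of the same layering in a custom module, the sketch below builds a combined entity the way Common.attrib is built. Status.attrib, Note.attrib, and the note element type are hypothetical names invented for this example; the other entities are repeated from the text so the fragment stands alone. It assumes an external DTD.

```xml
<!-- Repeated from the text so the fragment is self-contained -->
<!ENTITY % LanguageCode.datatype "NMTOKEN" >
<!ENTITY % Id.attrib "id ID #IMPLIED" >
<!ENTITY % I18n.attrib
   "xml:lang %LanguageCode.datatype; #IMPLIED
    dir ( ltr | rtl ) #IMPLIED"
>

<!-- Hypothetical entity describing one new attribute -->
<!ENTITY % Status.attrib
   "status ( draft | final ) #IMPLIED"
>

<!-- Hypothetical combination, assembled like Common.attrib -->
<!ENTITY % Note.attrib
   "%Id.attrib;
    %I18n.attrib;
    %Status.attrib;"
>

<!-- Hypothetical element type that picks up all four attributes -->
<!ELEMENT note ( #PCDATA ) >
<!ATTLIST note
   %Note.attrib;
>
```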

.attlist

The .attlist suffix (not documented in Building XHTML Modules) is used in the XHTML 1.1 DTDs to turn ATTLIST declarations on and off. Parameter entities that have the .attlist suffix take one of two values: INCLUDE or IGNORE. They work with a feature of XML 1.0 DTDs not used in XHTML 1.0: conditional sections.

Tip

For a much more detailed explanation of conditional sections and their use in other XML contexts, see Article 16 of XML Elements of Style by Simon St. Laurent (McGraw-Hill, 2000).

Conditional sections may appear in DTDs only; they enable DTD designers to turn sets of declarations on and off. By using parameter entities to determine whether to include or ignore a section, developers make it possible to use portions of a DTD or even choose among different variations on a single DTD. For example, this DTD fragment includes the attributes for the title element type:

   <!ENTITY % Title.attlist  "INCLUDE" >
   <![%Title.attlist;[
   <!ATTLIST title
   %I18n.attrib;
   > 
 <!-- end of Title.attlist  -->]]>

The first line creates a parameter entity named Title.attlist whose value is INCLUDE. In the next line, the reference %Title.attlist; is replaced by that value, producing these declarations:

   <![INCLUDE[
   <!ATTLIST title
   %I18n.attrib;
   > 
 <!-- end of Title.attlist  -->]]>

An XML parser strips out the INCLUDE wrapper and the comment, leaving a core of:

   <!ATTLIST title
   %I18n.attrib;
   >

which then becomes:

   <!ATTLIST title
   xml:lang %LanguageCode.datatype; #IMPLIED
   dir ( ltr | rtl ) #IMPLIED
   >

and finally:

   <!ATTLIST title
   xml:lang NMTOKEN #IMPLIED
   dir ( ltr | rtl ) #IMPLIED
   >

If, on the other hand, another module declares the Title.attlist entity to be IGNORE before this one is read (XML honors only the first declaration of an entity):

   <!ENTITY % Title.attlist "IGNORE" >

then the result is:

   <![IGNORE[
   <!ATTLIST title
   %I18n.attrib;
   >
   <!-- end of Title.attlist -->]]>

which keeps the parser from processing the declarations at all, leaving title with no attributes. Entities with the .attlist suffix surround the attribute list declarations for every element type in the Modularization of XHTML draft.
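One way to use this switch is from a driver DTD that sets the entity before the module is read and then supplies its own attribute list. The sketch below inlines the module's declarations for brevity (normally they would sit in a separate file), and the audience attribute is hypothetical. Conditional sections are only allowed in an external DTD.

```xml
<!-- Driver: declared first, so this is the value that binds -->
<!ENTITY % Title.attlist "IGNORE" >

<!-- Module, inlined here for brevity. This later declaration
     of the same entity is ignored by the processor. -->
<!ENTITY % Title.attlist "INCLUDE" >
<![%Title.attlist;[
<!ATTLIST title
   xml:lang NMTOKEN #IMPLIED
   dir ( ltr | rtl ) #IMPLIED
>
<!-- end of Title.attlist -->]]>

<!-- Driver again: a replacement attribute list.
     The "audience" attribute is hypothetical. -->
<!ELEMENT title ( #PCDATA ) >
<!ATTLIST title
   audience CDATA #IMPLIED
>
```

With the switch set to IGNORE, title ends up with only the audience attribute; remove the first declaration and the module's xml:lang and dir attributes are declared as well.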
  

.content

The .content suffix is used for parameter entities that describe content models for particular element types. The simplest example, for an EMPTY content model, looks like this:

   <!ENTITY % Input.content  "EMPTY" >
 <!ELEMENT input %Input.content;  >

When processed, this resolves to:

   <!ELEMENT input EMPTY >

and defines the input element as having an empty content model. By redeclaring entities with a .content suffix, other modules can easily modify the content model of an element.
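A rough sketch of such an override follows. The para and em element types and the Para.content entity are hypothetical, chosen only to keep the example small. Because XML honors the first declaration of an entity, the customization has to be read before the module's default.

```xml
<!-- Customization, read first: this declaration binds.
     "para" and "em" are hypothetical element types. -->
<!ENTITY % Para.content "( #PCDATA | em )*" >

<!-- Module default, read later: ignored because Para.content
     has already been declared -->
<!ENTITY % Para.content "( #PCDATA )" >

<!-- The single element declaration picks up the override -->
<!ELEMENT para %Para.content; >
<!ELEMENT em ( #PCDATA ) >
```

The element type is still declared exactly once, which is what XML requires; only the entity it refers to has changed.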

.class (and .extra)

The .class suffix is used for parameter entities that may be used repeatedly in content models for multiple elements, but only when the contents are element type names that all share something in common. In XHTML, this tends to mean that block elements are one class, while inline elements are another class. These entities aren't defined (with one exception, noted next) in the Modularization of XHTML draft. They are defined in the customization file, itself another module, in Appendix C of XHTML 1.1 - Module-based XHTML. For example:

 <!ENTITY % Inlstruct.class  "br | span" >

Through the abbreviations, you can see that these are structural element types that may appear as inline elements. br is used for line breaks within block elements, while span is an abstract element mostly useful for marking off inline content in ways that aren't reflected by other inline content. This entity and several of its siblings get combined into a larger Inline.class entity:

   <!ENTITY % Inline.class
   "%Inlstruct.class;
   %Inlphras.class;
   %Inlpres.class;
   %I18n.class;
   %Anchor.class;
   %Inlspecial.class;
   %Ruby.class;
   %Inline.extra;"
 > 

One oddity here is Inline.extra – Building XHTML Modules describes no "official" convention for .extra. Inline.extra has this declaration:

   <!ENTITY % Inline.extra
 "| input | select | textarea |  label | button" >

The DTD comments describe how to use this .extra suffix:

   While in some cases this module may need to be rewritten to accommodate changes to the document model, minor extensions may be accomplished by redeclaring any of the three *.extra; parameter entities to contain extension element types as follows:

   %Misc.extra; whose parent may be any block or inline element.
   %Inline.extra; whose parent may be any inline element.
   %Block.extra; whose parent may be any block element.

   If used, these parameter entities must be an OR-separated list beginning with an OR separator ("|"), e.g., "| a | b | c"

While .extra is undocumented (so far) in Building XHTML Modules, it is a critical piece for developers who want to add their own extensions to XHTML 1.1.

The .class suffix also functions in at least one place for attributes. The following entity includes all of the input types:

   <!ENTITY % InputType.class
   "( text | password | checkbox |  radio | submit
   | reset | file | hidden | image  )"
 > 

This is then used in an attribute declaration:

   <!ATTLIST input
   %Common.attrib;
   type %InputType.class; 'text'
   ...
   >

This anomaly probably derives from the input element's unusual use of an attribute to signify its "real" content.
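Returning to the .extra convention described earlier, a customization might redeclare Inline.extra to add an element type of its own. This is only a sketch under stated assumptions: the tooltip and demo element types are hypothetical, and Inline.class is cut down to two members so the fragment stays short.

```xml
<!-- Customization, read before the default declaration.
     The list begins with an OR separator; "tooltip" is hypothetical. -->
<!ENTITY % Inline.extra
   "| input | select | textarea | label | button | tooltip" >

<!-- Simplified stand-ins for the module's class entities -->
<!ENTITY % Inlstruct.class "br | span" >
<!ENTITY % Inline.class
   "%Inlstruct.class;
    %Inline.extra;"
>

<!-- Hypothetical element type whose content model uses the class -->
<!ELEMENT demo ( #PCDATA | %Inline.class; )* >

<!-- Declaration for the hypothetical extension element type -->
<!ELEMENT tooltip ( #PCDATA ) >
```

The leading OR separator is what lets the entity be appended to the end of the class list without producing a malformed content model.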
