Fragmenting XHTML
Fragmentation appears to be a cure for the many ills of the Web, pointing the way forward to new XML vocabularies and new possibilities. The concrete implementation details of XHTML 1.1, however, look rather scary. Contained in three drafts with a total of about 180 pages, the XHTML 1.1 specs are a daunting collection of rules (and applications of those rules) that apply to the XHTML vocabulary. Fortunately, while the rules make use of XML's funkier tools, the way they actually work isn't very painful, and developers may be able to avoid the frightening details.
Note: The content of this article is based on the 5 January 2000 Last Call Working Drafts of the XHTML 1.1 specifications. Some content may change between the time of this writing and the final approval of the specifications by the W3C, so you should check to find out the current or final status of these issues.
XHTML as Framework
Unlike its predecessors, XHTML 1.1 provides an architectural framework for syntax rather than a simple concrete implementation. XHTML 1.1's architecture for defining modules is effectively a layer on top of XML 1.0's rules for creating DTDs, and its own implementation of the XHTML vocabulary is a layer on top of that one. To keep these layers and their interactions manageable, XHTML 1.1 is defined in three separate documents:
- Building XHTML Modules (http://www.w3.org/TR/xhtml-building) provides the formal framework on which the XHTML modules (and other modules) are built.
- Modularization of XHTML (http://www.w3.org/TR/xhtml-modularization) describes how you implement XHTML 1.1 using that framework.
- XHTML 1.1 - Module-based XHTML (http://www.w3.org/TR/xhtml11) describes how you create XHTML 1.1 documents using these modules.
In a sense, XHTML is two separate parts defined in three specs. The first part is the framework – how to create the modules (defined in Building XHTML Modules) and how to reassemble them as documents (defined in XHTML 1.1 - Module-based XHTML). The second part is the implementation that Modularization of XHTML – and to some extent XHTML 1.1 - Module-based XHTML – defines. This article walks through the framework on the way to the implementation explanation, breaking down each component of XHTML while staying within its general bounds. The framework combines a set of rules for creating modules and different kinds of descriptions of those modules, as well as a set of rules for integrating those modules to create a larger whole. The process of breaking XHTML into modules uses the former set of tools, while documents that use just XHTML rely on the latter.
Abstract Modules
XHTML prescribes both formal and informal ways to describe modules. Abstract modules are documents intended purely for human consumption, helping readers avoid the tangle of parameter entity processing needed by the formal tools for describing modules. This level of description is useful both for documentation and planning, forcing developers to specify what their modules contain in a format that goes beyond the prickly formal tools of DTDs and XML Schemas. Abstract modules are not required for conformance to the XHTML 1.1 specifications, but their use can make creating and using XHTML 1.1 modules much easier.
Note: Abstract modules are defined in Section 4 of Building XHTML Modules, available at http://www.w3.org/TR/xhtml-building/abstraction.html#s_abstraction.
Abstract modules are basically tables with some supporting textual content. The tables consist of lists of elements with columns for attributes and minimal content models. Because some elements may be defined with content sets, such sets may be described in ways that aren't explicitly included in the table. Content sets are typically used repeatedly in multiple elements, so this special treatment probably makes sense. No such provision is made for sets of attributes, however. (One exception: using Common as an identifier for a core set of attributes in the XHTML 1.1 DTDs.)
Within those tables, XHTML uses a semiformal syntax that looks like an extended (and reduced) version of XML DTD syntax. The simplest kind of abstract module describes a single element type: for example, textElement, which uses the XHTML Common set of attribute declarations (defined at http://www.w3.org/TR/xhtml-modularization/xhtml_modules.html#s_basicattributes) and contains only text. Most modules undoubtedly are more complex than this one, but sometimes only a single element is needed to add functionality.
Before you move on to a more complicated example, you should note some of the pieces that are missing from the abstract module descriptions created in accordance with the Building XHTML Modules draft. No information is provided about namespaces. This is reasonable when working strictly within XHTML, where all the parts share a common namespace. While prefixes may appear in the element names, the URIs they map to still need to be documented somewhere. Also missing is an explanation of how you should integrate this module with other modules. It isn't clear how to use this module and its components appropriately within an XHTML framework. This kind of documentation should form an important supplement to the abstract module framework described in the specification itself.
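As a rough sketch of the table structure described above, an abstract module can be modeled as rows that each hold an element name, its attributes, and its minimal content model. The class and function names here are hypothetical and are not part of any XHTML specification; the single row mirrors the textElement description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbstractElement:
    """One row of an abstract module table (hypothetical model)."""
    name: str
    attributes: tuple
    content_model: str


def render_module(rows):
    """Render rows as a plain-text table, one element per line."""
    lines = ["Elements | Attributes | Minimal Content Model"]
    for row in rows:
        attrs = ", ".join(row.attributes)
        lines.append(f"{row.name} | {attrs} | {row.content_model}")
    return "\n".join(lines)


# textElement uses the Common attribute set and contains only text.
text_module = [AbstractElement("textElement", ("Common",), "PCDATA")]
print(render_module(text_module))
```

Note that this model has no field for a namespace URI or for integration instructions, which matches the gaps in the abstract module format noted above.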
Keeping those warnings in mind, take a look at one of the abstract modules defined in Modularization of XHTML to see how these tools are used (see sidebar). The Forms Module is fairly complex but familiar to most HTML developers, and it contains a variety of content models. While its actual content may change on the path to becoming a W3C recommendation, it has some excellent examples of the abstract module syntax in action and shows how additional textual content can fill in the gaps of an abstract module. Let's start with the module in the sidebar (from Section 4.5.2), and then explore its pieces.
The Forms Module provides all of the forms features found in HTML 4.0. Specifically, the Forms Module supports the elements listed in its table, along with their attributes and minimal content models. This module defines two content sets:
Form form | fieldset
Formctrl input | select | textarea | label | button
When this module is used, it adds the Form content set to the Block content set and it adds the Formctrl content set to the Inline content set as these are defined in the Basic Text Module. The Forms Module is a superset of the Basic Forms Module. These modules may not be used together in a single document type.
Let's start by examining the attributes of the form element type. The first entry, Common, refers to the set of attributes described earlier. In the HTML and PDF versions of the Modularization of XHTML draft, this information is provided through cross-references and in Section 4.1.3. The method attribute, which accepts only two values, indicates this through the use of the vertical bar:
method ("get" | "post")
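As a rough illustration of how such an attribute entry can be read mechanically: quoted tokens separated by vertical bars are literal values, while a single unquoted token names an attribute type. The function name is hypothetical, the entry `action (URI)` is assumed purely for illustration, and `get` and `post` are the two values HTML 4 defines for method.

```python
import re


def parse_attribute_entry(entry):
    """Split an entry such as 'method ("get" | "post")' into its parts.

    Returns (name, kind, values): kind is "enumeration" when every token
    is quoted, and "type" when the single token is an unquoted name.
    """
    match = re.fullmatch(r'\s*([\w:.-]+)\s*\((.*)\)\s*', entry)
    if match is None:
        raise ValueError(f"not an attribute entry: {entry!r}")
    name, body = match.group(1), match.group(2)
    tokens = [tok.strip() for tok in body.split("|")]
    if all(len(t) >= 2 and t[0] == '"' and t[-1] == '"' for t in tokens):
        # Quoted tokens are literal attribute values.
        return name, "enumeration", [t[1:-1] for t in tokens]
    if len(tokens) == 1 and tokens[0].isidentifier():
        # An unquoted token is the name of an attribute type.
        return name, "type", tokens
    raise ValueError(f"cannot classify values in: {entry!r}")


print(parse_attribute_entry('method ("get" | "post")'))
print(parse_attribute_entry('action (URI)'))
```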
Another technique employed here that isn't documented in the Building XHTML Modules draft is the use of quotes around the possible values for the attribute. XML 1.0 DTDs don't quote enumerated values, but the quotes are necessary here to differentiate these attribute values from the names used for attribute types.
The Minimal Content Model for the form element type uses several content sets and some of the tools described in the preceding sidebar table:
(Heading | Block - form | fieldset)+
The content model for the form element type may include elements from the Heading content set (defined in Section 4.2.2, the Basic Text Module, along with Block, Inline, and Flow). Elements from the Block content set are also welcome, with the exception of the form element itself. The fieldset element, defined within this module, also may appear inside the form element type. Because the final character is a + rather than a *, at least one child element from this range of choices must appear. The input element type is notable for its use of the EMPTY content type, while most of the other elements in this module allow text (PCDATA) along with other choices.
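The reading of this content model given above can be sketched as a small check on a form element's child elements. The set memberships below are abbreviated assumptions made for illustration (consult the Basic Text Module for the real lists), and the function names are hypothetical.

```python
# Illustrative, abbreviated content sets (assumed memberships).
HEADING = frozenset({"h1", "h2", "h3", "h4", "h5", "h6"})
# Block after the Forms Module has added its Form content set to it.
BLOCK = frozenset({"address", "blockquote", "div", "p", "pre",
                   "form", "fieldset"})


def allowed_form_children():
    """Expand (Heading | Block - form | fieldset) into element names."""
    return HEADING | (BLOCK - {"form"}) | {"fieldset"}


def form_children_valid(children):
    """True if children satisfy (Heading | Block - form | fieldset)+."""
    allowed = allowed_form_children()
    # The trailing + requires at least one child; * would allow none.
    return len(children) >= 1 and all(c in allowed for c in children)


print(form_children_valid(["h2", "p", "fieldset"]))
print(form_children_valid([]))
print(form_children_valid(["p", "form"]))
```

This only checks direct children by name; it does not validate the text content or the attributes of those children.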
The two content sets defined here, Form and Formctrl, are used inside of the module only to exclude their content (using the - indicator) rather than to add them to content models. At the same time, however, the notes at the bottom make clear that the module adds these content sets to the Block and Inline content sets defined in Section 4.2.2. As with XML 1.0, you should treat the names used for content sets and elements as case-sensitive: the form element type and the Form content set are not the same thing.
Another important aspect to consider in the notes at the bottom of the module is the potential for conflict with the Basic Forms Module, which defines (in Section 4.5.1) a subset of the larger Forms Module. Including both can cause validation problems because the declarations conflict. Reading the fine print can keep you out of trouble when you put together the W3C's own modules, and you should make sure to include such documentation in your own abstract modules as well.
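The module notes discussed above can be sketched as two small operations: merging the Forms Module's content sets into Block and Inline, and rejecting a selection that includes both forms modules. All function names are hypothetical, and the starting Block and Inline memberships are abbreviated assumptions.

```python
def add_forms_module(content_sets):
    """Return a copy of content_sets with the Forms Module merged in."""
    merged = {name: set(members) for name, members in content_sets.items()}
    merged["Form"] = {"form", "fieldset"}
    merged["Formctrl"] = {"input", "select", "textarea", "label", "button"}
    # The module adds Form to Block and Formctrl to Inline.
    merged.setdefault("Block", set()).update(merged["Form"])
    merged.setdefault("Inline", set()).update(merged["Formctrl"])
    return merged


# Modules that may not be used together in a single document type.
CONFLICTS = [frozenset({"Basic Forms", "Forms"})]


def check_modules(selected):
    """Raise ValueError if two mutually exclusive modules are selected."""
    chosen = set(selected)
    for pair in CONFLICTS:
        if pair <= chosen:
            raise ValueError("conflicting modules: " + ", ".join(sorted(pair)))


base = {"Block": {"p", "div"}, "Inline": {"em", "span"}}  # abbreviated
sets = add_forms_module(base)
# Names are case-sensitive: "Form" is a content set, "form" an element.
print("Form" in sets, "form" in sets, "form" in sets["Block"])
```

Because the dictionary keys are compared exactly, the Form content set and the form element never collide, which is the same case-sensitivity the text describes.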
