Strategies for Managing XHTML Generation Code
Generating XHTML is a more demanding process than generating HTML, if only because XHTML comes with a much stricter set of rules. Meeting these demands doesn't have to mean hours of bug hunting every time you build a complex program, but it may mean that you have to modify the way you write your programs. (No requirement forces you to change, but adhering to these rules may prove easier in the long run.) Pretty much all of the techniques that work with HTML work with XHTML, but you may want to let XHTML's requirements play a larger part in your choices about generation code.
Text

Working with HTML and XHTML documents as text is in some ways the easiest approach and in other ways the hardest. Text is the foundation of markup documents. Working at that level can be straightforward, but it also denies you the privilege of working with information at a higher level (such as the container structures created by XHTML). Writing code that generates raw text, which just happens to be XHTML, requires a lot of attention to detail, especially because XHTML is much less forgiving of errors than HTML. Text-generation strategies may still be useful, especially for projects that need to create more than one version of a document.

Fundamentally, every environment that generates XHTML generates text. It's just a matter of what kinds of abstractions are in use. Probably the easiest way to update text-generating code for the new challenges of XHTML is to add some of those layers of abstraction, separating code that generates markup from code that addresses content. As the abstraction proceeds, you can then add extra logic that ensures that markup is properly balanced or conforms to a required structure. Most programmers already do this to some extent so they can reuse code; in essence, it may just be a matter of refocusing existing work.
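One way to picture such a layer of abstraction is a small writer object that sits between the program and the raw text. The sketch below is illustrative only: the class name XhtmlWriter and its methods are hypothetical, not part of any library mentioned in this article. It keeps markup generation (start and end tags) apart from content (text), escapes content automatically, and refuses to produce output whose tags are not properly balanced.

```python
from xml.sax.saxutils import escape, quoteattr


class XhtmlWriter:
    """Hypothetical helper: collects XHTML text and tracks open elements."""

    def __init__(self):
        self._parts = []  # generated text fragments, in output order
        self._open = []   # stack of element names still awaiting an end tag

    def start(self, name, **attrs):
        # quoteattr() escapes the value and wraps it in quotation marks.
        rendered = "".join(
            f" {key}={quoteattr(value)}" for key, value in attrs.items()
        )
        self._parts.append(f"<{name}{rendered}>")
        self._open.append(name)

    def text(self, content):
        # escape() replaces &, < and > so content cannot break the markup.
        self._parts.append(escape(content))

    def end(self, name):
        # Only the innermost open element may be closed.
        if not self._open or self._open[-1] != name:
            raise ValueError(f"cannot close <{name}>; open elements: {self._open}")
        self._open.pop()
        self._parts.append(f"</{name}>")

    def result(self):
        # Refuse to hand back text that still has unclosed elements.
        if self._open:
            raise ValueError(f"unclosed elements: {self._open}")
        return "".join(self._parts)


writer = XhtmlWriter()
writer.start("p", title="Terms & conditions")
writer.text("Use <br /> rather than <br>.")
writer.end("p")
fragment = writer.result()
```

The calling code never concatenates angle brackets itself, so a forgotten end tag shows up as an exception in the generator rather than as a failure in whatever processes the XHTML later.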
Templates

Template systems, such as Active Server Pages (ASP) and Java Server Pages (JSP), enable developers to mix logic for creating content and structure with general templates that provide an overall framework. In some ways, these approaches are much like the text-generating systems described previously, but they have both advantages and disadvantages compared with that straightforward approach. Templates typically are easier to read and modify, especially for cases in which the generated content is a small portion of the document. At the same time, however, the interaction between the generated code and the information already stored in the template can cause problems that look like they are in the code but are in fact in the template, and vice versa.

There are a few environments in which using XHTML can be difficult because of conflicts between XHTML syntax and the syntax of the development environment. PHP is one example. When PHP's short open tag setting is enabled, the PHP processor treats <? as the marker for where to begin processing, so the XML declaration (<?xml?>) at the top of an XHTML document throws it off. You have two options in authoring your XHTML. The first option is to exclude the XML declaration completely. XHTML 1.0 requires the declaration only when the document's character encoding is something other than UTF-8 or UTF-16 and is not determined by a higher-level protocol, so leaving it out usually isn't a problem; including it is simply a good markup habit. The other option is to disable the short open tag setting (short_open_tag in the PHP configuration) so that PHP recognizes only <?php as its marker to begin parsing. That way, <?xml?> can't throw it off. Disabling the setting may require coordination with your Web site hosting company if you don't have administrative control over your server.
Note: Although Extensible Stylesheet Language Transformations (XSLT) are template-based document generators, the rules they follow are much stricter than those used by the technologies described here.

The XML 1.0 specification already faced similar issues with general entities, which enable developers to include content (including markup) by reference. The solution XML 1.0 enforces is a requirement that all general entities that contain markup must be well formed. If an entity includes a start tag for an element, it must include an end tag for that element. All of the structures inside a general entity must be nested and marked up properly. You can't use general entities to specify parts of markup, such as half a start tag or just an end tag. Taking a similar approach to code generation (every piece of generated content that a template receives is well formed on its own) can solve most of the problems caused by unexpected interactions between the template and the generated content, and should make it easier to track down the origin of such problems when they do occur.

The strategies suggested for text-generating code also apply in large part to template-based XHTML generation. Creating layers of abstraction that go beyond creating streams of characters can help make the code portion of these template-based systems easier to work with, and may make it more reusable across documents and projects.
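The entity-style rule can be applied to almost any template mechanism. As a rough sketch, assuming a plain Python string.Template stands in for an ASP or JSP page, the hypothetical helper checked_fragment below accepts generated markup only if it parses as well-formed XML on its own, and plain text values are escaped before they reach the template. The names PAGE, checked_fragment, and render are invented for this illustration.

```python
from string import Template
from xml.etree import ElementTree
from xml.sax.saxutils import escape

# A stand-in for a template page; $title and $body are the generated parts.
PAGE = Template(
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>$title</title></head>"
    "<body>$body</body>"
    "</html>"
)


def checked_fragment(markup):
    """Return markup unchanged if it is well formed on its own, else raise."""
    # Wrapping in a dummy root lets a fragment hold several sibling elements.
    # fromstring() raises ElementTree.ParseError for half-open or badly
    # nested tags.
    ElementTree.fromstring(f"<wrapper>{markup}</wrapper>")
    return markup


def render(title, body_markup):
    # Plain text is escaped; markup must pass the well-formedness check.
    return PAGE.substitute(title=escape(title), body=checked_fragment(body_markup))


page = render("Q&A", "<p>First</p><p>Second</p>")
```

With this arrangement a fragment such as "<p>half" fails inside render, which points directly at the generating code instead of at the template. The check is deliberately simple: it does not know about XHTML's named entities (such as &nbsp;) and it does not validate against a DTD, so treat it as a first line of defense rather than a complete solution.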
Caution: While template-based systems can produce XHTML, the templates themselves frequently are not XHTML (or even XML) because of their use of constructions such as <%. Among other things, this may force you to store templates separately from XHTML documents if you use an XML-based document management facility. The XML-Apache project is building a template language called eXtensible Server Pages (XSP) that does use XML documents for its templates, but that project is well ahead of most template systems in its zeal for well-formed templates. See http://xml.apache.org/cocoon/wd-xsp.html for a draft of XSP.
Modularization

In general, the most thorough long-term approaches to making XHTML generation clean and maintainable involve creating code modules that do simple things reliably and then connecting these modules to create documents. Reliability is perhaps the most important change moving from HTML to XHTML development, and that reliability is of a somewhat different type. In the HTML world, the code had to produce content that looked consistent in a given browser or browsers; in the XHTML world, the code has to produce content that is structurally, as well as visually, consistent. While an occasional missing end-paragraph tag doesn't cause problems in an HTML browser, it can bring a halt to XHTML processing.

Breaking down the larger problem of building a document into the smaller problems of creating particular structures is one way to make sure that the small problems are solved consistently. It also enhances reusability and makes it easier to update the small problem solutions without interfering with the overall logic of the document.

Several HTML generation systems, such as CGI.pm (the CGI module for Perl) and the Java Servlet Library, already use modules that generate markup based on arguments passed to them through function calls. When developers rely on these modules exclusively, rather than mixing them with explicit text-generation code, updating a system to use XHTML is easy. You just update the module system to an XHTML-compliant version.
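To make the idea of small, reliable modules concrete, here is a minimal sketch in Python that uses the standard library's xml.etree.ElementTree in place of the module systems named above. The functions paragraph, bullet_list, and document are hypothetical names chosen for this example. Each one builds a single structure from its arguments, and because the final text is produced by a serializer rather than by string concatenation, the output is always balanced and escaped.

```python
from xml.etree import ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def paragraph(text):
    """Module: one paragraph element."""
    p = ET.Element("p")
    p.text = text
    return p


def bullet_list(items):
    """Module: an unordered list with one li element per item."""
    ul = ET.Element("ul")
    for item in items:
        ET.SubElement(ul, "li").text = item
    return ul


def document(title, *blocks):
    """Connects the modules above into a complete, serialized html element."""
    # Shortcut for this sketch: the namespace is written as a plain attribute.
    html = ET.Element("html", {"xmlns": XHTML_NS})
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(html, "body")
    body.extend(blocks)
    # The serializer closes every element and escapes &, < and > in text.
    return ET.tostring(html, encoding="unicode")


page = document(
    "Fruit & veg",
    paragraph("In stock:"),
    bullet_list(["apples", "leeks"]),
)
```

If the rules for lists or paragraphs change in a later version of XHTML, only the corresponding function needs to change; the code that calls document stays as it is. A real system would also emit a DOCTYPE declaration, which this sketch leaves out.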
Note: Module systems that generate XHTML are starting to appear (notably a new version of CGI.pm), but it may be a while before these generic systems consistently produce XHTML instead of HTML. If it isn't clear from the documentation, you may want to contact the developer maintaining the markup generation system you are using.

In addition to limiting side effects, adding modularity to your code should help you future-proof it to some extent. XHTML 1.0 marks the first major structural change to HTML since its inception, and developers thus far have been able to rely on older code working just fine in newer browsers. While XHTML 1.0 may be the first change to break that understanding, it certainly will not be the last. XHTML 1.1 won't instantly break XHTML 1.0 processing, but it adds new functionality that may require substantial change to both document-generation code on the server and document-processing code on the client. By modularizing your code, you position yourself to take advantage of the new possibilities XHTML 1.1 will create for extending the HTML vocabulary. XHTML 2.0 is also on the horizon, although probably much further out. XHTML 2.0 may involve significant destruction and reconstruction of some parts of the HTML vocabulary, including linking functionality and other processing that involves external resources.

These various kinds of future-proofing may require a different mindset than the one that has proven so successful at creating large numbers of HTML applications at low cost. Despite the potential for higher development costs per module, however, this new mindset promises long-term upgradability and a much easier task for programmers who need to manage and reuse code over the long term.