Integrating the DOM with XHTML Generation
XSLT's applicability to traditional Web applications is limited by its demand for an XML source document. If your organization hasn't deployed XML already, XSLT may be of limited use as far as XHTML production is concerned. Despite that limitation, however, you may find XSLT useful for managing and modifying your XHTML content. XHTML is, after all, XML, and therefore ripe for processing with XSLT, if you find it appropriate.
Perhaps ironically, documents created with cascading style sheets in mind, which make great use of the class attribute, are almost as easy to work with in XSLT as are XML documents with semantic element names. Because XSLT enables you to specify rules that are dependent on attribute values and names as well as elements, you can "harvest" the semantic content (if any) of your existing XHTML documents. At the same time, you may be able to convert them to other XML formats – from your own vocabularies to more generic vocabularies such as XSL Formatting Objects, Scalable Vector Graphics (SVG), or Synchronized Multimedia Integration Language (SMIL). The more markup information your documents contain, and the more regularly it is applied, the better your chances of applying XSLT to such work.
Even if you don't have plans to do this, and you generate all your Web documents from databases, you may be able to put XSLT to work as a layer of abstraction between your final documents and the data sources that populate them. If you work with multiple databases, and especially if they are distributed widely, you may find it useful to have the databases or some kind of middleware send you their information as sets of XML documents. They can accomplish this in reply to queries, in advance, or both in some kind of caching approach. Then you can use XSLT to knit the results together into a final format.
Although HTML developers have used the Document Object Model (DOM) on the client in some form since about 1997 when the earliest dynamic HTML implementations appeared, its use on the server opens up new horizons in document generation. At the same time, the DOM makes it very easy to create conformant XHTML. Using the DOM to generate documents may not be appropriate in every situation: It takes a very different approach than text generation or templates, and may require retooling as well as rethinking the programming model. For cases in which it does fit, however, the DOM promises to make XHTML conformance much easier while making program structures easier to design formally.
Building Trees, Not Streams
Streams of text require few resources and you can generate them relatively efficiently. Conversely, trees require more resources and they sometimes cost more processing power. However, they open up new possibilities for developers who need to push the envelope. Tree models provide two main advantages to developers building XHTML applications. First, they offer a large degree of modularity to insulate developers from the bugs caused by mistaken textual outputs. Second, they provide a much greater degree of flexibility that enables developers to create an initial tree and then modify it as necessary – perhaps even transforming it into a different structure altogether or reducing it to a small fragment. If you need this kind of reliability and flexibility, and can accept the greater memory and processing demands needed by these tree structures, then you may find it useful to generate documents through the DOM.
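The difference between the two models can be sketched in a few lines of plain JavaScript. This is only an illustration under stated assumptions: the element, escapeText, and serialize functions below are hypothetical stand-ins invented for this sketch, not part of the DOM or of any library, and the tree they build is far simpler than a real DOM tree.

```javascript
// Stream approach: markup is assembled by string concatenation.
// Nothing stops unescaped data from breaking well-formedness.
function streamParagraph(text) {
  return "<p>" + text + "</p>";
}

// Tree approach: a deliberately tiny stand-in for a DOM (illustration only).
function element(name, children) {
  return { name: name, children: children || [] };
}

// Escapes the characters that are not allowed to appear literally in text.
function escapeText(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Serialization is the single place where markup characters are produced,
// so text is always escaped and every start tag gets its matching end tag.
function serialize(node) {
  if (typeof node === "string") {
    return escapeText(node);
  }
  if (node.children.length === 0) {
    return "<" + node.name + " />";
  }
  return "<" + node.name + ">" +
    node.children.map(serialize).join("") +
    "</" + node.name + ">";
}

var risky = "Fish & Chips <fresh>";
var fromStream = streamParagraph(risky);           // not well-formed XML
var fromTree = serialize(element("p", [risky]));   // escaped, well-formed
```

The stream version passes the ampersand and angle brackets straight through, while the tree version cannot emit them unescaped because text never touches the output except through the serializer.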
Note The DOM Level 1 specification, which includes all the functionality you use in this article, is available at http://www.w3.org/TR/REC-DOM-Level-1. If you're feeling curious, DOM Level 2 is available from http://www.w3.org/TR/DOM-Level-2. Information on further DOM development is available at http://www.w3.org/DOM/.
DOM Implementations
The Document Object Model, as specified by the W3C, comes in several Levels, all of which give scripts and programs access to sets of document information through an API. The DOM API is officially specified through a CORBA IDL file (you don't need to know anything about that to use the DOM), but is more commonly used in its Java and JavaScript translations. The DOM doesn't specify everything about document processing and handling – for example, the W3C addresses loading, creating, and saving documents only in the Level 3 work that's just getting started. As a result of this approach, the world of DOM implementations is somewhat fragmented. Besides the differences among the Java, JavaScript, and CORBA versions of the DOM, it's extremely difficult to write complete DOM code for multiple environments. While the core document generation may remain the same, the beginning and end of the process may vary substantially as you move DOM code from environment to environment and even from server to browser and back again.
Fortunately, the basic principles are pretty safe. If you learn the fundamentals of manipulating the DOM within an Active Server Pages (ASP) environment, you can transfer a substantial amount of that knowledge to work using Java XML parsers from Sun, IBM, Apache, and others, or the JavaScript processing built into Mozilla/Netscape Navigator 6. The basic concepts are the same across all of these systems, and the implementations should (hopefully) converge as the W3C releases more complete standards and developers build on those standards. The next section takes a look at the basic principles in one environment, ASP, while pointing out how the surrounding script may differ in other environments.
Note The DOM comes in various flavors as well as Levels. In this article, you work with the Core of the DOM Level 1 to generate code. More HTML-specific functionality is available in the HTML portion of the DOM Level 1. However, that functionality tends to be more appropriate to client-side dynamic HTML applications and usually isn't supported in the tools used to generate XML — even with an HTML vocabulary.
DOM Examples
While the following examples use Active Server Pages as their development environment, they mostly use ASP as a programming environment and ignore its capability to create templates. While you can mix and match the DOM and template approaches, the surrounding material in the template may compromise the reliability of the markup created through the code. This is especially true if that template includes other generated content. Because the DOM Level 1 doesn't support editing DocumentType nodes, you still have to use the template portion for the XML declaration (if appropriate) and DOCTYPE declaration (required for XHTML 1.0 conformance).
Note Even if you don't use ASP or don't like ASP, the following examples include a lot of basic DOM vocabulary and usage that is applicable to developing in other environments.
You create your first DOM-based XHTML document as a classic "Hello World." The code you create builds an XHTML document that looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">
<head>
<title>Hello World!</title>
</head>
<body>
<h1>Hello World!</h1>
<p>Hello World!</p>
</body>
</html>
Because of the DOM's limitations, you have to create a shell for the XML declaration and the DOCTYPE that looks like this:
<%@LANGUAGE=JavaScript%><?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!--Document-generating code goes here-->
Because most ASP implementations default to VBScript, and you're using the JavaScript bindings of the DOM from the specification, you need to tell ASP that you're using JavaScript. The XML declaration has to follow the directive immediately because, if whitespace comes before it, it is treated as a processing instruction (and likely ignored).
The next portion of the code creates a document object you can manipulate using Microsoft's syntax. (The DOM Level 1 doesn't provide a standard mechanism for this process.)
<%
var myDoc=Server.CreateObject("Microsoft.XMLDOM");
Once you have a document, you need to create a root element (in this case, html). The createElement() method takes an element name for its argument and returns an element object you then can manipulate.
var htmlNode=myDoc.createElement("html");
The html element needs some attributes to declare the namespace and the languages you are using here. Once you have the html element, you can use the setAttribute method to create the xmlns, xml:lang, and lang attributes and set their values.
htmlNode.setAttribute("xmlns","http://www.w3.org/1999/xhtml");
htmlNode.setAttribute("xml:lang","en-US");
htmlNode.setAttribute("lang","en-US");
The html element now exists, and it has a full complement of attributes, but you still need to establish it as the root element for the document.
myDoc.documentElement=htmlNode;
Now that you have an html element, it's time to create the rest of the document content. To do that, you create elements and text nodes and then attach the text nodes and element nodes to their parents. The order in which you create the nodes isn't very important, but the code typically follows the order of the document to keep debugging from getting too confusing. Let's start with the head and title elements:
var headNode=myDoc.createElement("head");
var titleNode=myDoc.createElement("title");
var titleText=myDoc.createTextNode("Hello World!");
titleNode.appendChild(titleText);
headNode.appendChild(titleNode);
htmlNode.appendChild(headNode);
You build the head element by creating all of its nodes separately and then adding them to their appropriate container elements. The createElement and createTextNode methods create elements and text respectively, while the appendChild method establishes the connections between these nodes.
var bodyNode=myDoc.createElement("body");
var h1Node=myDoc.createElement("h1");
var h1Text=myDoc.createTextNode("Hello World!");
var paraNode=myDoc.createElement("p");
var paraText=myDoc.createTextNode("Hello World!");
h1Node.appendChild(h1Text);
paraNode.appendChild(paraText);
bodyNode.appendChild(h1Node);
bodyNode.appendChild(paraNode);
htmlNode.appendChild(bodyNode);
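The create-then-append pattern repeated above can be folded into a small helper. The following is a minimal sketch, not part of the original script: appendTextElement and buildHelloBody are hypothetical names introduced here, and the code assumes only that the document object passed in supports the DOM Level 1 Core methods createElement, createTextNode, and appendChild.

```javascript
// Hypothetical helper (not part of the DOM): creates an element holding a
// single text node and appends it to parent. Uses only the DOM Level 1 Core
// methods createElement, createTextNode, and appendChild.
function appendTextElement(doc, parent, name, text) {
  var node = doc.createElement(name);
  node.appendChild(doc.createTextNode(text));
  parent.appendChild(node);
  return node;
}

// Builds the same body as the Hello World example, given any document
// object that implements the three methods used above.
function buildHelloBody(doc) {
  var bodyNode = doc.createElement("body");
  appendTextElement(doc, bodyNode, "h1", "Hello World!");
  appendTextElement(doc, bodyNode, "p", "Hello World!");
  return bodyNode;
}
```

In the ASP script, the ten body-building statements could then be replaced by htmlNode.appendChild(buildHelloBody(myDoc)); the trade-off is that intermediate nodes such as h1Text no longer have their own variables.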
Finally, you write out the XML document you've modeled using the xml property of the myDoc object:
Response.write(myDoc.xml);
%>
The xml property (a Microsoft extension) produces XML content with no additional whitespace. If you need whitespace, you can create text nodes containing it.
This approach is hardly limited to generating XHTML whose content is known in advance. It also works easily with material coming from databases, forms, or other sources. The next example uses a static XHTML form to collect information and passes it to an XHTML-generating script.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">
<head><title>Forms to XHTML</title></head>
<body><h1>Address Collector</h1>
<form action="genxhtml.asp" method="post">
<p>First Name:<input type="text" name="firstname" size="20" /></p>
<p>Last Name:<input type="text" name="lastname" size="30" /></p>
<p>Address 1:<input type="text" name="address1" size="40" /></p>
<p>Address 2:<input type="text" name="address2" size="40" /></p>
<p>City:<input type="text" name="city" size="25" /> State/Province:<input type="text" name="state" size="25" /></p>
<p>ZIP/Postal Code:<input type="text" name="postalcode" size="15" /> Country:<input type="text" name="country" size="25" /></p>
<p><input type="submit" name="Submit" /></p>
</form>
</body></html>
The recipient of this form information uses the DOM to insert the material into an XHTML document. Most of the techniques used look like those in the previous example, but explanations of a few differences follow the code:
<%@LANGUAGE=JavaScript%><?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<%
var myDoc=Server.CreateObject("Microsoft.XMLDOM");
var htmlNode=myDoc.createElement("html");
htmlNode.setAttribute("xmlns","http://www.w3.org/1999/xhtml");
htmlNode.setAttribute("xml:lang","en-US");
htmlNode.setAttribute("lang","en-US");
myDoc.documentElement=htmlNode;
var headNode=myDoc.createElement("head");
var titleNode=myDoc.createElement("title");
var titleText=myDoc.createTextNode("Address of "
+ Request.Form.item("firstname") + " " + Request.Form.item("lastname") );
titleNode.appendChild(titleText);
headNode.appendChild(titleNode);
htmlNode.appendChild(headNode);
var bodyNode=myDoc.createElement("body");
var nameNode=myDoc.createElement("p");
nameNode.setAttribute("class","name");
var firstNameNode=myDoc.createElement("span");
firstNameNode.setAttribute("class","firstName");
var firstNameText=myDoc.createTextNode(Request.Form.item("firstname"));
firstNameNode.appendChild(firstNameText);
var nameSeparatorNode=myDoc.createTextNode(" ");
var lastNameNode=myDoc.createElement("span");
lastNameNode.setAttribute("class","lastName");
var lastNameText=myDoc.createTextNode(Request.Form.item("lastname"));
lastNameNode.appendChild(lastNameText);
nameNode.appendChild(firstNameNode);
nameNode.appendChild(nameSeparatorNode);
nameNode.appendChild(lastNameNode);
bodyNode.appendChild(nameNode);
var addressNode=myDoc.createElement("div");
addressNode.setAttribute("class","address");
var line1Node=myDoc.createElement("p");
line1Node.setAttribute("class","line1");
var line1Text=myDoc.createTextNode(Request.Form.item("address1"));
line1Node.appendChild(line1Text);
addressNode.appendChild(line1Node);
var line2Node=myDoc.createElement("p");
line2Node.setAttribute("class","line2");
var line2Text=myDoc.createTextNode(Request.Form.item("address2"));
line2Node.appendChild(line2Text);
addressNode.appendChild(line2Node);
var cityNode=myDoc.createElement("span");
cityNode.setAttribute("class","city");
var cityText=myDoc.createTextNode(Request.Form.item("city"));
cityNode.appendChild(cityText);
addressNode.appendChild(cityNode);
var citySeparatorNode=myDoc.createTextNode(", ");
addressNode.appendChild(citySeparatorNode);
var stateNode=myDoc.createElement("span");
stateNode.setAttribute("class","state");
var stateText=myDoc.createTextNode(Request.Form.item("state"));
stateNode.appendChild(stateText);
addressNode.appendChild(stateNode);
var postalSpaceNode=nameSeparatorNode.cloneNode(false);
addressNode.appendChild(postalSpaceNode);
var postalNode=myDoc.createElement("span");
postalNode.setAttribute("class","postalcode");
var postalText=myDoc.createTextNode(Request.Form.item("postalcode"));
postalNode.appendChild(postalText);
addressNode.appendChild(postalNode);
var countryNode=myDoc.createElement("p");
countryNode.setAttribute("class","country");
var countryText=myDoc.createTextNode(Request.Form.item("country"));
countryNode.appendChild(countryText);
addressNode.appendChild(countryNode);
bodyNode.appendChild(addressNode);
htmlNode.appendChild(bodyNode);
Response.write(myDoc.xml);
%>
In creating the title for the document, you combine multiple fields from the form into a single string of
text that becomes a single text node:
var titleText=myDoc.createTextNode("Address of "
+ Request.Form.item("firstname") + " " + Request.Form.item("lastname") );
This works well because the title element contains only text. If it were mixed text and elements, it would require a more complicated approach (demonstrated the next time the script uses the name information):
var nameNode=myDoc.createElement("p");
nameNode.setAttribute("class","name");
var firstNameNode=myDoc.createElement("span");
firstNameNode.setAttribute("class","firstName");
var firstNameText=myDoc.createTextNode(Request.Form.item("firstname"));
firstNameNode.appendChild(firstNameText);
var nameSeparatorNode=myDoc.createTextNode(" ");
var lastNameNode=myDoc.createElement("span");
lastNameNode.setAttribute("class","lastName");
var lastNameText=myDoc.createTextNode(Request.Form.item("lastname"));
lastNameNode.appendChild(lastNameText);
nameNode.appendChild(firstNameNode);
nameNode.appendChild(nameSeparatorNode);
nameNode.appendChild(lastNameNode);
bodyNode.appendChild(nameNode);
This chunk of code is notable for a number of reasons. First, because the name appears on a single line, the entire name is contained in a single p element. Meanwhile, the first name and last name are contained in span elements. The name could be constructed as it is in the title (just text), but using span elements and class attributes preserves additional information about the content and makes it possible to style parts of the name differently or address them as a group through client-side dynamic HTML. Also worth noting is the creation of the name separator node: even though it's just a space, it has to be created and appended explicitly. Later in the code, a different separator appears:
var citySeparatorNode=myDoc.createTextNode(", ");
addressNode.appendChild(citySeparatorNode);
Companies that want to please the post office, rather than follow popular usage, can change the comma-space to just a space. Between the state and postal code, however, you can do something different:
var postalSpaceNode=nameSeparatorNode.cloneNode(false);
addressNode.appendChild(postalSpaceNode);
Suppose you want the space between the state and postal code to be the same as the space between the first and last names. The same node can't have multiple parents, so you can't simply append the name separator a second time. Instead, use the cloneNode() method, which returns a new copy of the node. Passing it the argument false makes it return a shallow copy of the node without descending through possible layers of element content. You only want a space, which is a very simple node, so the argument doesn't matter very much.
Yes, this might seem excessively complicated. On the other hand, it also helps ensure that your documents will be clean XML, every single time.
The XHTML produced by the generator is once again without whitespace for the most part, although whitespace does appear where you explicitly add it:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US"><head><title>Address of Jimbo Jones</title></head><body><p class="name"><span class="firstName">Jimbo</span> <span class="lastName">Jones</span></p><div class="address"><p class="line1">134 Rocket Science Way</p><p class="line2">Apt. 27B</p><span class="city">Out There</span>, <span class="state">NW</span> <span class="postalcode">00001</span><p class="country">USA</p></div></body></html>
So far, this seems like an enormous amount of work to produce a relatively minor result. The benefits start to appear when you need to do more sophisticated things with your document structure, such as add entire sections to the document or transform one kind of document into another.
In the next example, you add your address information to an existing XHTML document, using XHTML documents within the DOM as a new kind of template, an easily modifiable document. The template you use looks like this:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">
<head><title>Your Prize</title></head>
<body>
<div />
<p>Dear Fool,</p>
<p>You have won a million, trillion dollars!!!!! In laminated game money, that is. Please contact us to collect your prize at +1 888 555 1212. Shipping and handling fees of up to ten thousand dollars may be required to collect your prize.</p>
<p><strong>Hahahaha!</strong></p>
<p>the prize committee (we prys your money away from you!)</p>
</body></html>
While it may or may not be a legal letter to send someone (although it's an obvious parody), it is almost conformant XHTML. Next, you modify the DOM code you've been using to load this document, add the address information to the empty div element, and output the result as a letter.
Caution While it is preferable to include the DOCTYPE declaration in the template, the ASP engine and the XML parser seem to choke on XHTML documents that contain DOCTYPE declarations loaded as XML. You have to put that information into the script once again.
Most of the code is the same as the DOM code used to generate the preceding address. The main difference lies at the start of the code where you load the template document and use it as a base.
<%@LANGUAGE=JavaScript%><?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<%
var sourceFile=Server.MapPath("prizexhtm2.xml");
var myDoc=Server.CreateObject("Microsoft.XMLDOM");
myDoc.async=false;
myDoc.load(sourceFile);
var changeNode=myDoc.getElementsByTagName("div").item(0);
var nameNode=myDoc.createElement("p");
nameNode.setAttribute("class","name");
var firstNameNode=myDoc.createElement("span");
firstNameNode.setAttribute("class","firstName");
var firstNameText=myDoc.createTextNode(Request.Form.item("firstname"));
firstNameNode.appendChild(firstNameText);
var nameSeparatorNode=myDoc.createTextNode(" ");
var lastNameNode=myDoc.createElement("span");
lastNameNode.setAttribute("class","lastName");
var lastNameText=myDoc.createTextNode(Request.Form.item("lastname"));
lastNameNode.appendChild(lastNameText);
nameNode.appendChild(firstNameNode);
nameNode.appendChild(nameSeparatorNode);
nameNode.appendChild(lastNameNode);
changeNode.appendChild(nameNode);
var addressNode=myDoc.createElement("div");
addressNode.setAttribute("class","address");
var line1Node=myDoc.createElement("p");
line1Node.setAttribute("class","line1");
var line1Text=myDoc.createTextNode(Request.Form.item("address1"));
line1Node.appendChild(line1Text);
addressNode.appendChild(line1Node);
var line2Node=myDoc.createElement("p");
line2Node.setAttribute("class","line2");
var line2Text=myDoc.createTextNode(Request.Form.item("address2"));
line2Node.appendChild(line2Text);
addressNode.appendChild(line2Node);
var cityNode=myDoc.createElement("span");
cityNode.setAttribute("class","city");
var cityText=myDoc.createTextNode(Request.Form.item("city"));
cityNode.appendChild(cityText);
addressNode.appendChild(cityNode);
var citySeparatorNode=myDoc.createTextNode(", ");
addressNode.appendChild(citySeparatorNode);
var stateNode=myDoc.createElement("span");
stateNode.setAttribute("class","state");
var stateText=myDoc.createTextNode(Request.Form.item("state"));
stateNode.appendChild(stateText);
addressNode.appendChild(stateNode);
var postalSpaceNode=nameSeparatorNode.cloneNode(false);
addressNode.appendChild(postalSpaceNode);
var postalNode=myDoc.createElement("span");
postalNode.setAttribute("class","postalcode");
var postalText=myDoc.createTextNode(Request.Form.item("postalcode"));
postalNode.appendChild(postalText);
addressNode.appendChild(postalNode);
var countryNode=myDoc.createElement("p");
countryNode.setAttribute("class","country");
var countryText=myDoc.createTextNode(Request.Form.item("country"));
countryNode.appendChild(countryText);
addressNode.appendChild(countryNode);
changeNode.appendChild(addressNode);
Response.write(myDoc.xml);
%>
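Both scripts reuse the name separator by cloning it, because a node can have only one parent. A minimal sketch of how that rule might be wrapped in a helper follows. The name appendOrClone is hypothetical and not part of the DOM; the sketch assumes only the DOM Level 1 Core members parentNode, cloneNode, and appendChild, and that an unattached node reports a null parentNode.

```javascript
// Hypothetical helper (not part of the DOM): appends node to parent, cloning
// it first if it is already attached somewhere else. Pass deep=true to copy
// an element's descendants as well; for a text node the flag makes little
// difference.
function appendOrClone(parent, node, deep) {
  var toAppend = node.parentNode ? node.cloneNode(Boolean(deep)) : node;
  parent.appendChild(toAppend);
  return toAppend;
}
```

With this helper, appendOrClone(nameNode, nameSeparatorNode) would attach the original space and a later appendOrClone(addressNode, nameSeparatorNode) would attach a copy, so the calling code would not need to track which separators are already in use.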
The main difference between this script and the prior example is the code at the beginning that loads the template:
var sourceFile=Server.MapPath("prizexhtm2.xml");
var myDoc=Server.CreateObject("Microsoft.XMLDOM");
myDoc.async=false;
myDoc.load(sourceFile);
var changeNode=myDoc.getElementsByTagName("div").item(0);
The technique for loading files is a Microsoft extension, once again unspecified by the W3C DOM specs. Basically, this code creates a full path to a file in the same folder as the script, which is used as a template. The XML parser then parses that file – setting myDoc.async to false ensures that the entire document is loaded before processing continues. Then you grab the empty div element so that you can put the information you receive from the form into that element.
The markup this script generates is also interesting. It shows some inconsistencies in how the Microsoft XML parser handles whitespace from documents it loads as opposed to whitespace from documents created through code.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">
<head><title>Your Prize</title></head>
<body>
<div><p class="name"><span class="firstName">Jimbo</span> <span class="lastName">Jones</span></p><div xmlns="" class="address"><p class="line1">134 Rocket Science Way</p><p class="line2">Apt. 27B</p><span class="city">Out There</span>, <span class="state">NW</span> <span class="postalcode">00001</span><p class="country">USA</p></div></div>
<p>Dear Fool,</p>
<p>You have won a million, trillion dollars!!!!! In laminated game money, that is. Please contact us to collect your prize at +1 888 555 1212. Shipping and handling fees of up to ten thousand dollars may be required to collect your prize.</p>
<p><strong>Hahahaha!</strong></p>
<p>the prize committee (we prys your money away from you!)</p>
</body></html>
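The template-loading step in the script above assumes the file parses cleanly. A hedged sketch of a more defensive version follows. The function name loadTemplate is hypothetical; async, load, and parseError are Microsoft extensions rather than W3C DOM Level 1 members, and the sketch relies on load() returning false when parsing fails and on parseError.reason describing the failure.

```javascript
// Hypothetical helper: loads an XML template synchronously and throws if the
// parser reports a failure. async, load, and parseError are Microsoft
// extensions to the DOM, not W3C DOM Level 1 members.
function loadTemplate(doc, path) {
  doc.async = false;            // finish loading before load() returns
  var loaded = doc.load(path);  // false signals that the document did not parse
  if (!loaded) {
    throw new Error("Could not load " + path + ": " + doc.parseError.reason);
  }
  return doc;
}
```

In the ASP script, this might be called as var myDoc=loadTemplate(Server.CreateObject("Microsoft.XMLDOM"), sourceFile); so that a missing or malformed template stops the script with a message instead of failing later when getElementsByTagName finds no div element.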
While these examples are fairly simple, you can apply the same mechanisms to tasks such as building tables around information from databases or XML document structures, rearranging document content, or deleting pieces from a document.
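As one illustration of the table-building case, the following sketch turns rows of data into a table element. It is an assumption-laden example rather than code from the scripts above: buildTable is a hypothetical name, the array-of-arrays input shape is invented for the sketch, and only the DOM Level 1 Core methods createElement, createTextNode, and appendChild are used.

```javascript
// Hypothetical helper: builds a table element from an array of row arrays.
// The input shape (rows of cell values) is an assumption for this sketch.
function buildTable(doc, rows) {
  var tableNode = doc.createElement("table");
  var tbodyNode = doc.createElement("tbody");
  for (var r = 0; r < rows.length; r++) {
    var rowNode = doc.createElement("tr");
    for (var c = 0; c < rows[r].length; c++) {
      var cellNode = doc.createElement("td");
      // Cell values become text nodes, so the serializer handles escaping.
      cellNode.appendChild(doc.createTextNode(String(rows[r][c])));
      rowNode.appendChild(cellNode);
    }
    tbodyNode.appendChild(rowNode);
  }
  tableNode.appendChild(tbodyNode);
  return tableNode;
}
```

A call such as bodyNode.appendChild(buildTable(myDoc, [["Jimbo","Jones"],["Out There","NW"]])) would add a two-row table; in practice the rows would come from a database result or another XML document rather than a literal array.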
Making Logic and Structure Mobile
The Document Object Model and the code it tends to produce are both somewhat unwieldy, but the results can trim unwieldy projects down to size. The Document Object Model lurks at the boundary between HTML and XML, developed with an eye toward the former but quite useful for tasks involving the latter. On the browser, you may want to take advantage of its features for addressing the HTML vocabulary and various understandings built around that vocabulary. On the server, you can use it to create documents from an XML perspective. XHTML requires an understanding of both of these perspectives, so the DOM is a natural fit.
Perhaps the most important thing about the DOM is that it enables you to partition your applications among different systems however you find appropriate. In this regard, it is much like Extensible Stylesheet Language Transformations (XSLT), described in the last article, but to some extent it is even more powerful. Because the tree structures created by parsing documents into a DOM remain manipulable, and aren't simply the output of a transformation, the DOM offers flexibility that goes well beyond the simple document generation just shown (although this is not well implemented across browsers yet). You could move (if appropriate) the scripts for creating documents and combining documents to client browsers, which then would run the same code and generate the same document. The Microsoft-specific features used to create and output the document would need updating (as even Internet Explorer uses slightly different syntax for these), but the core logic is easily transferred. (Hopefully, the development of DOM Level 3 will complete this picture and make the logic fully transferable.)
This combination of features, some of which are admittedly promises, may mean that XHTML and the DOM finally will make the old promises of dynamic HTML viable. Building applications that run inside of (and outside of) browsers using the data transmitted over the Web for more sophisticated things than pop-up outlines and drag-and-drop games will be a lot easier, even in situations that require support for multiple browser environments.
