The magazine of the Melbourne PC User Group

SGML/XML/XHTML Resources - Part 4
Major Keary
majkeary@netscape.com.au

Why XHTML?

Browsers that parse HTML are bloated; the reason is that vendors have been at pains to accommodate sloppy and ill-formed coding. Browser software has to be expert-system enabled in order to cope with poor HTML coding practice. That has added significantly to browser size and complexity.

By implementing a strict language in which there is no tolerance of lax markup syntax, and documents must be well-formed, browsers can be made slimmer and more efficient. The introduction of XML met those requirements, but still left a problem of legacy browsers. Experience has demonstrated that "it takes about two years for seventy-five percent of the net population to update to the most current browsers" (Beginning XHTML). 

Another factor is the introduction of non-browser display devices, called Web-enabled user agents: mobile phones, hand-held devices, Web-TV, and so on. Those devices simply don't have the processing capacity to cope with the baggage required to parse documents that are not well-formed.

XHTML provides a solution. It can be parsed by existing browsers, is portable between browsers (gets different browsers to show the same thing), and lends itself to the growing family of Web-enabled user agents.

Differences between HTML and XHTML

It is not so much a matter of learning new things as a matter of changing one's habits. Some elements are deprecated, which means their use is discouraged and future browsers may not recognise them. Their respective functions are handled by cascading style sheets (CSS).

Case sensitivity: all XHTML tag names and attribute names must be in lower case. 
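
As a rough illustration of the lower-case rule, the following Python sketch walks a small fragment and lists any element or attribute names that are not entirely lower case. The function name non_lowercase_names and the sample markup are invented for this example; it is a sketch of the idea, not a validator.

```python
import xml.etree.ElementTree as ET

def non_lowercase_names(markup):
    """Return element and attribute names that are not entirely lower case."""
    offenders = []
    for element in ET.fromstring(markup).iter():
        if element.tag != element.tag.lower():
            offenders.append(element.tag)
        for name in element.attrib:
            if name != name.lower():
                offenders.append(name)
    return offenders

html_style = '<BODY><P CLASS="intro">Hello</P></BODY>'
xhtml_style = '<body><p class="intro">Hello</p></body>'

print(non_lowercase_names(html_style))   # ['BODY', 'P', 'CLASS']
print(non_lowercase_names(xhtml_style))  # []
```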

All tags must be closed: each tag must be accompanied by its closing form, such as
       <p> ..... </p>
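
To see what "no tolerance" means in practice, here is a minimal Python sketch that hands two fragments to a strict XML parser from the standard library. The helper name is_well_formed and the sample paragraphs are assumptions made for the example; the point is only that the unclosed form is rejected outright rather than repaired.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

closed = "<div><p>First paragraph</p><p>Second paragraph</p></div>"
unclosed = "<div><p>First paragraph<p>Second paragraph</div>"

print(is_well_formed(closed))    # True
print(is_well_formed(unclosed))  # False
```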

Empty elements: an example of an empty element is <hr>, used to insert a horizontal line; it has no content and, in HTML, does not require a closing tag. In XHTML, however, such empty tags have to observe a special form, <hr/>, and for the benefit of older browsers a space should be inserted before the slash, as in <hr />.
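
The three ways of writing a horizontal rule can be compared with a strict parser. This is a small Python sketch; the helper name parses and the surrounding <div> wrapper are assumptions for the example only.

```python
import xml.etree.ElementTree as ET

def parses(markup):
    """Return True if the markup is accepted by a strict XML parser."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True

# Wrap each form in a <div> so the fragment has a single root element.
results = {form: parses(f"<div>{form}</div>") for form in ("<hr>", "<hr/>", "<hr />")}

print(results)  # {'<hr>': False, '<hr/>': True, '<hr />': True}
```

Both slash forms are equally acceptable to the parser; the space matters only to older HTML browsers.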

Attributes require values and quote marks: for example, the commonly used form
      <INPUT CHECKED>
has to be entered as
      <input checked="checked" />.
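
The attribute rule can be tried out in the same way. In this Python sketch (the helper name attributes_of and the checkbox examples are invented for illustration) a strict parser rejects an attribute with no value and an attribute value without quote marks, and accepts only the fully written form.

```python
import xml.etree.ElementTree as ET

def attributes_of(tag):
    """Return the attributes of a single tag, or None if a strict parser rejects it."""
    try:
        return ET.fromstring(tag).attrib
    except ET.ParseError:
        return None

# Attribute with no value: rejected.
print(attributes_of('<input type="checkbox" checked />'))            # None
# Attribute value without quote marks: rejected.
print(attributes_of('<input type=checkbox checked="checked" />'))    # None
# Every attribute has a quoted value: accepted.
print(attributes_of('<input type="checkbox" checked="checked" />'))
# {'type': 'checkbox', 'checked': 'checked'}
```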

Tags must be correctly nested: the correct sequence must be observed, as in:
    <tag1>
        <tag2>
            <tag3> . . . </tag3>
        </tag2>
    </tag1>
Things like
<H1> <I> <B>A Heading</H1> </I> </B> 
will return an error; apart from the tags not being in lower case, they are closed in the wrong order. It is essential to observe a last-in, first-out order:
<h1><i><b>A Heading</b></i></h1>
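
The last-in, first-out rule is simply a stack. The Python sketch below is a deliberately simplified checker, not a real parser: the regular expression, the function name first_nesting_error, and the messages it returns are all assumptions for the example, and it would be fooled by a ">" inside an attribute value. It pushes each opening tag and expects each closing tag to match whatever was opened most recently.

```python
import re

# Matches <name ...>, </name> and <name ... />; a simplification, not a full tokenizer.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*?(/?)>")

def first_nesting_error(markup):
    """Return a message for the first nesting error, or None if the tags nest properly."""
    stack = []
    for closing, name, self_closing in TAG.findall(markup):
        if self_closing:
            continue                # empty element such as <hr />: nothing to match
        if not closing:
            stack.append(name)      # opening tag: remember it
        elif not stack or stack.pop() != name:
            return f"unexpected </{name}>"
    return f"<{stack[-1]}> is never closed" if stack else None

print(first_nesting_error("<h1><i><b>A Heading</h1></i></b>"))  # unexpected </h1>
print(first_nesting_error("<h1><i><b>A Heading</b></i></h1>"))  # None
```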

DTD and namespace: the significance of these items will be explained in a future article; for the present just take my word that a DTD declaration has to appear at the top of a file. It will be something like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
and there must be something like this:
<html xmlns="http://www.w3.org/1999/xhtml">
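
Putting the pieces together, a complete page might look like the document embedded in the Python sketch below. The DOCTYPE and the namespace URI are those given in the XHTML 1.0 Recommendation for the Strict DTD; the page content and the variable names are invented for the example. Parsing it shows what the namespace declaration does: every element name becomes qualified by the XHTML namespace.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# A minimal page; the title and paragraph text are placeholders.
document = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>A minimal page</title></head>
  <body><p>Hello, XHTML.</p></body>
</html>"""

root = ET.fromstring(document)
title = root.find("x:head/x:title", {"x": XHTML_NS})

print(root.tag)    # {http://www.w3.org/1999/xhtml}html
print(title.text)  # A minimal page
```

The parser here checks only that the page is well-formed; it does not fetch the DTD or validate the page against it.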

Those simple rules remove an enormous processing burden. The reason for lower-case element and attribute names is that XML, unlike HTML, is case-sensitive, so XHTML had to settle on a single case.

Beginning XHTML

A number of XHTML titles are on the way; amongst the first to hit the shelves is Beginning XHTML from Wrox Press. Titles from this publisher are highly regarded by professional developers and programmers; the books are easily identified by a common livery: red covers with yellow type, and author photographs. Among XML-related texts, those released by Wrox are remarkable for their quality of content and depth of technical information. A web site provides support, additional information, errata notices, and downloadable code examples for their texts.

This title is in the Beginning series, which is designed to teach newcomers everything they "need to know from scratch, in a fast-paced tutorial fashion". If you are new to computing in general, and to writing HTML code in particular, this is not the place to begin. If, on the other hand, you are "comfortable with computing and you learn fast", then Beginning XHTML is an ideal learning tool. As the authors point out, "there's a lot packed into this book" and it will take you "a little deeper into some technologies you may not have met before".

There are three distinct parts: XHTML basics; XHTML web page design issues; and making web pages interactive with forms and scripting.

The first chapter is an excellent introduction to the Web; for anyone who provides presentations that introduce new users to the Web and the Internet it offers a good foundation for speaking and course notes.

The next chapter explains the transition from HTML to XHTML and why it is necessary. The book then moves on to a tutorial-style introduction to hands-on XHTML, explaining what all those esoteric terms (elements, links, lists, attributes, cascading style sheets, and the like) mean, and how they are used. It is very thorough, right down to teaching "some of the theory behind the XHTML language by looking at ... XML".

Chapters in the second part focus on design issues, the problem of coping with browser variations, and multimedia.

In the third part readers are introduced to the use of forms for obtaining user-input. There is a tutorial on using JavaScript, an explanation of the use of frames to create scripts that have multiple-page application, and an introduction to Mozquito.

Mozquito is a solution to the problem of writing code that will perform equally on all browsers. It uses markup, rather than scripting, in the Forms Markup Language (FML). There is not room here to describe how it works, but Beginning XHTML provides a very good introduction. Have a look at http://www.mozquito.com for more information and downloadable files.

Beginning XHTML is, assuming you have the capacity and will to maintain the pace, an essential text for anyone who wants to learn XHTML. It is a tutorial, reference, and comprehensive resource, and is the kind of book that will be used constantly by web authors.

Boumphrey et al.: Beginning XHTML
ISBN 1-861003-43-9
Published by Wrox, 733 pp.
RRP $83.95

Reprinted from the August 2000 issue of PC Update, the magazine of Melbourne PC User Group, Australia