Anatomy of an xHTML page

Up until this point, we have only dealt with the content of a page (the xHTML between the <body> and </body> tags) and its styling (through an external Cascading Style Sheet), but there is more to creating a web page than that. Each page also has what is called a <!DOCTYPE> declaration, <html> and </html> tags, <head> and </head> tags, and other tags that are nested between the <head> and </head> tags.

Why am I saving the first for last in the beginner’s tutorial series? Because many people are only interested in learning this stuff for their blogs or message boards, and these other things I’m about to tell you about are already included in the default (x)HTML of blog and message board software. Therefore, if you are only interested in customizing your blog and/or message board and not building a web page from scratch, you can ignore the rest of this beginner’s series.


When using xHTML, the first line in your document should look like this:

<?xml version="1.0" encoding="iso-8859-1"?>

This tells the browser which character encoding to use when translating your pages into something that’s readable by humans and speech synthesizers. The encoding you name here should match the one your editor actually uses when it saves the file.


Next comes the <!DOCTYPE> declaration. No, I’m not yelling at you by typing that in ALL CAPS. That is actually how it is written.

The doctype specifies what type of document you are creating, and there are three different doctypes in xHTML 1.0…

xHTML 1.0 Strict

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

If you look at the source code for practice1.htm, you’ll see that “strict” is what we’ve been using in the practice exercises. This is because I believe in learning things right the first time, without picking up bad habits along the way. It is also why I chose to teach xHTML while leaving HTML alone. HTML is too lenient and allows people to pick up a lot of bad habits, which would only make it harder for you to make the switch to xHTML once old-school HTML gets phased out completely.

If you find it too difficult to work with xHTML “strict,” you may decide to use…

xHTML 1.0 Transitional

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

The “transitional” doctype is more lenient because it allows you to use some deprecated (outdated) HTML tags and attributes, while still having the control that xHTML gives you.

Personally, I prefer “strict” when I’ll be writing all of the code for a page myself, but “transitional” can be useful when multiple people with less coding experience will be adding content to the site.
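
To make the difference concrete, here is a small sketch of the same content written both ways. The class name "notice" and the example.com address are made-up placeholders, not something taken from the practice files.

```html
<!-- Accepted by the Transitional DTD only: presentational tags and attributes -->
<p align="center"><font color="red">Sale ends Friday!</font></p>
<a href="http://www.example.com/" target="_blank">Example link</a>

<!-- Accepted by the Strict DTD: structure only, with the styling left to the stylesheet -->
<p class="notice">Sale ends Friday!</p>
<a href="http://www.example.com/">Example link</a>
```

In the strict version, the centering and the red color would move into your stylesheet, for example as a rule like .notice { text-align: center; color: red; }.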

xHTML 1.0 Frameset

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

I feel I would be negligent to omit this one, although I strongly recommend that you never use a frameset. Framesets are horrible for search engines and horrible for accessibility, and there is never a need to use one.

If you feel that a frameset is handy for adding navigation that will be the same on every page, you can do the same thing with “server-side includes.” With includes, you can update the content in one file, and it will be a part of every page where the include was inserted.
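
As a rough preview only, an include can be as short as the sketch below. It assumes an Apache-style web server with server-side includes switched on, and the file name /includes/navigation.html is a hypothetical one chosen for illustration. Your host may use a different setup.

```html
<!-- Assumption: the server processes this page for includes
     (on many hosts that means saving it with a .shtml extension) -->
<div id="navigation">
<!--#include virtual="/includes/navigation.html" -->
</div>
```

The server replaces the include line with the contents of the navigation file before the page is sent, so the browser only ever sees ordinary xHTML.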

I will explain how to use server-side includes in a future tutorial. For now, let’s continue with the anatomy of a web page.


The <html> tag starts the actual document. The xml and doctype declarations aren’t part of the document itself; rather, they define the encoding of the page and what type of document it is.

The <html> tag should include the xmlns attribute, which stands for “xml namespace”. It’s also good to include what direction (dir) the text is flowing, ltr meaning “left to right” and rtl meaning “right to left”, the latter of which is used for languages such as Hebrew or Arabic. Finally, the lang attribute tells browsers and screen readers what language the page is written in, which helps them display and pronounce the text correctly, especially if it’s something other than English.

A typical <html> tag that you can copy and paste into any xHTML web page that will be written in English:

<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en-US">

Other “lang” values you can use are:

  • en-GB
  • en-AU
  • es (Spanish)
  • fr (French)
  • it (Italian)
  • pt (Portuguese)
  • de (German)
  • ru (Russian)
  • ar (Arabic)
  • zh (Chinese)
  • he (Hebrew)
  • ja (Japanese)
  • ko (Korean)

Why differentiate between “en-US”, “en-GB”, and “en-AU”? Because some aural screen readers will actually read in an American, British, or Australian accent, depending on which one you specify. 🙂
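
For a right-to-left language, the same tag might look like the sketch below, which assumes a page written in Hebrew. The xml:lang attribute is an addition not covered above; the xHTML 1.0 compatibility guidelines suggest giving it alongside lang, with the same value.

```html
<html xmlns="http://www.w3.org/1999/xhtml" dir="rtl" lang="he" xml:lang="he">
```

For an Arabic page, you would swap both language values for "ar" and leave dir="rtl" as it is.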

The head (everything between the <head> and </head> tags) is where you put things such as the <title> tag, <meta> tags, links to your stylesheets and JavaScript, and anything else that isn’t a part of the actual content.

Putting it all together

This is what we have so far:

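<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en-US">
    <head>
            title, meta, and other non-content tags
    </head>
    <body>
            page content here, which you have already been working with
    </body>
</html>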
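
Filled in with placeholder content, the outline above might look like the sketch below. The title and paragraph text are made up for illustration, and the file would need to be saved in the iso-8859-1 encoding named on its first line.

```html
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en-US">
    <head>
        <!-- Placeholder title; a title is required inside the head -->
        <title>My first page</title>
    </head>
    <body>
        <!-- Placeholder content; the body needs at least one block-level tag -->
        <p>Hello, world!</p>
    </body>
</html>
```

You could paste a page like this into the W3C validator to check that every piece is in the right place before adding real content.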


Next, we’ll move on to the tags that go into the <head> section.