XHTML CCIS1301 - Class 2

Class 2 - XHTML basics, validators, editors, and more

Clients and Users

Before we get too deep into this course, I would like to take a moment to remind people why we learn XHTML. Obviously, many of you are going to learn, in this course, how to properly create a valid XHTML web page. But why would you need to learn such a thing? Simple: there is demand for it. Whether that demand comes from you, personally, wanting to create a web page for yourself or your business, or from your boss requiring you to add to or update the corporate website, at some point, someone will require you to create XHTML.

The client is always right

In the fast-paced world of technology, people are often more focused on IF something can be done, and less focused on whether it SHOULD be done. Many times, people have not taken the time to think through the full ramifications of a request, and it is your job to make sure that what the client NEEDS, not what they WANT, is what you deliver.

This can lead to some tension between you and the client, but it is important to understand that your role as a developer, designer, consultant, or "web master" is to make sure the web site functions correctly and serves the people using the web site. Clients WILL come to you with ridiculous requests. It is your role as an employee to push back with solid, lucid arguments about whether a given request will be good for the client.

TIMTOWTDI - a great acronym which means "There Is More Than One Way To Do It". I'll come back to this over and over throughout the course. It's important to remember this concept because, many times, a client request doesn't need to be discarded; rather, alternate options (presumably better options) may need to be considered. Knowing different ways to solve a problem is critical in development.

HTML Comments

Comments are little notes that you can write to yourself and other developers in your XHTML code, and that will help you maintain and modify the code at a later date. This is particularly useful when you haven't worked on a project for a while. Comment notes should be something helpful like "This is where the left navigation begins" or "This is the end of the main content". Below is an example of an HTML comment:

<!-- This is an HTML comment -->

Note that comments start with an angle bracket, then an exclamation point, then two hyphens and a space. Next comes your comment text, then a space, two hyphens and a closing angle bracket. Comments do not have opening and closing tags and cannot be nested. So the code below is not valid...

↓ THE CODE BELOW IS WRONG ↓
<!-- This is an HTML comment 
	<!-- This is an HTML comment -->
--> 
↑ THE CODE ABOVE IS WRONG ↑
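
For contrast, here is a small sketch of how comments might be used correctly to mark where sections begin and end. The section names and the placeholder paragraphs are made up for illustration; the point is that each comment is opened and closed on its own, with no comment inside another.

```html
<!-- Left navigation begins -->
<div>
	<p>Navigation links go here</p>
</div>
<!-- Left navigation ends -->

<!-- Main content begins -->
<div>
	<p>Main content goes here</p>
</div>
<!-- Main content ends -->
```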

The XML Prologue

As we discussed in our first session, XHTML is an application of XML (that is, a markup language built on XML's rules). It has the same rules. That means we need to follow the rules of XML when we create a valid XHTML document, right? Well, yes and no. There are practical implications for following the rules "too well", and one of these is the XML prologue.

What is the prologue?

The XML prologue (formally called the XML declaration) is a single line of code at the VERY BEGINNING of your XHTML (or XML) document that defines what the document is. This prologue tells the application using the page what the content is. An example is below:

<?xml version="1.0" encoding="utf-8" ?>

What does that mean? It's telling the application that this document is XML, that it's using XML version 1.0, and that the character data will be encoded as UTF-8. The important thing here is the encoding type "utf-8". UTF-8 is an encoding of Unicode, the international character set, and covers all the letters and numbers we're familiar with, as well as Chinese, Japanese, Cyrillic, and characters from other languages. Other encoding types include:

Encoding Type	Definition
UTF-8	International (Unicode) character set
ISO-8859-1	Western European (Latin-1) character set
US-ASCII	ASCII-only character set

For a complete list of encodings, check out this URL: http://www.iana.org/assignments/character-sets

Why would we NOT use the prologue?

Simple: browser display / rendering compatibility. If you do a Google search for "quirks mode", you'll see all sorts of links relating to browser display problems. IE (Internet Explorer) was NOTORIOUS for being bad at displaying proper XHTML. So, with IE6, Microsoft started implementing a "Standards Compliant Mode" and a "Quirks Mode". That's right, they intentionally set up their browser to keep working WRONG for older pages! (yikes). So, the question is, how does the browser know whether to use "Standards" mode and display the page correctly, or to use "Quirks" mode? The answer: the DOCTYPE declaration. A page with a valid DOCTYPE at the very top usually renders according to the W3C specification; a page without one renders in Quirks mode. The catch is that the XML prologue has to come BEFORE the DOCTYPE, and IE6 drops into Quirks mode when anything, including the prologue, comes before the DOCTYPE. So it has been my experience that, as of today, to get a more consistent rendering, it's best to leave OUT the prologue.

IMPORTANT HOMEWORK NOTE!!!

Do not include the XML prologue in homework assignments!!!

The DOCTYPE declaration.

Like the XML prologue, the DOCTYPE declaration is used to send information to the browser about the page. And like the prologue, the DOCTYPE can trigger the Quirks / Standards mode nightmare. The purpose of the DOCTYPE, for the scope of our class, is to tell the browser what TYPE of XHTML we're using.

An example of a DOCTYPE declaration:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

This particular DOCTYPE identifies the page as using XHTML 1.0 Strict.

An additional similarity between the DOCTYPE and the prologue is that the DOCTYPE MUST be at the top of each page, as the first line. THIS IS REQUIRED FOR ALL HOMEWORK!!!

Valid DOCTYPES for our class

Each homework assignment must use one of the 3 DOCTYPES below...

XHTML1.1: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
XHTML1.0 Strict: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
XHTML1.0 Transitional: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

You may not use the HTML5 DOCTYPE

Homework submitted without one of the above DOCTYPEs will receive 0 points for validation (50% of the homework points).
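
To put the DOCTYPE in context, here is a minimal sketch of a complete page using the XHTML 1.0 Strict DOCTYPE from the list above. The title, the paragraph text, and the Content-Type meta tag are my own placeholders and assumptions, not a required homework template; note that the DOCTYPE is the first line and there is no XML prologue.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
	<!-- Assumption: the encoding is declared here since the prologue is left out -->
	<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
	<title>Placeholder page title</title>
</head>
<body>
	<p>Placeholder page content.</p>
</body>
</html>
```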

Meta Tags

Meta tags are special HTML tags that go between the opening and closing <head> tags. These tags are used in various ways to help other systems identify what your page is about.

There are several popular "groups" of meta tags that we need to cover:

Group 1: SEO Meta Tags

These meta tags are generally used by search engines to help categorize your web page content and include:

Description Meta Tag: <meta name="description" content="This is my web page summary here" />
Keyword Meta Tag: <meta name="keywords" content="keyword1, key phrase, any list of words and phrases separated by commas" />
Robots Meta Tag: <meta name="robots" content="index,follow" />
Revisit Meta Tag: <meta name="Revisit-After" content="3 days" />
Expires Meta Tag: <meta name="Expires" content="Tue, 01 Jun 1999 19:58:02 GMT" />

Note that all these tags are self-closing (from class 1).
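
As a rough sketch of where these tags live, the head section of a page might look like the following. The bakery name, summary text, and keywords are invented for illustration.

```html
<head>
	<title>Example Bakery - Fresh Bread Daily</title>
	<!-- Summary that a search engine may show under the page title -->
	<meta name="description" content="Example Bakery bakes fresh bread and pastries every morning." />
	<!-- Comma-separated words and phrases describing the page -->
	<meta name="keywords" content="bakery, fresh bread, pastries" />
	<!-- Allow the page to be indexed and its links to be followed -->
	<meta name="robots" content="index,follow" />
</head>
```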

Group 2: Dublin Core Meta Tags

The second "group" of meta tags are commonly referred to as the Dublin Core group. These meta tags can be found here: http://dublincore.org/documents/dcq-html/

Generally, Dublin Core meta tags are used to more specifically define content, author information and formatting information. Some are used by browsers to help interpret the rendering, and others are basically ignored and are useful only to other developers.
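
A short sketch of what Dublin Core meta tags can look like in a page head is below. It follows the "DC." naming convention from the Dublin Core document linked above; the title, author name, and date are placeholder values I made up.

```html
<head>
	<title>Class 2 Notes</title>
	<!-- Declares that names starting with "DC." refer to the Dublin Core element set -->
	<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
	<meta name="DC.title" content="Class 2 Notes" />
	<meta name="DC.creator" content="Jane Example" />
	<meta name="DC.date" content="2012-01-20" />
	<meta name="DC.language" content="en" />
</head>
```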

Group 3: Open Graph Meta Tags

These are used by third-party applications such as Facebook, embed.ly, and more. When you post a link in their systems, they read this meta data to pick the proper images and text to use in the posting. A good write-up on their usefulness is on Facebook here: Facebook Open Graph Meta Tags information.
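
Here is a small sketch of the basic Open Graph tags, using made-up example.com values. One caution, as far as I can tell: these tags use a "property" attribute, which is not defined in the DOCTYPEs we use in this class, so expect the validator to complain about them.

```html
<head>
	<title>Example Bakery</title>
	<!-- Title, type, address and preview image used when the link is shared -->
	<meta property="og:title" content="Example Bakery" />
	<meta property="og:type" content="website" />
	<meta property="og:url" content="http://www.example.com/" />
	<meta property="og:image" content="http://www.example.com/images/logo.png" />
</head>
```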

Start with the Data

Tables vs. CSS

Years ago (late 90s), web designers and programmers were limited by what a browser could do when rendering HTML. The adoption of CSS (Cascading Style Sheets) was in its infancy, and as such, it was nearly impossible for designers to take visually complex designs and layouts, and code XHTML to match the graphic layout exactly.

Since many of the designers came from the print industry, they quickly realized that there were a few tags in HTML that they could adapt to serve their needs - mainly the <table> tags. The reason they used the table tag was simple - it allowed you to lay out the page in a grid design (think Excel spreadsheet). So programmers happily started using this grid-based concept to create their web pages, using table tags all over the place. For a while it worked well, until there were so many tables on a web page that you could no longer figure out what the actual content was. This presented a HUGE challenge to web developers (maintenance became a nightmare) and search engines (as they couldn't extract the vital information from a web page very easily).

With the wider adoption of CSS, a new method was presented to designers to lay out pages - without using table tags. This is generally referred to as a "CSS Based Design", and has several distinct advantages over "Table Based Design". Those being: higher content to code ratios (that is, less code used to display the same text), smaller file sizes, easier redesigns, and better XHTML semantics.

The change from a Table based design / coding methodology to a CSS based design required a shift in thinking from the HTML programmer's perspective. With a Table-based design, the HTML programmer would look at the layout and design elements in terms of a "grid" or graph paper concept, then create tables to lay out the page. Whereas with a CSS based design, the programmer correctly looks at the layout from a data perspective, uses the correct tags for each content element, and uses CSS to position the page.

Example: Amazon.com

Looking at amazon.com, we can see a simple "grid" layout, and it would be easy to simply create a table based on this "grid" concept. However, we also notice that there are several sections in the page. Those sections include:

Site header (with logo, account information and more)
Left navigation (with a list of links)
Main content area (with features)
Suggested items (in the main content area), with blocks for each item
Footer area with more lists of links.

As you can see, we can identify several sections on the site (or blocks) which are logically related. Within each of these blocks, we can see other types of data (lists, images, headers), all of which have a designated XHTML element (which we will learn about in the rest of this course).
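
As a rough sketch, the sections listed above could be marked up as blocks like this, instead of as cells in a table. The id and class names are my own inventions for illustration (they are not Amazon's actual code), and the comments stand in for the real content.

```html
<div id="header">
	<!-- logo, account information and more -->
</div>
<div id="left_navigation">
	<!-- list of links -->
</div>
<div id="main_content">
	<!-- features -->
	<div id="suggested_items">
		<div class="item"><!-- one suggested item --></div>
		<div class="item"><!-- another suggested item --></div>
	</div>
</div>
<div id="footer">
	<!-- more lists of links -->
</div>
```

Notice that the markup describes WHAT each block is, not where it sits on the screen. Positioning is left to CSS.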

Our first XHTML tags

Today, we will start with about 10 XHTML tags in addition to the ones we looked at last week. They are:

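<address> - This tag is used for the contact information of the page's author or owner, which may include a physical street address (e.g. 100 Main St)
<blockquote> - This tag is used for a long quotation. Browsers usually display it as a block-indented section of text.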
<div> - This tag is a generic block level container. It is used to logically associate and encapsulate sections of a web page. Example: the header of a page will contain a logo, some account links, and the navigation. These sections would all be inside a "div" tag.
<p> - This tag is used for paragraph text
<br /> - This tag is used to create a hard break inside text.
<pre> - This tag is used for pre-formatted text. Spaces and line breaks inside it are displayed exactly as typed.
<h1> - This is one of a set of tags (h1, h2, h3, h4, h5 and h6), which are used for page and section headings.
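
To see these tags working together, here is a small sketch of a page body fragment. The bakery content is invented; the point is only to show each tag wrapped around the kind of data it is meant for.

```html
<div>
	<h1>Example Bakery</h1>
	<h2>About us</h2>
	<!-- A paragraph with a hard break in the middle -->
	<p>We bake fresh bread every morning.<br />
	Stop by before noon for the best selection.</p>
	<!-- A quotation; the text inside sits in its own paragraph -->
	<blockquote>
		<p>The best sourdough in town.</p>
	</blockquote>
	<!-- Spacing and line breaks are kept exactly as typed -->
	<pre>
Mon - Fri    7am - 6pm
Sat - Sun    8am - 2pm
	</pre>
	<!-- Contact information for the owner of the page -->
	<address>
		Example Bakery<br />
		100 Main St
	</address>
</div>
```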

Attributes

As mentioned last week, in addition to the element name, we can also specify additional information for our XHTML elements. The way we do this is with an attribute name=value pair. Example:

<tag attribute1="value" attribute2="value">Some Content</tag>

A real example might be:
<table summary="sales for quarter 4" style="width:500px;">[TABLE DATA HERE]</table>

In the real example, we are giving the table tag 2 attributes. The first, "summary", has a value of "sales for quarter 4"; the second, "style", has a value of "width:500px;".

IDs and Classes

As I mentioned earlier, we use divs to logically associate elements on a page; this gives us one method of identifying what content is where on the page. Another method is to give an XHTML element a specific attribute of either "id" or "class" or both. An example:

<p id="first_paragraph">Some Content</p>

The example above has an attribute name of "id" and an attribute value of "first_paragraph". Now consider the following...

<p id="first_paragraph" class="main_section_text">Some Content</p>

The example above has an attribute name of "id" and an attribute value of "first_paragraph" as well as another attribute name of "class" with its attribute value of "main_section_text". We will get into ids and classes more in the CSS section of this course (which is where they have a huge impact). For now, you need to know the following:

All id attribute values must be unique on the page. Meaning, for any page, if you have one tag with an id value of "first_id", there can be NO OTHER id with a value of "first_id".
Class names can be used to logically associate similar elements. You can have more than one tag with the same class value. Meaning, if you have a paragraph (<p>) with a class attribute value of "main_content_text", you can have other tags with the same class attribute value.
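
The two rules can be sketched side by side like this. The id values and paragraph text are made up; the class value is the one from the rule above. Each id appears exactly once, while the same class is shared by several tags.

```html
<div id="main_content">
	<!-- Every id value is used only once on the page -->
	<p id="first_paragraph" class="main_content_text">This is the first paragraph.</p>
	<p id="second_paragraph" class="main_content_text">This is the second paragraph.</p>
	<!-- A tag can have a class without having an id -->
	<p class="main_content_text">This paragraph shares the class but has no id.</p>
</div>
```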

Validators

There are many XHTML validation programs out there, but for the purpose of this class, we will be using one: http://validator.w3.org.

Validating using the W3 site is easy. Simply open the browser to that page (above), and select the validation method you want. For the most part, in this course you will be using "Validate by File Upload". Simply upload your XHTML file and check the validation. If you get a green bar at the top, you passed! Otherwise you'll need to correct the errors identified. Below are some common errors, what they mean, and how to correct them.

no document type declaration; will parse without validation

Cause: your file is missing a <!DOCTYPE> declaration.
Solution: Add a <!DOCTYPE> declaration.

general entity "XYZ" not defined and no default entity

Cause: XHTML requires every ampersand, including the ones between URL parameters, to be written as a character entity (we'll go over this later). So make sure any "&" is actually an "&amp;".
Solution: Change all "&" to "&amp;".
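
For instance, a link with two URL parameters might look like the sketch below. The page name and parameters are hypothetical. In the wrong version, the validator reads the text after the raw ampersand as an entity named "color", which is not defined.

```html
<!-- WRONG: the raw ampersand starts an undefined entity named "color" -->
<a href="products.php?type=bread&color=brown">Brown bread</a>

<!-- RIGHT: the ampersand is written as a character entity -->
<a href="products.php?type=bread&amp;color=brown">Brown bread</a>
```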

there is no attribute "SOME_VALUE"

Cause: You're using an attribute that's not allowed.
Solution: Change your attribute name or don't use the attribute.
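
One example I would expect to trigger this error under the XHTML 1.0 Strict DOCTYPE is the "target" attribute on a link, which that DOCTYPE does not define. This is a sketch of one case, not a complete list of disallowed attributes.

```html
<!-- WRONG under XHTML 1.0 Strict: "target" is not an allowed attribute here -->
<a href="http://www.w3.org/" target="_blank">W3C</a>

<!-- RIGHT: the attribute is removed -->
<a href="http://www.w3.org/">W3C</a>
```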

end tag for "p" omitted, but OMITTAG NO was specified

Cause: You opened a tag (in this case a <p> tag) but did not close the tag (i.e. missing </p>).
Solution: Make sure all your tags are closed, and properly nested.
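
A quick sketch of what the validator is complaining about, with placeholder text. The first block never closes its paragraph, the second closes its tags in the wrong order, and the third fixes both.

```html
<!-- WRONG: the first paragraph is never closed -->
<div>
	<p>First paragraph
	<p>Second paragraph</p>
</div>

<!-- WRONG: the div is closed before the paragraph inside it -->
<div><p>Some text</div></p>

<!-- RIGHT: every tag is closed, in reverse order of opening -->
<div>
	<p>First paragraph</p>
	<p>Second paragraph</p>
</div>
```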

XHTML editors

For the purpose of this class, you should be using a text-editor for creating your XHTML code. A list of suggested editors:

TextPad (Windows)
BBEdit (Mac)
bluefish (Linux)
Notepad++ (Windows)
Notepad (Windows)

The following editors SHOULD NOT BE USED in this course.

Microsoft Office (Word, Excel, etc)
Dreamweaver (Adobe)
Any What You See Is What You Get (WYSIWYG) editor.

8 simple rules of XHTML! - again!

All pages must have a DOCTYPE.
All pages must have 1 and only 1 html tag.
All pages must have 1 and only 1 head and title tag.
All pages must have 1 and only 1 body tag.
All tags must be in lowercase.
All tags must be closed (either with a closing tag or self closed).
All attributes must be in lowercase.
All tags must be properly nested (i.e. closed in reverse order of opening).

Ask questions.

Assignments

- Homework 1: Due on: 01/27/2012 at 11:59:59pm.
- Quiz 1: Due on: 02/03/2012 at 11:59:59pm.
- Lab 1 (optional)

Helpful Pages

Character Encodings - list of encoding types.
Valid DTDs - List of valid DOCTYPEs
Meta Tags - A guide to some of the most common meta tags
Dublin Core - All the dublin core meta tags and information about them.
Open Graph Meta Tags - From facebook
w3schools.com - Attributes
