XHTML - CCIS1301

Class 1 - goals, expectations, and introductions to XHTML.

Goals

From the college: This HTML course will introduce students to the basics of XHTML (the web markup language) and prepare them for more advanced studies. Students will learn XHTML from the ground up, beginning with solid HTML concepts. Standards-based instruction will stress designing for backward and forward compatibility, usability, and accessibility. Students will develop and publish Web pages that include XHTML techniques while using tables, frames, and forms.

What that means: I'm going to present materials to you to help you learn the basics of XHTML, and how to create simple web pages. We're not going to get into really advanced stuff (like working with databases, etc). The goal here is to get a base understanding of how to create a web page, to prepare you for later classes (which will rely on this information).

My Goals: I try not to take things too seriously, and I want this to be fun. While there's a lot to read through in this course, please note that I've added some links on the right that will help you play around with HTML. Also, for those of you that like videos, for each lecture / session, I'm going to try to find 3 - 5 videos that I think help explain, in detail, the topics we're covering. At the end of this course, if you get an "A", you should be able to set up your own web site without too much difficulty.

Expectations

Computer Knowledge: You should have a basic understanding of how to use your computer and its operating system (Windows, OSX, Linux, etc) as well as the following:

If you do not feel comfortable with the list above, you may not be ready for the demands of this course.

Online / Hybrid courses: Some of you are taking this class as a hybrid class (i.e. part online, part in class lecture), while others are taking this course 100% online. Both courses will have new material posted every other week, and both will have the same due dates for homework, labs, quizzes and the final project.

Homework: All homework, labs, and quizzes must be turned in to D2L on time. If you are unable to complete an assignment on time, contact me BEFORE IT IS DUE. Any assignment not turned in on time will receive 0 points unless a previous arrangement is made PRIOR to the due date.

All homework will be graded according to these requirements:

IMPORTANT!!! Validation is an all-or-nothing deal. Your page will either validate or it won't. I will be validating with the W3C validator (http://validator.w3.org).

Extra Credit: Extra credit (5 points) is available for every homework assignment. To receive the extra credit, the assignment must be turned in at least 1 week prior to the due date, and you must state in D2L that you want the assignment to be graded with the Extra Credit option. NOTE!!! I will only grade a homework assignment once!

Labs: All labs are optional. The labs may serve one of two purposes: 1) Provide you with research and understanding of a concept or 2) Provide you with hands-on experience with a topic.

Quizzes: All quizzes are worth 20 points and usually consist of a question / answer portion (i.e. multiple choice, fill-in-the-blank, etc) and a functional usage portion (usually correcting invalid XHTML / CSS code).

Grades / Assignments: Information on total points and the grading scale is available on the homepage.

Introduction

History of the Internet

While there are varying stories about how the internet evolved into its current structure, there are some general concepts that you should know. The "internet" as we know it today is often described as having been created as a method of ensuring communications between computer systems in the event of a significant disruption (such as one caused by a nuclear war). The primary force driving its creation was a US Department of Defense agency, ARPA (later renamed DARPA), which created the first network of interconnected computers, known as "ARPANET". From these humble origins, the communication mediums, protocols and methods were established and evolved into today's internet, which the "world wide web" runs on top of.

Once the basic structure was in place, a common method of formatting data was required to easily transmit text back and forth. One answer to this need was SGML (Standard Generalized Markup Language), which was basically a very specific way to format data such that it could be easily transmitted and read between computer systems.

With the advent of HTTP (Hypertext Transfer Protocol), newer, more efficient methods of formatting the data became available through the creation of HTML (Hypertext Markup Language). The formats of SGML and HTML were similar - that is, both contained a set of keywords to allow the reading computer to evaluate the data. However, HTML was specifically designed for use between computer applications using an HTTP connection (i.e. a web browser). Today, the web browser is the primary way we exchange data over the internet, using an HTTP connection to carry the data and HTML to format it.

What's in a name?

Ever wonder what happens when you visit a web site? While the process is quite complex, it can be boiled down into a few simple concepts. First, we need to understand some terminology:

Domain Name:
A user-friendly, human-readable letter / number combination used to identify a specific web site or server. For example: aphion.com, facebook.com, aol.com, cnn.com.
IP Address:
A numerical system used to identify a specific server on the internet.
DNS (Domain Name System):
A system of computers (domain name servers) that maps a domain name (aphion.com) to an IP address (for example, 192.0.2.1).

Now that we know some terminology, understanding what happens when you type a domain name into your web browser is fairly simple. The basics are: your web browser sends a DNS query containing the domain name you entered (aphion.com) to a domain name server. The domain name server sends back the IP address for that domain name, and your browser then requests the HTML data directly from the web server at that address using HTTP.
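
The lookup step can be tried from a script. This is a minimal Python sketch, not part of the course materials: it uses the standard library's socket.gethostbyname, and the helper name lookup_ip is made up for illustration. It looks up "localhost" (a special name that points back at your own computer) so that it runs without an internet connection. When you are online, try a real domain name such as aphion.com instead.

```python
import socket


def lookup_ip(domain_name):
    """Ask the operating system's resolver for the IPv4 address of a name.

    For names out on the internet, the resolver gets its answer from DNS.
    """
    return socket.gethostbyname(domain_name)


# "localhost" is answered by your own computer, so no network is needed.
print(lookup_ip("localhost"))
```

Once the browser has the address, the name lookup is finished and everything after that is an HTTP conversation with the web server.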

Note: It is sometimes possible to directly access a web site using its IP address, if you know it, although a site that shares its server with other sites may not respond the way you expect.

Anatomy of a URL

Note: URL = Uniform Resource Locator

Consider the following URL: http://www.aphion.com/class/class1/index.php?name=something

This URL has the following parts:
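
One way to see the parts is to let a program split the URL for you. This is a minimal Python sketch using the standard library's urllib.parse module. The part names it prints (scheme, hostname, path, query) are Python's names and are an assumption on my side; they may not match the exact wording used in class.

```python
from urllib.parse import parse_qs, urlsplit

url = "http://www.aphion.com/class/class1/index.php?name=something"
parts = urlsplit(url)

print(parts.scheme)    # http  (the protocol)
print(parts.hostname)  # www.aphion.com  (the domain name)
print(parts.path)      # /class/class1/index.php  (the file on the server)
print(parts.query)     # name=something  (extra data sent to the page)

# The query string can be broken down further into name / value pairs.
print(parse_qs(parts.query))  # {'name': ['something']}
```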

In addition to URLs, there are also URIs (Uniform Resource Identifiers). The two terms are often used interchangeably, but there is a difference. A URI is any string that identifies a resource. A URL is a URI that also tells you where the resource lives and how to get it (for example, over http). So every URL is a URI, but a URI that is used only as a name does not require anything to actually be there.

Browsers

What is a Web Browser?

Put simply, a web browser is a computer application that helps users (you) view web pages. A typical web browser will allow you to type in a domain name (aphion.com) and will query the server for HTML data, and then take that HTML, parse the information, and display a web page on your screen.

Common Web Browsers:

While there are literally hundreds of web browsers out there, most of them fall into three general categories, which are important to know from a web design / development perspective. Each browser has something called a "rendering engine", which is the code that determines how the browser displays the HTML content. You'd think that these would be consistent with each other, but they are not. When you take the Advanced XHTML class, they may cover some of the subtle differences here. The rendering engines are:

Why do you need to know about rendering engines? Because when you start testing your designs out in the real world, you'll want them to look similar, if not identical, in all browser / operating system combinations. However, since there are literally thousands of such combinations, it's not practical to test each one. Knowing that browsers built on the same rendering engine share the same quirks allows you to test once or twice in each engine and be reasonably sure the design will work everywhere.

Connecting to the Internet

While it's certainly easy to forget for many who live in urban / suburban areas, it's worth noting that fast connectivity to the internet is not a given. Many rural areas in our country still have dial-up as their only connection option.

The speed at which a web page loads is directly related to two factors: a) the connection speed of your internet connection and b) the rendering speed of your browser. Consider the following:

A user visits a website and wants to download a 50 MB Excel file. Since there are 8 bits in each byte, the total number of bits in that file is 50 x 1024 x 1024 x 8 = 419,430,400 bits. If you are on a standard 56k dial-up connection, that means you're downloading at 56,000 bits per second. 419,430,400 / 56,000 = about 7,490 seconds, or about 125 minutes, or just over 2 hours.

Now, take that same file, but download it on a campus connection that gets 20Mb / second (or 20,000,000 bits / second). That gives us 419,430,400 / 20,000,000 = about 21 seconds. Quite a difference.
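
If you want to check the arithmetic yourself, here is a small Python sketch of the same calculation. The function name download_seconds is made up for illustration, and the result is a best case: it ignores network overhead and assumes the connection runs at full speed the whole time.

```python
def download_seconds(size_megabytes, bits_per_second):
    """Best-case download time: total bits divided by connection speed."""
    total_bits = size_megabytes * 1024 * 1024 * 8  # MB -> bytes -> bits
    return total_bits / bits_per_second


dialup = download_seconds(50, 56_000)      # 56k dial-up
campus = download_seconds(50, 20_000_000)  # 20 Mb / second campus line

print(round(dialup))            # 7490 (seconds)
print(round(dialup / 3600, 2))  # 2.08 (hours)
print(round(campus))            # 21 (seconds)
```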

Why is this important? Because people often assume that visitors to their website will be on the fastest internet connection possible. However, as much as 25% of US households are on a dial-up connection (56k or slower). A site with 200k of information (HTML, images, etc) will load in a few seconds for people with high-speed internet (such as cable modem or DSL); however, people on a 28.8k connection may wait about a minute. Market research shows that people lose interest in a page if it doesn't load within 10 seconds. All sites should be designed to load in 10 seconds or less on a 56k modem connection.

The table below shows some common internet connection speeds:

Connection Type          Speed
14.4 Dialup              14,400 bits / second
28.8 Dialup              28,800 bits / second
56k Dialup               56,000 bits / second
Standard DSL             768,000 bits / second
High Speed DSL & T1      1,500,000 bits / second
Cable Modem (slow)       5,000,000 bits / second
Cable Modem (average)    20,000,000 bits / second
Cable Modem (high)       60,000,000 bits / second
OC-48                    2,488,320,000 bits / second (about 2.5 Gbps)
OC-96                    4,976,640,000 bits / second (about 5 Gbps)
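
To connect these speeds to the 10 second guideline, here is a rough Python sketch that estimates how long the 200k example page takes on a few of the connections listed, and how big a page can be if it has to load in 10 seconds on a 56k modem. The function names are made up for illustration, 200k is taken to mean 200 x 1024 bytes, and the numbers are best-case estimates that ignore network overhead and browser rendering time.

```python
PAGE_BYTES = 200 * 1024  # the 200k example page, assuming 1k = 1024 bytes

# A few of the connection speeds from the table, in bits per second.
SPEEDS = {
    "14.4 Dialup": 14_400,
    "28.8 Dialup": 28_800,
    "56k Dialup": 56_000,
    "Standard DSL": 768_000,
    "Cable Modem (average)": 20_000_000,
}


def load_seconds(size_bytes, bits_per_second):
    """Best-case time to transfer size_bytes at the given speed."""
    return size_bytes * 8 / bits_per_second


def max_page_bytes(seconds, bits_per_second):
    """Largest page (in bytes) that can transfer within the time limit."""
    return seconds * bits_per_second / 8


for name, speed in SPEEDS.items():
    print(f"{name:<24}{load_seconds(PAGE_BYTES, speed):8.1f} seconds")

budget = max_page_bytes(10, 56_000)
print(f"10 second budget on 56k: {budget:,.0f} bytes")
```

By this estimate, the 10 second rule leaves room for only about 70,000 bytes of HTML and images combined on a 56k connection.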

Note: There are faster internet connections available, but it is still faster to transport very large amounts of data (petabytes) by shipping hard drives through the mail than by sending the data over the internet!

Search Engines

Search engines are special sites that were created specifically to help users find information on the internet. They typically work by using a "bot" (a script that is designed to read web pages and store the relevant information) to "spider" web sites (a process whereby the bot follows all the links on a given web page), and then consolidating that information into an easy to use format for end users (you).

Popular search engines include:

It is important to note that search engines make every effort to detect and determine the best matches for a given query. Many go to great effort to ensure web developers do not "game" the system by creating bogus links and content. That said, there are some tricks that you can use to change what is displayed on the search results page - which we will discuss in later classes - that are based on how you create and use HTML on your page.

Other information

FTP - File Transfer Protocol

FTP is used to transfer files from one computer to another. Sending data from a local computer to a remote computer is called "uploading"; sending data from a remote computer to a local computer is called "downloading". FTP is a long-standing standard for transferring data and files between computers, and there are many applications out there to help you use it. However, plain FTP sends everything (including your password) unencrypted, so it is strongly recommended that you use SFTP (SSH File Transfer Protocol) or SCP (Secure Copy) to send data encrypted across the network.

Ping and tracert

The ping and tracert commands are network utilities that help you identify and troubleshoot network connectivity problems. (tracert is the Windows name; on OSX and Linux the same tool is called traceroute.)

SSH - Secure Shell

SSH, or Secure Shell, is a method of safely connecting to a remote server via a secure, encrypted connection.

Viruses, Hackers and worms - oh my!

Viruses and worms are computer programs that are designed to perform a function not intended by the user. While generally considered to be "bad", the intent of these isn't always to cause damage. Sure, a virus can be engineered to crash a hard drive or erase data, but more often than not, viruses are designed to capture sensitive information (usernames and passwords, or credit card information) and send it to a remote server for later use. Some viruses are simply scripts sitting on a computer, waiting to be activated by a single master server to attack another server (a Distributed Denial of Service, or DDoS, attack).

Yes, you don't want viruses on your computer, and yes, you should protect your computer with current anti-virus software.

Hacker vs. Cracker

The term "hacker" technically refers to anyone who wants to figure out how a system works. These people bear no malice towards anyone; they just want to see if they can get past security, break a password, or find a vulnerability in a system. Their intent is usually educational in nature (either for themselves or their victims) and typically does not involve an intent to destroy or cause harm. A "cracker", on the other hand, is a hacker with the intent to cause harm (take down a site, steal information, etc). The media often incorrectly calls "crackers" "hackers", and knowing the difference, while semantic, is important.

XHTML vs HTML

Taking a ride in the way-back-machine, we go back to the 1990s. It was a prosperous time. Companies were sending data back and forth across networks using .csv files and proprietary data formats, and the internet was in its infancy. AOL was the big "ISP" (internet service provider) of the day, and HTML was just starting to be used to format content for users. Netscape Gold was THE browser to have (and you had to pay $19.95 for it too!). Ah, those were the days.

But not all was rosy! Too many companies were placing their data in too many different formats - comma separated, tab delimited, pipe delimited, dbase, and so much more. While sending and receiving data was easy, creating and parsing it was becoming more and more difficult. Then, a group of developers decided to take another look at SGML. They liked some of what they saw, but wanted something more flexible and extensible. So they created something called "XML" (eXtensible Markup Language). Finally, there was a way to define AND send data at the same time. And, since you could create your own format, customizing it was easy, so long as you followed a few simple rules.

Web developers were also suffering from limitations in the definition of HTML (HTML was so loosely defined that creating a parsing engine got to be a cumbersome task), so they took the rules of XML and applied them to the tags used in HTML, and Frankenstein's monster - XHTML - was born! (crash of thunder goes here!).

This new language gave web developers a standard set of rules to work with, borrowed from XML, with a tag set already familiar to them from HTML.

What is an HTML / XHTML tag (element)?

An XHTML tag (or element, I'll use the terms interchangeably) is simply the element name surrounded by an opening angle bracket (<) and a closing angle bracket (>). For example:

<table> is used to create a table on a web page.

There are about 75 elements in XHTML, of which an average web developer will only need to know about 30 - 40. That's it. This whole course is centered around you learning about 40 tags. Pretty slick, eh?

Now, there are some rules we need to follow, and some special tags. The first 4 tags we'll introduce are:

Opening and closing XHTML tags

Each element in XHTML must have an opening tag (<strong>) and a closing tag (</strong>). Notice the slash at the beginning of the closing tag. The content you want displayed goes between the opening and closing tags.

So:
<strong>This is bold text</strong> will look like this:
This is bold text.

Tags must also be closed in the reverse order they are opened.

Correct:
<strong><em>Italic Font</em></strong>

Incorrect:
<strong><em>Italic Font</strong></em>

Finally, a few tags are what we call "self closing". An example of a self closing tag is this: <br />
Notice, this isn't <br></br>
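
Because XHTML follows the rules of XML, any XML parser can tell you whether your tags are closed and nested properly. This is a minimal Python sketch using the standard library's xml.etree.ElementTree module. The function name is_well_formed is made up for illustration, and this is only a quick check of closing and nesting, not a replacement for the W3C validator.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if every tag in the fragment is closed and properly nested."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Closed in the reverse order they were opened: fine.
print(is_well_formed("<strong><em>Italic Font</em></strong>"))  # True

# Closed in the wrong order: rejected.
print(is_well_formed("<strong><em>Italic Font</strong></em>"))  # False

# A self closing tag is fine; the same tag left open is not.
print(is_well_formed("<p>line one<br />line two</p>"))  # True
print(is_well_formed("<p>line one<br>line two</p>"))    # False
```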

Attributes

Our last topic this session will be attributes. Attributes are additional information you can have in a tag. Consider the following example:

<tag attribute1="value1" attribute2="value2">Some Content</tag>

A real example might be:
<table summary="sales for quarter 4" style="width:500px;">[TABLE DATA HERE]</table>

All this leads us to...

8 simple rules of XHTML!

  1. All pages must have a DOCTYPE.
  2. All pages must have 1 and only 1 html tag.
  3. All pages must have 1 and only 1 head and title tag.
  4. All pages must have 1 and only 1 body tag.
  5. All tags must be in lowercase.
  6. All tags must be closed (either with a closing tag or self closed).
  7. All attributes must be in lowercase.
  8. All tags must be properly nested (i.e. closed in reverse order of opening).
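
To see the rules working together, here is a small page that should satisfy all eight, wrapped in a rough Python sketch that checks each rule. The page content, the choice of the XHTML 1.0 Strict DOCTYPE, and the names check_rules and local_name are my own assumptions for illustration. The check is far less thorough than the W3C validator, which is still what your homework is graded against.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/1999/xhtml}"  # the namespace XHTML elements live in

PAGE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>My first page</title>
  </head>
  <body>
    <p><strong>Hello</strong> world!<br /></p>
  </body>
</html>
"""


def local_name(qualified):
    """Strip the "{namespace}" prefix that ElementTree puts in front of names."""
    return qualified.split("}")[-1]


def check_rules(source):
    """Return a list of broken rules (an empty list means none were found)."""
    problems = []
    if not source.lstrip().startswith("<!DOCTYPE"):
        problems.append("rule 1: no DOCTYPE")
    try:
        root = ET.fromstring(source)
    except ET.ParseError as error:
        # The parser stops at the first unclosed or badly nested tag.
        problems.append(f"not well-formed, e.g. rules 6 and 8 ({error})")
        return problems
    if local_name(root.tag) != "html":
        problems.append("rule 2: the outermost tag is not html")
    one_head = len(root.findall(f"{NS}head")) == 1
    one_title = len(root.findall(f"{NS}head/{NS}title")) == 1
    if not (one_head and one_title):
        problems.append("rule 3: need exactly one head and one title")
    if len(root.findall(f"{NS}body")) != 1:
        problems.append("rule 4: need exactly one body")
    for element in root.iter():
        names = [element.tag, *element.attrib]  # tag name plus attribute names
        if any(local_name(n) != local_name(n).lower() for n in names):
            problems.append(f"rules 5 and 7: uppercase in <{local_name(element.tag)}>")
    return problems


print(check_rules(PAGE))  # []
```

Try breaking the page on purpose (delete a closing tag, or remove the title) and run the check again to see which rule it reports.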

Suggested Videos

Note: These are just some videos I found on youtube that I feel are related to the topics covered here.



Homework / Labs / Quizzes

No homework or labs