HyperText Markup Language (HTML) Basics

This document covers the basics of the HTML language used by Mosaic. The Internet offers more extensive descriptions of the language: HTML Primer from NCSA, and the HTML Reference from CERN.

HTML documents are stored as plain text files -- like those created by the Microsoft Notepad editor. Information in the HTML file is read by the Mosaic program directly from a file or it is sent to Mosaic by a Web server. Mosaic then formats the document on your screen using the particular font and window size preferences you have.

A very basic HTML document looks something like this:

    <title>Document Title Goes Here</title>
    <h1>A Level 1 Heading</h1>
    This is some sample HTML. 
    The paragraph will now end.
    <P>The next paragraph has started.

Notice that you instruct Mosaic how to format the text via the HTML commands enclosed between the < and > characters. These commands are called tags. Mosaic and other Web browsing programs use a wide variety of tags to format and structure documents. When a browser encounters a tag it doesn't understand, it simply skips over it. Tags are case-insensitive: the tags <title>, <Title>, and <TITLE> are equivalent.

With a few exceptions, each tag has two forms which "surround" the text being formatted. In the example above note how the text after the <title> tag is demarcated by </title>. The first tag is called the "opening" tag, while the same tag prefixed by a "/" character is called the "closing" tag.

Document Structure Tags

There are 4 basic document structure tags: title, headings, paragraphs, and pre-formatted text.

Each document should have a title. Placing the title near the top of the HTML document is also good practice. Enclose the document title between the <title> and </title> tags.

The HTML language has 6 levels of headings defined, so you can create hierarchical documents. Each level translates to a particular text rendering (font size, bolding, italics). Level 1 headings tend to have the most prominent rendering, with subsequent levels getting smaller or less prominent. Enclose your heading text between the heading tags (for example, <h1>Heading Text</h1>, <h2>Subheading</h2>, and so on).
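As a minimal sketch of how heading levels build a hierarchy, the fragment below outlines a small document. The title and heading text are invented for illustration, and the exact font used for each level depends on the browser and your preferences.

```html
<title>Field Guide</title>
<!-- Level 1: the top-level heading for the whole document -->
<h1>Field Guide</h1>
Introductory text for the whole guide.
<!-- Level 2: a major section -->
<h2>Birds</h2>
Text for the first major section.
<!-- Level 3: a subsection of the section above -->
<h3>Songbirds</h3>
Text for a subsection of Birds.
<!-- Back to level 2 for the next major section -->
<h2>Mammals</h2>
Text for the second major section.
```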

Paragraphs are one of the exceptions to the "open/close" tag rule. Insert the paragraph tag <p> any time you wish to break to a new paragraph.

Pre-formatted text is used when you have an existing text file or other text element and you don't want Mosaic to format it. Enclosing the text with the <pre> and </pre> tags will force Mosaic to preserve whatever formatting you have in the file. A fixed pitch font such as Courier is often used for this purpose.
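As an illustrative sketch, the fragment below uses pre-formatted text to keep a small table aligned in columns. The table contents are made up; the point is that the spaces and line breaks between the tags are expected to appear exactly as typed.

```html
<!-- Spacing and line breaks inside pre are preserved as typed -->
<pre>
Item        Qty   Price
Pencils      12    1.20
Notebooks     3    4.50
</pre>
```

Without the <pre> and </pre> tags, the browser would normally run these three lines together into a single paragraph.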

List Tags

Lists are another form of document structuring. There are three basic types of lists: unordered, ordered, and description.

An unordered list will produce a bullet for each list item. List items, like paragraphs, use a single tag, <li>, to denote the start of a new item. For example:

    <ul>
      <li> Earth
      <li> Water
      <li> Fire
    </ul>

is rendered by Mosaic as:

  * Earth
  * Water
  * Fire

Unordered lists are enclosed by the <ul> and </ul> tags, while ordered lists are enclosed by the <ol> and </ol> tags. If you need multiple levels of lists, simply start a new list inside an existing one. Mosaic will indent and format it appropriately.
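As a minimal sketch of nesting, the fragment below places an unordered list inside an ordered list. The item text is invented for illustration. The outer items should be numbered and the inner items bulleted, though the exact indentation and bullet style depend on the browser.

```html
<ol>
  <li> Gather materials
    <!-- A new list started inside an existing list item -->
    <ul>
      <li> Paper
      <li> Pencils
    </ul>
  <li> Sketch the outline
  <li> Fill in the details
</ol>
```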

Description lists consist of alternating titles and text. An example use of this list style is a glossary. As with other lists, tags designate each new item, but description lists use two of them: <dt> introduces a description title and <dd> starts a description section. The entire list is enclosed by the <dl> and </dl> tags. Here is a sample description list:

   <dl>
      <dt>Earth
      <dd>The land, the soil.  What we stand on.
      <dt>Water
      <dd>The oceans, the lakes and streams.
      <dt>Fire
      <dd>Bright and hot.  Ideal for cooking food.
   </dl>

and this is how Mosaic formats it:

Earth
    The land, the soil. What we stand on.
Water
    The oceans, the lakes and streams.
Fire
    Bright and hot. Ideal for cooking food.

Text Emphasis Tags

A variety of text emphasis tags are available. These are commonly used within paragraphs to bring attention to some passage. The most common emphasis tags are <b>bold</b>, <i>italics</i>, and <tt>fixed text</tt>.
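As a short sketch, the paragraph below uses each of the three emphasis tags once. The sentence and the file name readme.txt are invented for illustration.

```html
<!-- b marks a key name, tt a file name, and i a term being introduced -->
<p>Press <b>Enter</b> to continue. The file is named
<tt>readme.txt</tt>, and the manual calls this step
<i>initialization</i>.
```

Unlike <p> and <li>, each emphasis tag needs its closing tag so the browser knows where the emphasis ends.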

Hypertext Links - linking to other documents

One of the most powerful features of Mosaic and the World Wide Web is the ability to link your document to other documents in the Web. These hypertext links (also known as hyperlinks, links, or anchors) are normally highlighted by Mosaic in a color different from that of normal text. When the user of the Mosaic program clicks on the link, Mosaic will retrieve and display the document associated with the link.

The link consists of 3 parts: the opening anchor tag, the phrase to be highlighted, and the closing anchor tag. Here is a sample hypertext link:

    <a href="pw_init.htm">Initial Page</a>

This will highlight the phrase "Initial Page". When the user clicks on it, the pw_init.htm file will be read in and displayed.

The "href" portion of the anchor is a Uniform Resource Locator (URL), so it can point to local files, remote Web servers, FTP archives, and more. The syntax for a URL is:

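    scheme://host.domain[:port]/path/filename

where scheme names the access method: for example, file for a local file, http for a remote Web server, or ftp for an FTP archive.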

The host and domain parts are used when you use the http or ftp schemes, and refer to the Internet host name where the information resides. The use of port numbers is infrequent, so don't specify one unless someone explicitly instructs you to do so.

A common and recommended practice is to use what are called relative URLs whenever possible. Relative URLs are much shorter than a complete (or absolute) URL. A relative URL refers to another document that is stored in the same place that the current document was retrieved from. Relative URLs have many advantages; the primary one is that you can move a set of documents to a new location without having to modify all the anchors in the HTML files.

Some sample URLs:
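As an illustrative sketch, the anchors below show absolute URLs using the http and ftp schemes, one with an explicit port number, followed by a relative URL. The host names www.example.com and ftp.example.com, the port 8080, and the paths are hypothetical placeholders, not real documents.

```html
<!-- Absolute URL: a document on a remote Web server (hypothetical host) -->
<a href="http://www.example.com/docs/intro.htm">Introduction</a>

<!-- Absolute URL with an explicit port number (hypothetical port) -->
<a href="http://www.example.com:8080/docs/intro.htm">Introduction (port 8080)</a>

<!-- Absolute URL: a file in an FTP archive (hypothetical host) -->
<a href="ftp://ftp.example.com/pub/readme.txt">Read Me</a>

<!-- Relative URL: a file stored in the same place as the current document -->
<a href="pw_init.htm">Initial Page</a>
```

If the whole set of documents moves to a new server, only the absolute URLs need editing; the relative one keeps working as long as pw_init.htm moves along with the document that links to it.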

Inline Images

Another special feature of Mosaic is the ability to include GIF images inside hypertext documents. The IMG tag is used to do this:

    <img src="logo.gif">

The text in quotes after the src= is a URL, so you can reference GIF files that reside anywhere in the Web to be included in your document.
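As a minimal sketch, the fragment below shows an image referenced by a relative URL, the same image referenced by an absolute URL, and an image placed between anchor tags so that clicking it follows a link. The host name www.example.com and its path are hypothetical, and how a linked image is highlighted depends on the browser.

```html
<!-- Image stored in the same place as the current document -->
<img src="logo.gif">

<!-- Image fetched from another Web server (hypothetical host and path) -->
<img src="http://www.example.com/images/logo.gif">

<!-- Image used in place of the highlighted phrase of a hypertext link -->
<a href="pw_init.htm"><img src="logo.gif"></a>
```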

Special Characters

HTML places special significance on three characters in the ASCII character set: the left angle bracket (<), the right angle bracket (>), and the ampersand (&). If you include them in an HTML file by themselves, you may not get the results you were expecting.

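To use these characters in your HTML text, write the corresponding escape sequence instead; the browser recognizes it and substitutes the appropriate character when it formats the document. The sequences are &lt; for the left angle bracket, &gt; for the right angle bracket, and &amp; for the ampersand.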

There are many other special characters that have codes in HTML. A few are: á (&aacute;), ñ (&ntilde;), ç (&ccedil;) and ö (&ouml;).
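As a short sketch, the fragment below uses the standard HTML escape sequences &lt;, &gt;, and &amp; for the three reserved characters, plus one of the accented-character codes listed above. The sample sentences are invented for illustration.

```html
<!-- The two angle brackets appear as literal text, not as a tag -->
To start a tag, type &lt; and end it with &gt;.
<p>
<!-- An ampersand appears between the two words -->
Research &amp; Development
<p>
<!-- The letter n with a tilde appears in the second word -->
El Ni&ntilde;o
```

Each sequence begins with an ampersand and ends with a semicolon; leaving off the semicolon may produce unexpected results in some browsers.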