This is a summary write-up of material and concepts covered in Day 2 of the Spiders at Work Web Camp. On day 2, we talked about more advanced HTML.
On day 1, we talked about the structure of an HTML document, and a handful of basic tags for presenting basic markup. These included:
<p>
<em> / <i>
<strong> / <b>
<font> (size, face, color)
<br>
<h1> ... <h6>
<ul> and <ol>
<img>
<a>
Please be sure you are comfortable with the use of these tags before continuing.
You will soon discover that as you write your content, your document will grow vertically, as you would expect. And then, it's only a matter of time before you start wondering about doing layout; for example, putting things next to each other. What if you want to put a paragraph next to an image? You might first try something like this:
<img src="my_image.gif" alt="My image"><p>Here is a paragraph describing this wonderful image. It truly is a wonderful image, don't you think? I sure think so.</p>
But this gives you:
Here is a paragraph describing this wonderful image. It truly is a wonderful image, don't you think? I sure think so.
...which probably isn't what you wanted. Maybe then we try putting the image at the beginning of the paragraph, as in:
<p><img src="my_image.gif" alt="My image">Here is a paragraph describing this wonderful image. It truly is a wonderful image, don't you think? I sure think so.</p>
Which gives you:
Here is a paragraph describing this wonderful image. It truly is a
wonderful image, don't you think? I sure think so.
This is a little closer, but we probably want the top edges to be vertically aligned, so this method isn't right either.
This brings us to the emotionally-charged and infinitely-abusable notion of using tables for layout. Before we head down that dangerous path, it's worthwhile introducing tables for what they were originally intended for: the tabular representation of data, spreadsheet-style.
Here is a sample table:
<table border="1">
  <tr><th>Team</th><th>Wins</th><th>Losses</th></tr>
  <tr><td>Yankees</td><td>60</td><td>49</td></tr>
  <tr><td>Red Sox</td><td>56</td><td>50</td></tr>
  <tr><td>Blue Jays</td><td>51</td><td>57</td></tr>
</table>
It renders like this:
| Team | Wins | Losses |
|---|---|---|
| Yankees | 60 | 49 |
| Red Sox | 56 | 50 |
| Blue Jays | 51 | 57 |
(Please note that the data in the table above was made up, in painful recognition of reality, by a rabid and perpetually-disappointed Yankee-hating Red Sox fan.)
The best way to think of tables is as stacks of horizontal rows of data
contained by a <table> tag. Rows are broken down into
one or more cells, which are the individual boxes shown above. The tags
relating to tables are:
<table>
The table tag contains the entire table. It has a few useful attributes, including "width", which may be specified in pixels or as a percentage (like "75%", relative to the width of the browser window) to specify the width of the table, and "border", which gives the width in pixels for borders around each cell.
<tr>
The tr tag specifies a table row, which is a horizontal row of cells. It is important that all rows contain the same number of cells, or strange things will happen.
<th>
The th tag specifies a table header, which is a kind of cell that is intended for the top row of a column of data. This tag is typically rendered as centered and bold, but that could vary. Its main purpose is to separate the description of the data in the table from the data itself. In all other ways it behaves exactly as a td, table cell, described below.
<td>
The td tag specifies a chunk of table data, or a single cell. The td tag can also have a "width" attribute, as well as "valign" and "align" attributes (with allowable values of "top"/"middle"/"bottom" and "left"/"center"/"right" respectively) that specify the vertical and horizontal alignment of the data in the cell.
If not given "width" attributes, <table> and <td> will constrict as tightly as possible around the data
inside them, making the table as visually small as possible for purposes of
efficiency.
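As a rough sketch of how these attributes combine, here is a small table that sets them explicitly. The team data is borrowed from the sample above; the particular widths and alignment values are arbitrary choices made for illustration only.

```html
<!-- The table takes 75% of the window width, with a 1-pixel border -->
<table width="75%" border="1">
  <tr>
    <th>Team</th>
    <th>Wins</th>
  </tr>
  <tr>
    <!-- This cell is 200 pixels wide, with its text at the top left -->
    <td width="200" align="left" valign="top">Yankees</td>
    <!-- Numbers are right-aligned and vertically centered in the cell -->
    <td align="right" valign="middle">60</td>
  </tr>
</table>
```

Remove the "width" attributes and the table should shrink back to fit its contents.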
Tables were not part of the original, first version of HTML; they were first standardized in version 3.2 of the language. It didn't take long for HTML authors to figure out that tables could be (ab)used for layout purposes.
Tables are now almost always used to effect page layout, rather than to represent tabular data. In other words, it was figured out early on that you could put images in table cells, and bingo! You've got layout control. Consider the following example:
<table><tr><td><img src="my_image.gif" alt="my image"></td><td><p>Here is a paragraph describing this wonderful image. It truly is a wonderful image, don't you think? I sure think so.</p></td></tr></table>
| [image] | Here is a paragraph describing this wonderful image. It truly is a wonderful image, don't you think? I sure think so. |
At last count, roughly fifteen zillion people use tables for layout on their websites.
So what? What's the problem with doing that? Well, it's really only of concern to purists (like this author): using tables for layout contributes to the problem of mixing form and content, discussed in the day 1 writeup. It uses a tag meant to introduce a logical structure into your document to achieve a visual effect. People who are visually disabled have no easy way of knowing whether a table their speech-enabled browser is describing to them is going to present them with data, or is used to put things next to each other.
Unfortunately, the alternatives to using tables for layout are still very
few, so most people grumble and go along with it since there's little
choice. In fact, this author thinks that the grid method for layout via
tables is a fine way to do things; he just wishes there were tags for it
that did the same things but that were called things like <layout-grid> instead of <table>, <grid-row> instead of <tr>, <grid-cell> instead of <td>, and so forth. But there aren't - at
least not until XML is widely used and rendered correctly,
which is a few years off still.
If you look at the source for almost any web page out there, you will find
pages consisting almost entirely of tables, often nested several levels
deep. Table cells (<td>) can contain entire tables, as in
the following example:
<table border="2">
  <tr><th>Col 1</th><th>Col 2</th><th>Col 3</th></tr>
  <tr><td>some data</td><td>some data</td><td>some data</td></tr>
  <tr>
    <td>some data</td>
    <td>
      <table border="1">
        <tr><td>topleft</td><td>topright</td></tr>
        <tr><td>bottomleft</td><td>bottomright</td></tr>
      </table>
    </td>
    <td>some data</td>
  </tr>
  <tr><td>some data</td><td>some data</td><td>some data</td></tr>
</table>
| Col 1 | Col 2 | Col 3 |
|---|---|---|
| some data | some data | some data |
| some data | (nested table: topleft, topright / bottomleft, bottomright) | some data |
| some data | some data | some data |
This is a real headache to maintain after a while, so be warned: if you use tables for layout, be prepared to put a good chunk of effort into keeping them as simple as you can.
One more warning about tables: you must be sure to
properly balance and close your tags. If you forget to close your table
with </table>, your table might not be shown at all!
Netscape is particularly likely to not show a table with structural errors
in it. See the discussion of validation, below.
There's been a fair amount of discussion so far regarding the mixing of form and content, and the difficulties that result from this entanglement. How is it to be done properly?
The answer -- kind of -- is Cascading Style Sheets, which became part of the official HTML specification with version 4.0. "Kind of" because although they were, and are, an elegant solution to the problem of separating form and content in HTML, they were, and are, very unevenly supported by most browsers as of this writing (September 2000). Although the specification for Cascading Stylesheets (CSS) is very clear and precise, most browsers fail - and often spectacularly - when trying to render pages that use them.
It turns out that there is a small subset of CSS that works pretty reliably on most of the "4.0" browsers, or the 4.0 and higher versions of Netscape Navigator and Internet Explorer. Full compliance has yet to be reached by any browser, although Internet Explorer version 5.0 for Macintosh comes the closest (the Windows version is a little off, and version 5.5 is actually less compliant). The much-ballyhooed, eagerly-awaited, and very-late open-source Mozilla browser promises 100% compliance when it's finally delivered, but it's still at least a few months off.
This means that if you're writing with Cascading Stylesheets to control the look of your documents, you must be aware that people using older versions of the popular browsers will have difficulty seeing the documents as you write them. At some point, probably by 2002 or 2003, authors will be able to reasonably count on browsers correctly rendering CSS; until then, it's touch-and-go.
This site, for example, is being written with as-of-this-writing bleeding-edge XHTML 1.0, which also uses CSS for formatting. Probably, most of you reading it in late 2000 are not seeing it correctly, although it should be perfectly readable. Want to test your browser to see if it renders CSS correctly? Take a look at this image first, and then look at this page. They should be identical. If they aren't, your browser cannot handle CSS and correct HTML. But don't feel bad; most can't.
Three reasons:
Furthermore, there's an important concept called graceful degradation that ensures that when you write
HTML with CSS, users with browsers that can't handle it will at least be
able to render the document well enough to be read. In a nutshell, graceful
degradation means sticking to the logical tags to mark up your text, so that
if the styles can't be rendered by the browser, the default
styles will kick in. For example, if you want to change what level 1
headers look like, then define styles for the <h1>
element. Don't use styles just to format ordinary text to be big and
however else you want it to look. If you rely on styles alone without the
logical structure, then browsers unable to render the styles will have no
logical structure to fall back on.
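To make the difference concrete, here is a minimal sketch contrasting the two approaches. The class name "bigtext" and the particular colors and sizes are made up for this illustration; the CSS syntax itself is explained in the sections that follow.

```html
<head>
<title>Graceful degradation sketch</title>
<style type="text/css">
/* Restyle the logical element itself */
h1 { color: navy; font-size: 2em; }
/* A purely visual class (hypothetical name) with no logical meaning behind it */
.bigtext { color: navy; font-size: 2em; font-weight: bold; }
</style>
</head>
<body>
<!-- Degrades gracefully: without CSS this is still shown as a level 1 header -->
<h1>Chapter One</h1>

<!-- Degrades poorly: without CSS this looks like any other paragraph -->
<p class="bigtext">Chapter Two</p>
</body>
```

In a browser that ignores the styles, only the first heading should still stand out from the body text.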
But we're getting ahead of ourselves; first, we need to look at what CSS is and how it works.
Cascading Stylesheets achieve the separation of form from content by defining (and optionally naming) styles that may be arbitrarily attached to tags via the new "style" or "class" attributes. A very simple example looks like this:
<p style="color: yellow; background: black; font-size: 1.5em">This is large yellow text on a black background</p>
This is large yellow text on a black background
Note: This and all subsequent examples are rendering real CSS on the page, as is the case with the whole document. If your browser does not render CSS correctly, then you might not see the full effect! We'll stick to CSS that's likely to be rendered correctly on most modern browsers.
In the example above, we see the familiar <p> tag, but
with a new attribute: a long "style" attribute. This is CSS in action!
The "style" attribute contains information about how the text within the <p> tag should look. In this case, we're specifying three
CSS attributes: color, background, and font-size. The colors are
straightforward enough; the font-size attribute is set to 1.5 "em", where
"em" is the current font size. This means increase the current font-size
by a factor of 1.5 (a 50% increase) for this element only.
Having styles on all of your tags isn't really much of an improvement over
older ways of doing things, though; you still have style (form) information
scattered through your document (content). That's what we're trying to get
away from. The real strength of CSS lies in its ability to define and name
styles that can be attached to elements. You can change the way all of
your <p> tags look at once, for example, by defining a
style for <p> at the start of your document, in a
<style> tag within the <head> section.
This is best shown by example. The following is a sample
<head> section of an HTML page defining styles:
<head>
<title>My Stylish Document</title>
<style type="text/css">
<!--
p { color: yellow; background: black; font-size: 1.5em; }
-->
</style>
</head>
Since there's a fair amount here that's new, let's look it over line-by-line.
<head>
We've seen this; it's the opening of the <head> section
of the document, before the <body> section starts.
<title>My Stylish Document</title>
The standard <title> element for the document, with the
name of the web page as it should be displayed in the browser's title bar.
<style type="text/css">
This is new: the <style> tag, which marks the opening of
a section defining CSS styles. It is marked as CSS by the "type"
attribute, whose value is "text/css"; there are actually other kinds of
stylesheets besides CSS, but most of them are theoretical. Just copy this
line as it stands to be safe.
<!--
This is the beginning of an HTML comment, which means hide everything
between it and the closing -->. This is purely for backwards
compatibility with older browsers that don't understand the <style> tag; browsers are instructed to ignore tags they don't
recognize but try to display the content as best they can. Putting the
style definitions in an HTML comment like this ensures that older browsers
won't display the style definitions in the document, since they don't know
what else to do with it. Clever, eh?
p { color: yellow; background: black; font-size: 1.5em; }
This is a style definition. Since it starts in the left-hand column with a
"p", it means we're defining a style for the document that should be used
for every <p> tag.
The style data is contained between { curly-brackets }; each style attribute is given in the form name: value, and they are separated by semi-colons. Styles do not have to be combined into one line like this; as long as they are properly enclosed by the { curly-brackets } and separated by semi-colons, they can span multiple lines.
The color attribute specifies the color of the element's data (for a paragraph, that means the text); the background is the element's background color; the font-size attribute specifies the size of the text within the style, in this case making it 1.5 times larger than it would normally be.
-->
This closes the comment started above, after all of the styles have been defined (in this case, just one).
</style>
This closes the <style> tag.
</head>
This closes out the <head> tag. The document would continue at this point
with a <body> tag and the document's content.
This is pretty nice; with this definition, every <p> tag
in the document will inherit the style described in the document's <style> tag. You don't need to do or say anything; plain old <p> tags will now have big yellow text on black backgrounds.
If you decide that you want red text instead of yellow, all you have to
do is change the style definition. Even if you have 100 paragraphs (especially if you have 100 paragraphs!), they all instantly
inherit the new style. The separation of form from content is complete
here, because we've separated what the data is (it's a
paragraph, as indicated by the <p> tag) from what it looks like (the style information is in one place, in the
header of the document).
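A sketch of that one-line change might look like the following. The definition is also spread across several lines here, which, as noted in the walkthrough, is allowed as long as the curly brackets and semi-colons stay in place.

```html
<style type="text/css">
<!--
p {
  color: red;         /* was yellow; this is the only line that changed */
  background: black;  /* background color behind the text */
  font-size: 1.5em;   /* 1.5 times the current font size */
}
-->
</style>
```

None of the <p> tags in the body of the document need to be touched.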
It gets better, though. The "Cascade" part of Cascading Stylesheets is
very important and powerful, and it kicks in like this: any style can be
overridden by more specific styles as needed. Every element inherits the
styles of every element that contains it, starting at the site-wide level,
proceeding to the document level, and then into the cascade of block and
inline tags (see the next section). For example, we have defined a style
for <p> tags for the whole document in the example
above. We could still apply styles to individual <p> tags within the document itself, and those styles, being
further down the cascade, would override the document-level ones. So if we
wanted to have a single paragraph with green text, we could just write
this:
<p style="color: green">This paragraph will have green text on a black background.</p>
The paragraph above inherits the big-yellow-black background style from the
document-level specification in the <style> tag; then the
tag-level "color: green" style kicks in and overrides just the "color" part
of the document-level specification, leaving us with big-green-black
background.
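Put together as a small page, the override might look like this. It is only a sketch: the title and the first paragraph's text are invented for the example.

```html
<html>
<head>
<title>Cascade sketch</title>
<style type="text/css">
p { color: yellow; background: black; font-size: 1.5em; }
</style>
</head>
<body>
<!-- Uses the document-level style as-is: large yellow text on black -->
<p>An ordinary paragraph.</p>

<!-- The tag-level style overrides only "color"; the background and
     font-size still come from the document-level definition -->
<p style="color: green">This paragraph will have green text on a black background.</p>
</body>
</html>
```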
It gets better still: it's possible to define styles and not attach them to any particular element, but have them available for use wherever it's appropriate. These are called class selectors, and have a similar syntax to style declarations, except that they start with a period and a name, rather than a tag name, like so:
.urgent { font-weight: bold; color: black; background: red; }
You can then attach this style to any element, giving it a "class" of "urgent", like so:
<p class="urgent">This paragraph is saying something really urgent.</p>
<p>This is a normal paragraph <span class="urgent">with an urgent section</span>.</p>
This paragraph is saying something really urgent.
This is a normal paragraph with an urgent section.
Notice how for the first paragraph, the whole paragraph is done in the
"urgent" style; in the second, only the phrase surrounded by the <span> tag is marked with that style.
Note: Avoid the temptation to name your styles after what they look like. For example, you might initially think to name the style in the above example something like "boldred". This is a bad idea because we're back into the form-content muddle again; the whole point of having styles is to separate your form from your content so that you can easily change the form. If you decide, after writing 20 pages of HTML, that the bold red is too intense, and you want to go with italic green instead, you could change the style easily enough, but it would still be called "boldred", which doesn't make sense for something italic and green. Name your styles after their function, not their appearance.
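Here is a sketch of what that kind of redesign looks like when the style is named for its function. The italic green values are just the hypothetical redesign from the note above.

```html
<style type="text/css">
/* Original look: bold, black on red */
/* .urgent { font-weight: bold; color: black; background: red; } */

/* After the redesign: italic green. The name still describes the purpose. */
.urgent { font-style: italic; color: green; }
</style>

<!-- The markup does not change at all -->
<p class="urgent">This paragraph is saying something really urgent.</p>
```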
To make things even more interesting, you can define styles within styles,
to match elements that occur only in other elements. For example, in this
document, <code> tags have a style of "color: purple;
background: white". <pre> tags have a style of
"background: #CCCCCC" (which defines a light grey background). If I put a
<code> tag inside a <pre> section, the
white background of the code section overrides the grey background of the
pre section, which is ugly and not at all what I want.
To fix this, you can define styles that say "apply this style to all code tags within pre tags". The definition looks like this:
pre code { background: #CCCCCC; }
This sets the background for <code> tags inside <pre> tags to #CCCCCC,
the same value as the style for <pre> tags.
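A minimal sketch of the three rules working together, using the colors described above (the sample sentences are made up):

```html
<style type="text/css">
code     { color: purple; background: white; }
pre      { background: #CCCCCC; }
/* Matches only code elements that appear inside a pre element */
pre code { background: #CCCCCC; }
</style>

<p>Inline <code>code</code> keeps its white background here.</p>
<pre>
Inside a preformatted block, <code>code</code> blends into the grey.
</pre>
```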
Finally, perhaps the greatest power of Cascading Style Sheets is the
ability to keep all of your styles in one central document and have all of
your pages refer to it via a <link> tag in the header, so
your whole site can share styles without having to describe them repeatedly
in each document's <style> section. You can then make a
change in your central stylesheet document, and every document in your site
that links to it will be instantly updated. Stylesheet documents are
typically named with a ".css" extension, and are referred to like this:
<link rel='STYLESHEET' type='text/css' href='my_styles.css'>
The <link> tag goes in the document's <head> section, and should precede the <style> tag if you have one, so that the document's declared styles can
correctly override the stylesheet's if necessary.
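As a sketch of that ordering, a <head> section that uses both might look like this. The contents of my_styles.css shown in the comment are an assumption made for the example, as is the white override.

```html
<head>
<title>My Stylish Document</title>
<!-- Site-wide styles; my_styles.css is assumed to contain a rule such as
     p { color: yellow; background: black; } -->
<link rel="stylesheet" type="text/css" href="my_styles.css">
<!-- Document-level styles come after the link, so they can override it -->
<style type="text/css">
p { color: white; }
</style>
</head>
```

With those assumed contents, paragraphs on this one page should come out white on black, while other pages linking to the same stylesheet keep the yellow text.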
We've really done a whirlwind tour here, and haven't gone into great detail on any of the points. That's far beyond the scope of this training and this document. I can specifically recommend one book:
Cascading Style Sheets: The Definitive Guide, by Eric A. Meyer, published by O'Reilly.
It's the best treatment I know of on the whole subject, including the limitations of current browsers, bugs in current implementations, and using stylesheets in far more advanced ways than it's been possible to show here. It's a long road, but extremely worthwhile. Enjoy!
One other point it's important to touch on regarding HTML structure is the difference between block and inline elements. There are some problems you'll run into if you aren't clear on the differences.
The actual precise technical definitions are extremely complicated, but the rule-of-thumb versions that will get you by are much simpler, and they run like this:
Block elements are typically rendered with one blank line between them, and may contain other block elements and inline elements. Examples include paragraphs, headers, lists, and tables.
Inline elements are typically rendered with no vertical space between them, and may only exist within block elements, and can only contain other inline elements. They cannot contain block elements. Examples include images, line breaks, links, and bold/italic formatting.
This means, for example, that the following HTML is wrong:
<b><p>This paragraph will be bold.</p></b>
It's wrong (although most browsers will probably render it as a bold
paragraph) because you have a block tag (<p>) inside an
inline tag (<b>). You can't have a bold section include
a paragraph. That's against the rules. (What rules? See "validation" in
the next section.) A paragraph can contain a bold section, though.
The correct sequence would be:
<p><b>This paragraph will be bold.</b></p>
More importantly, it means you can't have something like an image just hanging out between paragraphs, because it's an inline element. It has to be inside a container. So the following is wrong:
<p>A paragraph.</p><img src="my_image.gif" alt="my image"><p>Another paragraph.</p>
It would have to be:
<p>A paragraph.</p><p><img src="my_image.gif" alt="my image"></p><p>Another paragraph.</p>
This is important mostly because some of the errors you will encounter when validating your pages will be of this nature. So let's talk about validation now.
Validation is the universally-neglected process of making sure your HTML is syntactically valid, or correct. HTML, like all computer-based languages, has a syntax that must be adhered to, or weird things happen. Valid HTML will display predictably on all browsers that can display correct HTML. Invalid HTML might display predictably, unpredictably, or not at all. If you want to make sure the content you're working so hard to get on the web will be accessible and readable as widely and broadly as possible, you should validate all of your pages to make sure they're correct.
You validate your pages by running them through a program called a validator. There are a variety of validators out there; one of the best is the W3C's at http://validator.w3.org. It also allows you to upload pages directly from your computer for checking.
Be prepared! Virtually any page you upload will have errors in it - some will have hundreds. It can seem overwhelming; it's much easier to start with a valid structure and build up from there than to try to correct a document that has errors and has been worked on a lot.
Here are some of the more common errors you'll run into when validating:
Your document will not validate without a proper DOCTYPE declaration. That's because the various versions of HTML have different structures, tags and rules, and the validator has to know which version you're validating against. Here are some of them:
| HTML version | DOCTYPE declaration |
|---|---|
| HTML 2.0 | <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> |
| HTML 3.2 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> |
| HTML 4.01 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> |
| XHTML 1.0 | <?xml version="1.0" encoding="UTF-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> |
Copy the doctypes as needed. Note that the XHTML doctype also includes an
extended <html> tag.
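Since it's easier to start from a valid structure than to repair a broken one, here is a minimal skeleton you could build up from. It is a sketch using the HTML 4.01 Strict doctype; the title, heading, and paragraph text are placeholders, and the character set is an assumption you should adjust to match your own files.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<!-- Tells the browser and the validator which character set the page uses -->
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>My Valid Page</title>
</head>
<body>
<h1>My Valid Page</h1>
<p>A paragraph, properly enclosed in its container.</p>
</body>
</html>
```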
Beginning with XHTML 1.0, all tags must be closed. For example, many people
improperly use <p> as a separator rather than a
container, putting <p>'s between paragraphs instead of
putting paragraphs between <p> and </p>.
Although some (notably earlier) versions of HTML permit some tags not to be
closed, you should always close all of your tags. That way you don't have
to work to remember which ones need it and which don't. Close them all.
Everything should balance, except for tags that don't close: in HTML 4 and
earlier, the important ones are <img>, <br> and <hr>.
Also note that if you're writing XHTML, these tags do
close, as explained above, so your valid <br> tags from
HTML 4 must now be written <br />.
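A side-by-side sketch of the same fragment in both styles (the image file name is the placeholder used earlier in this write-up):

```html
<!-- HTML 4 and earlier: these tags have no closing form -->
<p>First line<br>Second line</p>
<hr>
<p><img src="my_image.gif" alt="My image"></p>

<!-- XHTML 1.0: the same tags close themselves with a trailing slash -->
<p>First line<br />Second line</p>
<hr />
<p><img src="my_image.gif" alt="My image" /></p>
```

The space before the slash is there for the benefit of older browsers.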
Tags containing other tags must nest in the proper order. This is wrong:
<b><i>bold and italic</b></i>
This is correct:
<b><i>bold and italic</i></b>
For example, in HTML 3.2, the <font> tag is legal, but
the "face" attribute is not. It's one of many browser extensions (invented by
Netscape, in this particular case) that may work
but are not really part of the language. In HTML 4.0, the <font> tag is deprecated: it is illegal in the Strict version of the language, and has been replaced by
stylesheets.
See the discussion above.
Beginning with XHTML 1.0 (which is the official current recommendation from the W3C, by the way; HTML 4 and earlier are now officially "old"), all tag attributes must be quoted. This is wrong:
<img src=myimage.gif height=80 width=60>
This is correct:
<img src="myimage.gif" height="80" width="60">
Bad news for those of us who used upper-case tags like <IMG> (this author included): also starting with XHTML 1.0, all tags and their
attributes must be lower-case.
Personally, I liked upper-case tags because I felt they stood out more from the text of the document, but there it is. It's the new rule and we might as well start getting used to it. This means that instead of writing:
<IMG SRC="myimage.gif" HEIGHT="80" WIDTH="60">
You now write:
<img src="myimage.gif" height="80" width="60">
There it is. Actually, I'm kind of getting used to it already.
Maybe they do look fine on your browser. How do you know how they look on mine? I'm probably using a different browser and version, different fonts, different platform, different screen size, and different window size than you are. The little HTML error your browser is tolerating may be choking my browser. If you validate your HTML, you know it has no errors and it will render fine on my machine.
If two million people do a foolish thing, it is still a foolish thing.
- Opus, Bloom County
Giant Company X probably spends unthinkable amounts of money trying to detect your browser and sending you HTML that displays properly on your exact version. How they curse when a new version comes out! How they rant and rave when a new browser is released and they have to tune their website again for it! (Or how they don't, and just disregard market segments that can't render their pages.)
The Bobby validator, at http://www.cast.org/bobby/, checks your pages for accessibility issues. It has lots of good information about writing HTML that is accessible to users with disabilities. If you are concerned about ADA compliance, you should validate with the Bobby validator.
You should also visit the Web Accessibility Initiative at http://www.w3.org/WAI/. The WAI, a project of the World Wide Web Consortium (W3C), which maintains the HTML specification, has its own validation/rating system which specifies three levels of accessibility: levels A, AA, and AAA. Level A is the minimal baseline for designing for users with disabilities; the Bobby validator corresponds approximately to this level. Level AA requires valid HTML and is increasingly required for projects that are federally funded; Bobby compliance, although an important start, does not necessarily meet federal guidelines for accessibility.
Does this mean that you can write invalid HTML and still be Bobby-compliant? Yes, although I'm not sure I'd see the point of doing so. Invalid HTML may also hinder Bobby's ability to check your pages for compliance. Plus, if you are interested in full accessibility for your pages, or if you are receiving federal funding and are required to reach ADA-levels of compliance (speaking generally; as far as I know, the ADA has not released official statements to date on the issue), then Bobby is a first step, but WAI level AA is what you should be meeting minimally, which does require valid HTML.
More practically, it's much easier to run your pages through Bobby and get a simple "everything's fine" than to have to go through individual errors in HTML and decide case-by-case whether they affect usability.
The bottom line is that unless you have a very good reason not to (and this reason should be good enough to explain to a disabled user of your site), your site should contain valid, Bobby-compliant HTML.
I'm not going to name any of the point-and-click, WYSIWYG (What You See Is What You Get) HTML editors that are so rampantly popular. They hold out the promise of insulating users from the need to learn and write HTML, presenting a word processor-like interface in place of text and tags. Some word processors even offer a "Save as HTML" option for files, as the web continues to grow and expand.
There are several reasons why HTML editors are the bane of humanity and the web.
Editors with word-processor-like interfaces mislead authors into thinking they can micro-control the layout and appearance of their pages, that other people will see the pages exactly as they design them. Actually, all that can be assured is that most people will not see them as they are created; they will have different fonts, different browsers, different window and screen dimensions, and even different color palettes in many cases.
The web isn't WYSIWYG (What You See Is What You Get). Web pages should and will look different on different systems. Using a graphical editor naturally leads people to assume that others will see the pages as they look on their own screens, since that's ostensibly what happens with other document formats like word processor documents, which should be identical when viewed on different computers. So right off the bat, editors break one of the founding principles of HTML by offering graphical interfaces: HTML is a markup language, not a layout language.
Editors have an impossible job here, although they have made some progress in recent years. If there were a clean slate to start with, it's theoretically possible that editors could be created which would only generate valid HTML. However, there's so much broken and invalid HTML out there to begin with, and the editors need to be able to handle the bad stuff too, that it's impossible to create an editor that will always render valid HTML. You may quote me on that.
Many editors make egregious HTML errors routinely; some even break correct
HTML by changing tags. One I experimented with turned all of my <p> tags into <br> tags in an effort
to... well, I admit I have no idea what it was trying to do, other than
break my code.
Some do a better job these days. It's good that they're trying. But graphical editing methods don't lend themselves easily to translation into quality markup structures. Even if we one day get an editor that can create valid HTML from scratch (we will never get one that can take an invalid document and turn out a sensical, valid one), that doesn't mean that the HTML will be well-structured. The quality of HTML that editors generate continues to be atrocious.
Editors use logical structured tags to achieve visual effects based on what the popular browsers do. For example, begin a new paragraph in an editor. Hit "tab" a couple of times to indent and then type something. Then look at the HTML that gets generated to render that visual effect. I'll bet it isn't this:
<p style="text-indent: 2em">some text</p>
It's more likely to be something like this:
<dd><dd><dd>some text<br><br>
...or something else, but I've seen plenty of things like this. What's
going on in this particular case? Browsers tend to render the <dd> tag with a short left indent; nested versions continue to
indent. It's utterly nonsensical - three consecutive definition descriptions, two
empty, with some text that isn't a definition? It may achieve the visual
effect of indenting, but at the cost of the utter destruction of the
logical structure. It's nonsense.
The reason editors have a hard time with this is that they have no idea why you want to indent your text. They don't know your meaning or your intent; all they know is that you want to indent for some reason. They can't know why you want to do what you're doing, so they can't choose the proper manner to render it with logical markup; they can only try to achieve the visual effect you want with whatever's easiest in terms of physical markup. There goes your logical structure.
There may at some point be an editor that lets you define snippets of HTML and CSS, name them, describe them (and maybe even modify them) in some kind of language that the editor can understand, and capture the nuances of your intentions. Maybe. Someday. I'd say that the amount of time it would take to learn such a system would be about twenty times the amount of time it takes to just learn HTML in the first place.
My guitar teacher frowned at me when I excitedly brought in my new electric tuner; I could just plug it into my guitar, pluck the strings, and watch the needle as I adjusted the pegs. He said, "Don't you want to know how to tune your guitar? What if you leave that thing at home some night? What if you run out of batteries? What if what it's telling you is wrong?"
Sheepishly, I traded it for a new guitar and went back to tuning the old-fashioned way. Actually, I don't miss it anymore.
The obvious argument here, which I won't contradict, is "Yes, but it's good to have a tuner in your bag anyway in case you need to do a quick-and-dirty tuning in a loud place," the application here being "Yes, but sometimes I just need to lay out a quick page and it's so much easier with an editor." Sure, that's fine. Just have the skills to look at the HTML generated by your editor and fix it when necessary.
OK, clearly this is all a matter of opinion. My opinions on these matters are admittedly way out there at one end of the spectrum. It's hard being a purist in a complex world.
The truth is, most people do use graphical HTML editors. I personally think they cause far, far more problems than they solve. I think the current messy state of the web and HTML is due in large part to the use of these editors; bad HTML begets browsers with the ability to display bad HTML which begets worse HTML and so forth.
HTML is not a terribly difficult language as languages go. It has some nuances which are important to understand and which cannot be handled by graphical editors: the importance of the separation of form and content; the nature of HTML as a markup language and not a layout language. Even under the best of circumstances, editors mangle these concepts; at worst, they generate invalid HTML that displays unpredictably, disadvantages users with disabilities whose browsers now have the added challenge of trying to disentangle the meaning intended behind the non-structural markup created purely for visual effect, and needlessly distances authors from the language they're writing in.
Bottom line: use an editor if it improves your work. But be sure you understand what's happening under the hood as well.
In this section, we covered some more advanced HTML relating to how your site looks, and quality and accessibility issues. Next in the day 3 writeup is multimedia - images, sound and video.