Structure vs. Presentation

HTML started as a small language for marking up the structure of a document, leaving the browser to decide how each structure would be rendered. However, as the language grew in popularity, and as browsers deviated from the standard to add glitz and glam, HTML began to acquire tags for changing the visual presentation. The font tag, for example, could be used to set the face, color, and size of text. The u tag could be used to underline text, the strike tag to cross out text, and the center tag to center its child elements. These new tags had little to do with the structure of the document and a lot to do with its presentation.

Most of these presentation tags have since been deprecated or removed from HTML. Beginning with HTML 4 and continuing with HTML5, the standards committee separated structure from presentation. HTML was restored as the language of structure. Presentation was to be expressed elsewhere, in stylesheets, a topic you'll read about soon.

When an HTML document is written so that the markup declares structure but not presentation, the document is said to be written in semantic HTML. Separating the semantic meaning of a tag from its visual appearance will likely take some practice. For example, when you see a ul tag, you automatically think of a bulleted list. In semantic HTML, however, a ul is merely a collection of items whose order doesn't matter. Its presentation will be decided by the stylesheet, and it may look nothing like your notion of a list when rendered. This is often the case for ul tags used to structure lists of navigation links.

Let's examine a few more HTML elements and discuss them through the lens of semantics.

Figure

To create an image with a caption, you could use an img element followed by a p element. But nothing in the markup would tie the caption to the image. Enter the figure element, an element that explicitly associates content with an explanatory caption:

html

preview

The text here is lorem ipsum text. Such placeholder text has been used by the typesetting industry for decades in promotional materials. The original Latin text comes from an essay written by Cicero, a Roman orator. The image comes from a website that offers similar randomness but for photographs.
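A minimal sketch of a captioned figure along these lines follows. The image URL and alt text are hypothetical placeholders, and the caption is the familiar opening of the lorem ipsum passage. The caption goes in a figcaption child of the figure:

```html
<!-- Sketch only: the src URL and alt text are hypothetical placeholders. -->
<figure>
  <img src="https://example.com/photo.jpg" alt="A placeholder photograph">
  <figcaption>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</figcaption>
</figure>
```

Because the img and the figcaption share a figure parent, the markup itself records that the caption belongs to the image.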

Pre

Suppose you wish to share some Java code on a web page. When you insert the code directly into the HTML, the rendering is probably not what you had in mind:

html

preview

There are no tags in the HTML. Without markup, the code has the same semantic meaning as plain text. The browser collapses the whitespace and renders it like any other flowing text.
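To make the situation concrete, here is a sketch with a hypothetical Java class, assumed for illustration, pasted straight into the body with no markup around it:

```html
<body>
<!-- Sketch only: Greeter is a hypothetical class used for illustration. -->
public class Greeter {
  public static void main(String[] args) {
    System.out.println("Hello, world!");
  }
}
</body>
```

A browser would treat the indentation and line breaks as ordinary spaces and wrap the code like a paragraph.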

The pre tag can be used to communicate that content is preformatted:

html

preview

The semantic meaning of a pre element is that the whitespace in the content is significant. How the content is displayed is still left up to the browser. Most browsers render preformatted elements using a monospace font.
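As a sketch, the same kind of code can be wrapped in a pre element. The Greeter class is a hypothetical stand-in for illustration:

```html
<!-- Sketch only: Greeter is a hypothetical class used for illustration. -->
<pre>public class Greeter {
  public static void main(String[] args) {
    System.out.println("Hello, world!");
  }
}</pre>
```

The indentation and line breaks inside the element are preserved. Characters such as < and & still have special meaning inside pre, so code that contains them needs the entities &lt; and &amp;.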

Break

Inside a preformatted element, all whitespace is considered significant. Sometimes you only want a significant linebreak, as in poetry. Without any markup, the linebreaks are lost in this limerick:

html

preview

Putting this text in a preformatted element would fix the rendering, but it isn't semantically appropriate if you want to make just linebreaks significant. For that, you can use the break element, whose tag is <br>:

html

preview

Break is a void element. It has no children and no closing tag.
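A minimal sketch of a limerick with explicit line breaks follows. The verse is by Edward Lear and is assumed here for illustration in place of the original example:

```html
<!-- Sketch only: an Edward Lear limerick stands in for the original example. -->
<p>
  There was an Old Man with a beard,<br>
  Who said, "It is just as I feared!<br>
  Two Owls and a Hen,<br>
  Four Larks and a Wren,<br>
  Have all built their nests in my beard!"
</p>
```

Each br marks the end of a line of verse. The indentation and newlines in the source are still collapsed, so only the marked breaks survive rendering.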

If you are using multiple break elements to add padding between elements, and not to mark structural linebreaks, you are violating the element's semantic meaning. Padding should be added through a stylesheet, not through HTML.

Table

Suppose you have this US census data that you would like to put in a document:

html

preview

The data is unstructured and hard to interpret. It belongs in a table element. The immediate children of a table are table row elements, which are marked with the tag tr. Each row is broken into table data elements, which are marked with the tag td:

html

preview
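A minimal sketch of such a table follows, assuming two columns: census year and resident population. The rows are illustrative stand-ins rather than the original listing's data:

```html
<!-- Sketch only: columns are assumed to be census year and resident population. -->
<table>
  <tr>
    <td>2000</td>
    <td>281,421,906</td>
  </tr>
  <tr>
    <td>2010</td>
    <td>308,745,538</td>
  </tr>
  <tr>
    <td>2020</td>
    <td>331,449,281</td>
  </tr>
</table>
```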

The semantic structure is clear, and the browser also happens to display the table in a readable grid. The semantic meaning can be made even clearer by prepending a row of table header cells, which are marked with the tag th:

html

preview

The header row and the data rows are further distinguished by wrapping them in thead and tbody elements, respectively.
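A sketch of a table with a header row, under the same assumptions as before: the two columns and their headings are chosen for illustration, and the rows are stand-ins for the original data:

```html
<!-- Sketch only: column headings and rows are assumed for illustration. -->
<table>
  <thead>
    <tr>
      <th>Census year</th>
      <th>Resident population</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>2010</td>
      <td>308,745,538</td>
    </tr>
    <tr>
      <td>2020</td>
      <td>331,449,281</td>
    </tr>
  </tbody>
</table>
```

The th cells label the columns, thead groups the header row, and tbody groups the data rows, so a stylesheet or a screen reader can tell the two kinds of rows apart.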

When presentational HTML was fashionable, table elements were used to define the overall layout of a page. In semantic HTML, tables are not used for layout. A table element is used only to structure tabular data.