Whatever you call them (blocks, boxes, areas, regions), we’ve been dividing our Web pages into visible sections for well over a decade. The problem is, we’ve never had the right tools to do so. While our interfaces look for all the world like grids, the underlying structure has been cobbled together from numbered headings and unsemantic helper elements: an unbridled stream of content at odds with its own box-like appearance.
Because we can make our <div>s look but not behave like sections, the experience for assistive technology (AT) users and data-mining software is quite different from the experience enjoyed by those gifted with sight.
Now that HTML5 has finally made sectioning elements available, many of us greet them with great reluctance. Why? Partly because we’re a community that is deceptively resistant to change, but also because of some perceived discrepancies in the specification’s advice. In truth, the advice is sound and the algorithm for sectioning is actually easier to use than previous implementations. Some developers are just very married to their old workflow, and they think you should be too. There's no good reason why.
Make no mistake: Sectioning elements help you improve document structure, and they're in the spec’ to stay. Once and for all, I will be exploring the problems these elements solve, the opportunities they offer and their important but misunderstood contribution to the semantic Web. If you’re unfamiliar with the concept of the “semantic Web,” this video is a great introduction.
Making Websites
My introduction to Web design was via a university course module called something like “2.1: Dreamweaver,” and I recall my first website well. I remember my deliberately garish choice of Web-safe colors. I remember it looking right only in Netscape Navigator. Most of all, I remember hours of frustration from tugging at the perimeter of a visual layout tool named “table.” I had no idea at the time that this layout tool represented a type of annotation called an HTML tag. Furthermore, no one told me that this annotation invited my patchwork of primary colors and compressed JPEGs to be computed as a sort of demented Excel spreadsheet. In other words, I had no idea I was doing it wrong.
Multiple-choice answers:
- (a) 1
- (b) 2
- (c) 3
- (d) 4
The correct answer is (b), 2. We have included just one of HTML5’s new sectioning elements in the form of an <aside>. Because <footer>s and <header>s are not sectioning elements, what does that leave us with? The <body> tag is the outermost element, making the document itself a kind of section (a supersection, to be precise). So, there you have it: We’ve been using “sectioning” since HTML 1.0, just not with any subsections to speak of.
Some of you may have missed the clue earlier in this article and thought that <header> and <footer> were sectioning elements. Don’t fret; it’s not your fault. Whenever developers like myself try to explain HTML5 page structure, they usually brandish a diagram like the one I used above. In these diagrams, the boxes marked “header,” “aside” and “footer” exist in the same visual paradigm and occupy a similar area. They seem alike, you might say. The other culprit for this endemic confusion is the way the specification is written. Believe it or not, the document structure of some pages in the specification that refer to document structure is structurally unclear! This sort of thing sometimes happens when a standard is constantly evolving. The navigation tree for “4.4 Sections” found in this draft is laid out like so:
- 4.4 Sections
- 4.4.1 body
- 4.4.2 nav
- 4.4.3 article
- 4.4.4 aside
- 4.4.5 h1, h2, h3, h4, h5 and h6
- 4.4.6 hgroup
- 4.4.7 header
- 4.4.8 footer
- 4.4.9 address
- 4.4.1
You’d be forgiven for thinking that anything in this list qualifies as a sectioning element, absurd as some of them (<address>?) may sound. It’s only when you navigate to 4.4 Sections > 4.4.8 Footer that you’re told that “the footer element is not sectioning content; it doesn’t introduce a new section.” Thanks!
Despite these ambiguities in the spec’ itself, as well as in the surrounding publicity for HTML5, sectioning in practice just works. The following three axioms are probably all you’ll need to understand the algorithm:
- <body> is the first section;
- <article>, <section>, <nav> and <aside> make subsections;
- Subsections may contain more sections (subsections).
Aside from a few trifling details, that’s it. In a little while I'll cover the completely unnecessary worry over combining headings with sections. For now, let’s take another look at that example from before about footer ownership. This time, I’ll make a few HTML5 substitutions:
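As a minimal sketch of what such markup could look like, here is a document that contains an article, which in turn contains an aside, followed by a footer. The heading text and paragraph content are placeholders chosen for illustration, not the original listing; only the nesting of the three sections is taken from the description in this article.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Document</title>
</head>
<body>
  <!-- <body> is the first section: the document itself -->
  <h1>Document</h1>

  <!-- <article> opens a subsection of the document -->
  <article>
    <h1>Article</h1>
    <p>Main content...</p>

    <!-- <aside> opens a subsection of the article -->
    <aside>
      <h1>Aside</h1>
      <p>Related content...</p>
    </aside>
  </article>

  <!-- Outside both <article> and <aside>, so it belongs to the document -->
  <footer>
    <p>Footer content...</p>
  </footer>
</body>
</html>
```

Because <footer> is not a sectioning element, it adds nothing to the outline; it simply attaches to whichever section encloses it.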
The outline for this example looks like this:
- Document
- Article
- Aside
- Article
Now that we’ve implemented sections, the boundaries are clear. Our document contains an article, which, in turn, contains an aside. There are three sections, each belonging to the last, and the depth of each section is reflected in the outline. Importantly, because sectioning elements wrap their contents, we know perfectly well where they end, as well as where they begin. And yes, screen readers like JAWS actually announce the end of sections like these! We know what content belongs to what, which makes deducing the purpose of the footer much easier. Because it exists outside the bounds of both the <article> and its <aside>, it must be the document’s footer. Here’s the same diagram again, with subsections faded out:
The power of sectioning lies in its ability to prescribe clearly defined boundaries, resulting in a more modular document hierarchy. The footer unequivocally belongs within the immediate scope of the highest-level section, giving assistive technologies and indexing parsers a good idea of its scope, which helps to make sense of the page’s overall structure.
Headings And Accessibility
When Sir Tim Berners-Lee conceived the <section> element all the way back in 1991, he envisioned the obsolescence of ranked heading levels. The thrust of the idea was that headings should act as mere labels for blocks of content, and the nature (i.e. the importance, scope, etc.) of the content would be calculated automatically based on the content’s standing in the document.
I would in fact prefer, instead of <h1>, <h2> etc for headings [those come from the AAP DTD] to have a nestable <section>..</section> element, and a generic <h>..</h> which at any level within the sections would produce the required level of heading.
Why is this preferable? Determining heading level systemically, based on nesting level, is much more dependable because it removes a layer of decision-making: By “producing” the required heading level automatically, we no longer have to decide separately which numbered heading we should include. It effectively prevents us from choosing the wrong heading level, which would be bad for parsable structure. A subsection must be subject to its parent section. Because this relationship between sections determines “level,” numbered headings are made redundant; hence, the proposed <h>.
A lot of fuss over nothing
Now, this is the supposedly tricky part; the part that causes all the consternation and gnashing of teeth. This is the part that caused Luke Stevens to write this diatribe, and prompted Roger Johansson into a state of uncharacteristic apoplexy, asking, “are you confused too?”. Ready?
In the WHATWG specification (in the same place where <footer>s were ostensibly classified as sectioning elements!), we are “strongly encouraged to either use only h1 elements, or to use elements of the appropriate rank for the section's nesting level.” On first appearance, this seems contrary. Surely only one of these courses of action can possibly be right? What do you do? I'm thinking maybe the first option. Or the second. Who am I?
It certainly confused me, so I spoke with HTML Editor, Ian Hickson. He explained the outline to me in detail and I'm convinced it is perfectly robust. I'm going to do my best to explain it to you here.
Okay. As it turns out, we didn’t get the generic <h> element. This wouldn't be backwards compatible because older browsers wouldn't recognise it. However, headings that introduce sections are, regardless of their numbered level, treated as a generic <h>. Quite correctly, it is the section itself (not the heading) that takes responsibility for nesting in these situations, and whenever you introduce a new section, you introduce a new nesting level without fail. What does this mean in practice? It means that we can introduce and benefit from the structural clarification offered by sections without abandoning heading levels. Take the following example:
<h4>Page heading</h4>
<p>Introductory paragraph...</p>
<section>
  <h3>Section heading</h3>
  <p>some content...</p>
  <h2>Subheading</h2>
  <p>content following subheading...</p>
  <section>
    <h1>Sub-subheading</h1>
    <p>content two levels deep...</p>
  </section>
</section>
<h5>Another heading</h5>
<p>Continued content...</p>
Our heading levels are all over the place. This is not recommended by the specification, but it helps demonstrate just how robust the HTML5 outlining algorithm really is. If we replace all the headings that open sections with a generic ("wildcard", if you prefer) <h>, things become clearer:
<h>Page heading</h>
<p>Introductory paragraph...</p>
<section>
  <h>Section heading</h>
  <p>some content...</p>
  <h2>Subheading</h2>
  <p>content following subheading...</p>
  <section>
    <h>Sub-subheading</h>
    <p>content two levels deep...</p>
  </section>
</section>
<h5>Another heading</h5>
<p>Continued content...</p>
It’s important to note that the only errors revealed in the computed outline are ones relating to badly ordered numbered headings within the same section. In the original example, you’ll see that I've followed an <h3> with an <h2>. Because they are in the wrong order, the outline interprets them as being on the same level. Had I encapsulated the <h2> in <section>, this error would have been suppressed.
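One way to read that fix, sketched here as an assumption rather than as the only possible correction, is to give the <h2> its own <section> wrapper that also encloses the content belonging to it. The extra wrapper is the only change to the original example:

```html
<h4>Page heading</h4>
<p>Introductory paragraph...</p>
<section>
  <h3>Section heading</h3>
  <p>some content...</p>
  <!-- Added wrapper: the <h2> now opens its own subsection -->
  <section>
    <h2>Subheading</h2>
    <p>content following subheading...</p>
    <section>
      <h1>Sub-subheading</h1>
      <p>content two levels deep...</p>
    </section>
  </section>
</section>
<h5>Another heading</h5>
<p>Continued content...</p>
```

With the wrapper in place, “Subheading” nests beneath “Section heading” because of where its section sits, not because of the number in its tag.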
Well, how about that? If you're not convinced, go ahead and paste my example into the test outliner and play around. It works just fine. In fact, it’s really difficult to break.
If you think there is a benefit to screen reader users, you may wish to adhere to the second of the two clauses from the specification and incorporate numbered headings that reflect nesting level. As demonstrated, this will have no effect on the outline, but since heading level (“Heading Level 2 - The Importance Of Sections”) is announced, it gives a clearer impression of structure to those who can't see boxes inside boxes.
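To make the two clauses concrete, here is a small sketch of both styles side by side. The heading text is placeholder content; both fragments are intended to produce the same three-level outline under the algorithm described above.

```html
<!-- Style 1: only <h1> elements; nesting comes from the sections alone -->
<h1>Page heading</h1>
<section>
  <h1>Section heading</h1>
  <section>
    <h1>Sub-subheading</h1>
  </section>
</section>

<!-- Style 2: heading rank matches each section's nesting level -->
<h1>Page heading</h1>
<section>
  <h2>Section heading</h2>
  <section>
    <h3>Sub-subheading</h3>
  </section>
</section>
```

The second style costs a little more thought, but it keeps the announced heading level meaningful for assistive technology that reads rank rather than section boundaries.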
The assertion that heading levels are perpetually indispensable to screen reader users comes under pressure when you consider advancements being made by screen reader vendors. Screen readers like JAWS mark the territory of sections more clearly than headings, by announcing the beginnings and ends of sections and the thematic regions they represent (“Article End!”). From this perspective, using more than one <h1> in your document might sometimes be applicable. You'll come up against some accessibility experts who are keen on their “there can only be one [h1]!” mantra, but research shows that even in HTML4 or XHTML, this is not necessarily the case.
The choice of approach is yours to make; just employ some common sense and consistency. Bear in mind, though, that not all screen readers are able to announce the bounds of sectioned content. In these cases, there are measures you can take …
ARIA Enhancement
Transition to an HTML5 document structure is made smoother by incorporating some ARIA landmark roles, which are both relatively well supported and somewhat analogous to the section-based navigation we should expect later. ARIA offers many more accessibility-specific features than baseline HTML5 could ever withstand; so, including “bolt-on” ARIA enhancements is certainly polite. However, regarding ARIA roles as a substitute for semantic HTML would be a grave misconception.
Landmark roles, such as role="contentinfo" and role="banner", address accessibility only, not data mining, and these two may each be used only once per document. They are essentially shortcuts to parts of the page. HTML elements are more like building blocks, which are used in a repeated and modular fashion. So, while you can assist accessibility by placing role="banner" into the <header> element closest to the document’s root, this does not preclude you from using <header> to introduce other sections:
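A minimal sketch of that arrangement follows. The titles and paragraph text are placeholders; the point is only where the landmark roles go and where they don't.

```html
<body>
  <!-- Page-level header, closest to the document root: carries the banner landmark -->
  <header role="banner">
    <h1>Site title</h1>
  </header>

  <article>
    <!-- A second <header>, scoped to this article: no landmark role -->
    <header>
      <h1>Article title</h1>
      <p>Introductory line for the article...</p>
    </header>
    <p>Article content...</p>
  </article>

  <!-- Page-level footer: carries the contentinfo landmark -->
  <footer role="contentinfo">
    <p>Site-wide footer content...</p>
  </footer>
</body>
```

The landmark roles appear once each, as shortcuts to the page-level regions, while <header> itself stays free to be reused inside any section that needs one.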
Are Sections The New <div>s?
This is a common misconception.
If it wasn’t clear already, it should be clear to you now that <div>s are semantically inert elements: elements that don’t really do or say anything. If this is clear, then it should also be clear that, when building a structured document, relying heavily on “an element of last resort” makes for a very poor foundation.
If the new <section> element, for example, were just <div> with a new name, adopting it would be a straightforward matter of search and replace. It wouldn’t exactly be progress, though. The truth is, <div> still has a rightful place in the spec’; we’ve just given its organizational responsibilities to a team of elements that are better qualified. Sorry, <div>, old mate. What do we use <div>s for, then? Precisely what they were good at from the beginning: as a tool for “stylistic applications… when extant meaningful elements have exhausted their purpose.”
For instance, you shouldn’t employ sections as box-model controlling measures like this…
<section class="outer">
  <section class="inner">
    <h1>Section title</h1>
  </section>
</section>
… because there’s nothing that the outer section does that the inner section doesn’t. We've created two sections for one piece of content. A quick run through our outliner throws the “Untitled Section” warning:
- [Untitled Section]
- Section title
The brilliance of <div> in this context is that it refuses to affect the outline, which is why we can use it without fear of reprisal. This…
<section>
  <div>
    <h1>Section title</h1>
  </div>
</section>
… averts disaster and results in this unsullied, if simplistic, outline:
- Section title
Sections And Semantics
A lot of developers have trouble with the word “semantic.” You might even say that they don’t know what the word means, which (if you are familiar with the term) makes an interesting paradox. For instance, when Jeffrey Zeldman advocates for the “semantic” application of the id attribute, he’s kind of missing the point. The main purpose of semantic HTML is for the automated extraction of meaning from content. Applying a private, non-standard id to a <div> would not improve the semantics of the element one iota: Visitors can’t see it and parsers will ignore it. So much for the semantic Web!
Sections are often characterized as the “semantic” equivalent of <div>. This is a half-truth at best, and I apologize for throwing the term “semantic” around so much; it’s become a bit of a shorthand. Some HTML elements are inherently semantic in that they prescribe specific meaning to their contents. The <address> element is a good example: When a parser reaches <address>, it knows that the contents should probably be interpreted as contact information. What it chooses to do with this knowledge is another matter, but it’s plausible that a screen reader could provide a shortcut to the address or a search engine could use it to refine its results pages.
Sectioning elements are not so much semantic as syntactic. All <section> tells us is that it is a part of a whole. However, the syntactic contribution of sectioning elements to document structure is not unimportant. Consider the following sentence: “If sections you don’t websites your are use obsolete.” A lot of recognizable words are in there, but the lack of sensible syntax makes the sentence difficult to unpick. So it is with sectioning: You are not creating meaning so much as assembling it. Meaning isn't always about the "thing"; it's sometimes about what that thing's role is amongst other things.
Microdata
Efficient, syntactically sound data structures are worthless if they are semantically lacking. Fortunately, HTML5 has both angles covered and provides a mechanism for attaching semantic metadata, called “microdata,” to our structured content. Using microdata, and by consulting schema.org, you can define a page’s content as anything from a scholarly article to an exercise regimen. Unlike classes and IDs, this is information that can actually be interpreted usefully.
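As a rough illustration, here is how an <article> might be annotated as a scholarly article using the schema.org vocabulary. The title, author name, date and summary are invented placeholders; the itemscope, itemtype and itemprop attributes are the microdata mechanism itself.

```html
<!-- itemscope starts a new item; itemtype says what kind of thing it is -->
<article itemscope itemtype="https://schema.org/ScholarlyArticle">
  <h1 itemprop="headline">Title of the paper</h1>
  <p>
    By
    <!-- A nested item: the author is described as a Person -->
    <span itemprop="author" itemscope itemtype="https://schema.org/Person">
      <span itemprop="name">Author Name</span>
    </span>
  </p>
  <p>
    Published
    <time itemprop="datePublished" datetime="2012-01-01">1 January 2012</time>
  </p>
  <p itemprop="description">A short summary of the paper...</p>
</article>
```

The sectioning element supplies the structure (this is a self-contained part of the page) and the microdata supplies the meaning (this part is a scholarly article with a headline, an author and a publication date).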
Conclusion
HTML isn’t just an SDK or a Graphic Designer's palette. It is a metalanguage, a language that tells you special information about information. Sometimes we — or, more precisely, the parsers we employ — benefit from added information about the subject, timing, origin or popularity of content. This is what APIs such as microdata and RDFa are for. Other times, the context, hierarchy, relative importance and codependence of the information are what need to be determined. This is where appropriate syntax, facilitated by sectioning elements, can be employed.
Some people will tell you not to bother with sectioning. They say that it’s hard work or that it doesn’t make sense. This is hokum. Sure, if you're lazy, don't bother with sectioning, but don't pretend you're doing it on principle. Using sections demonstrably enhances HTML structure without breaking accessibility. We've covered this.
Still, there will always be people who will attack this aspect of the specification. Perhaps we'll enjoy some of these objections in the comments:
- They will point to bad implementations by specific vendors:
  These are bugs and bugs get fixed!
- They will cite the actions of large websites who don't use sectioning elements:
  Just because large sites haven't implemented sections doesn't mean they wouldn't like to. Since when does big mean 'right' anyway?
- They will flood you with examples of developers implementing sections badly:
  Some developers do stupid things and their misuse of HTML doesn't stop at sections. I include myself here, by the way.
- They will present you with anecdotal evidence about user behavior within specific groups:
  It is expensive and impractical to address problems on a case-by-case basis. Fragmentation and complexity would also be inevitable: a loss for the majority of users.
I don’t think anyone would advocate making badly structured Web documents any more than they’d suggest building a house by stuffing a bag full of bricks and throwing it into a ravine. The case has been made and the specification bears it out: Sections aren’t just good for document structure — they finally make proper structure attainable. Some browsers and screen readers have some catching up to do, that’s for sure, but the situation is improving rapidly. Any kind of change is a little turbulent, but this kind is worth it.