I’ve been studying HTML since 1999. Like everything else, it has grown and advanced over time, and I’m still learning new things you can do with it. Unfortunately, it tends to be one of the under-appreciated parts of web development, especially among those coming from a backend background. I want to explain some of the major reasons why it is worth every developer’s time to fully understand it in order to build a more effective front-end.
- The more DOM nodes you have, the larger the HTML document and the longer it takes to download that page from the server. HTML usually isn’t responsible for the major chunk of page load time (that award tends to go to images and scripts), but every little bit counts!
- The browser has to match CSS selectors against the entire DOM to figure out where it needs to apply the styles, and of course CSS performance in itself is a huge topic we won’t get into. The point is that the fewer unnecessary HTML elements there are on the page, the less work the stylesheets create, which results in better performance.
Every time I am working heavily on the HTML of a particular page for the first time, the first thing I’ll do is go through and rip out all the superfluous elements. I clean house, so to speak. So now you’re probably asking, “What are all these superfluous elements you keep talking about?” Basically, I’m talking about any element that does not need to be there for the look of the webpage to remain the same. The biggest offenders are pretty easy to spot.
Here’s what to keep an eye out for:
- <div>s and <span>s that have no classes, IDs, or attributes
There might be a good reason to have a <div> as a container for a collection of elements because of its inherent block properties, or maybe you have styling on the parent element that targets all of its children, but more often than not, if I see a bunch of plain <div>s being nested, they are serving absolutely no purpose whatsoever. Along the same lines, those plain <span>s are usually a red flag. The entire purpose of a <span> is to be able to target something small that you want to style differently than everything else around it.
As the <span> page on MDN says, “It should be used only when no other semantic element is appropriate,” and if you look at that page, you’ll see that both examples show it being added unnecessarily. The first is unnecessary because it is unstyled and does absolutely nothing. The second is unnecessary because the background-color can be put on the <li> element. A much better use case is a single word in a paragraph wrapped in a <span> with a class that makes that one word blue, while the rest of the text in the paragraph is a different color.
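As a minimal sketch of that use case (the class name highlight-blue and the colors are made up for illustration):

```html
<style>
  /* Base color for the paragraph text */
  p { color: #333; }
  /* Hypothetical class: recolors only the wrapped word */
  .highlight-blue { color: blue; }
</style>

<p>The sky is <span class="highlight-blue">blue</span> today.</p>
```

Here the <span> earns its place: without it there would be no way to target that one word.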
Another thing to look out for is multiple <br/> tags in a row. This is usually a sign that the text is not in the correct element. There are other elements that might be hanging out for no reason, too, but I’m focusing on these because they’re the two biggest offenders that I have come across.
I tend to spend a lot of time in the browser’s inspector, and sometimes I’ll see multiple elements that contain nothing. Not elements like the ones in the previous section, but elements that look like they should actually be doing something, because they have IDs, classes, and/or attributes.
So I immediately wonder why they’re there and go look up the code, and it’s almost always the result of some conditional: the wrapper element sits outside the condition, so it gets rendered even when there is nothing to put inside it.
Or, as I tend to see in React code, there won’t even be a conditional; there will be a variable whose value happens to be blank. You see nothing on the page, but these nodes are still clogging up the DOM for no reason. As with anything, there are perfectly good use cases for placeholder elements in the DOM, for example when you know you’re going to load something there dynamically. But if that’s not the case, I encourage you to place the HTML elements inside the conditional, and if there is no conditional, add one!
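To make that concrete, this is roughly what such leftovers look like in the inspector. The IDs and class names are hypothetical:

```html
<!-- Rendered output when the underlying values happen to be blank -->
<div id="promo-banner" class="banner"></div>
<p class="error-message"></p>
<ul class="notifications"></ul>
```

If the template only emitted these elements when there was something to put in them, none of the three nodes would exist in the DOM.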
Using the right tool for the job
Sometimes you can reduce the number of nodes by simply using the proper HTML element, which is why it’s important to be familiar with them and what they do. Every HTML element has inherent properties and semantic value: there are block-level elements, inline elements, and elements that carry more meaning than others (semantics). The elements behave in different ways, and if you aren’t using them for what they were meant for, you might need extra elements and more styling than would otherwise have been necessary. A really extreme example would be putting a <div> on a page with some text that you want to click on in order to visit another page, adding an onClick handler to that <div>, and then making an AJAX call or calling window.location.assign("/link.html"). You’re probably not going to do that, right? Why go to all that work when you can simply use a normal <a> tag for your link, which has all that functionality built into it?
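Side by side, the two approaches might look like this. The link text is a placeholder, and the inline handler is only one of several ways someone might wire up the <div>:

```html
<!-- Hard way: needs script, is not reachable with the Tab key,
     and is not announced as a link -->
<div onclick="window.location.assign('/link.html')">Visit the page</div>

<!-- Easy way: navigation, keyboard focus, and Enter support are built in -->
<a href="/link.html">Visit the page</a>
```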
Some less extreme but common examples are cases like the multiple <br/> tags I mentioned above. If you are trying to put extra space between two blocks of text, then you probably aren’t using paragraph tags and should be. If you are separating out chunks of text within a single paragraph tag, then you aren’t using enough paragraph tags. Paragraph tags are block-level elements that inherently have spacing above and below them.
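A small before-and-after sketch, with placeholder text:

```html
<!-- Spacing faked with line breaks -->
First block of text.
<br/><br/>
Second block of text.

<!-- Similar visual result using the element meant for the job -->
<p>First block of text.</p>
<p>Second block of text.</p>
```

The second version also gives you something to target in CSS if the default paragraph spacing is not what you want.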
Another example would be using CSS to turn a <span> into a block-level element instead of just using a <div>, or adding a bunch of CSS/JS to make a <div> act like a button instead of just using the <button> tag. You get a lot of functionality for free when you use the proper element. Buttons are inherently tabbable, they can use the disabled attribute, and they also have HTML5 attributes such as formaction and formnovalidate that control how the surrounding form is submitted.
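To illustrate how much a <div> has to borrow to imitate a button, here is a rough comparison. The save() function and the btn class are hypothetical and exist only to make the snippet complete:

```html
<script>
  // Hypothetical handler, defined only so the snippet is self-contained
  function save() { console.log("saved"); }
</script>

<!-- Imitation: needs a role and tabindex, and still lacks Enter/Space
     handling until more script is added -->
<div class="btn" role="button" tabindex="0" onclick="save()">Save</div>

<!-- Real thing: focusable and keyboard-activated by default -->
<button type="button" onclick="save()">Save</button>

<!-- Disabling it takes a single attribute -->
<button type="button" disabled>Save</button>
```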
In general, I encourage you to explore all the different attributes that can be used on a particular element, because that’s how you learn just how powerful HTML can be, which brings me to the next special mention: HTML5.
You’ve probably seen “You Might Not Need jQuery” and “You Might Not Need JS”, but I’m here to tell you that you might not need anything but HTML. If you’re an old-school developer, you might not be as familiar with all the fantastic things you can do with plain old HTML elements, and I’m always surprised that HTML5 isn’t used more, given that it’s pretty old now. One of the areas where it really shines is forms.
Say you have an input field and you only want the user to be able to enter numbers, so you have a backend validation to ensure that the value is numeric when it hits the server. Backend validations are important; they are the last line of defense before the data hits the database, but why spend the time traveling all the way to the server in the first place? So you decide to implement some client-side validation to catch it without making a server call. That’s pretty easy to do using $.isNumeric(value), but if you’re doing that, you’re working way too hard. All you have to do is tell your field what type it should be: <input type="number">, and the browser will refuse or flag anything that isn’t a number. “But what if I want to make sure they only enter positive numbers? Or limit it?” you might ask. You still don’t need JS. Other attributes you can use on that field are min, max, and step.
Date fields are a similar story. <input type="date"> gives you a native date picker in browsers that support it. Where it isn’t supported (Safari and Internet Explorer at the time of writing), it sadly falls back to a normal text input field, so plugins aren’t completely obsolete in that case, but it can fit a lot of use cases. Another fun one is <input type="color"> for letting users pick a color from a pop-up palette. I encourage you to explore all the different input types and their attributes.
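A minimal form sketch pulling those input types together. The field names, labels, and limits are invented for the example:

```html
<form>
  <label>
    Quantity
    <!-- min/max constrain the range; step sets the allowed increment -->
    <input type="number" name="quantity" min="1" max="10" step="1">
  </label>

  <label>
    Delivery date
    <!-- Native date picker where supported, plain text field otherwise -->
    <input type="date" name="delivery">
  </label>

  <label>
    Accent color
    <input type="color" name="accent" value="#3366cc">
  </label>

  <button type="submit">Order</button>
</form>
```

A supporting browser will block submission while the quantity is outside the 1 to 10 range, all without a line of script. The backend validation still stays in place as the last line of defense.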
Another very important reason to understand HTML and its proper uses? Accessibility. I won’t elaborate on all the reasons why, because there are countless articles and books that will do that much better, but I will sum it up by saying that an accessible website enhances the user experience for everyone. I’m also not going to talk about all the techniques to make your site accessible, because that’s a huge topic all its own. What I want to focus on, in the spirit of this article, is how using proper HTML elements (the right tool for the job, again) will already make your webpage much more accessible than it would be otherwise.
Having also studied print design and typography, I suppose my love of clean layouts and meaningful structure has carried over to web development. So I was pretty excited about the increase in semantic tags when HTML5 came around. Providing more meaning to the elements not only helps accessibility but is an all-around win, in my opinion. Long gone are the days where <div>s ruled the lands. Now we have <aside>, and so on. Even before HTML5, though, there were a large number of tags that weren’t being used when they should have been.
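As one possible page skeleton built from semantic elements instead of nested <div>s (the choice of elements beyond <aside> and all of the content are my own illustration):

```html
<header><h1>Site title</h1></header>
<nav>
  <a href="/">Home</a>
  <a href="/about.html">About</a>
</nav>
<main>
  <article>
    <h2>Post title</h2>
    <p>Main content goes here.</p>
  </article>
  <!-- Content only loosely related to the article next to it -->
  <aside>Related links</aside>
</main>
<footer>Contact details</footer>
```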
Let’s take tables, for example. Tables no longer rule the web either, but they still have a good purpose and lots of associated tags that go with them. Yet somehow I’ll see table after table only make use of <td>, oftentimes with the first table row trying to act as headers. If a table has headers, it should at the very least use <th> for the header cells, and ideally <thead> as well. The browser will treat these differently than just another table row, and semantically it means something.
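A small table marked up that way might look like the following. The data is made up, and the scope attribute is an optional extra that tells assistive technology each header applies to its column:

```html
<table>
  <thead>
    <tr>
      <th scope="col">Name</th>
      <th scope="col">Role</th>
    </tr>
  </thead>
  <tbody>
    <tr><td>Ada</td><td>Developer</td></tr>
    <tr><td>Grace</td><td>Designer</td></tr>
  </tbody>
</table>
```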
I still see <b> used more often than <strong>, probably because old habits die hard. The former still has its uses, but the latter should always be used when you’re trying to mark something as genuinely important; not for styling reasons, but because of the meaning it conveys.
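A quick sketch of the distinction, using invented sentences:

```html
<!-- Importance: the warning matters to the meaning of the sentence -->
<p><strong>Warning:</strong> this action cannot be undone.</p>

<!-- Stylistic offset with no extra importance, such as a product name -->
<p>The <b>Acme 3000</b> arrived in a plain box.</p>
```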
How proper HTML helps keyboard accessibility
All sites should be fully accessible by keyboard, and there are certain keys that users expect to be able to use to do this, because it’s the standard. Most people know they can tab through form elements and tend to do it without thinking about it. This is another case where you want to use the right tool for the job, because the exact kinds of things on the page that the user would want to navigate to are tabbable by default, without you having to do anything. This is why structure is important, and this is why the right element is important. Links, buttons, and input fields all have that functionality built in. This is why you don’t want to use a <div> in place of a <button>, or a link that doesn’t actually use an <a> tag. It is also why the order of elements matters, because tabbing follows the DOM structure, not the visual structure on the page. So if you’re floating <button>s and they appear in a different order on the page because of it, that’s going to confuse the user when the element they expect to be selected does not actually get selected.
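Here is a small sketch of how floats can pull the visual order away from the tab order. The actions class and button labels are hypothetical:

```html
<style>
  /* Every button in the group is floated to the right edge */
  .actions button { float: right; }
</style>

<!-- DOM and tab order: Cancel, then Save.
     Visual order, left to right: Save, then Cancel, because the first
     floated button goes furthest right and the next one sits to its left. -->
<div class="actions">
  <button type="button">Cancel</button>
  <button type="button">Save</button>
</div>
```

A keyboard user tabbing through this group lands on the right-hand button first, which is the opposite of what the layout suggests.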
Sometimes that’s not going to cut it. You might have a link menu hidden as a dropdown, and then you’re going to have to make that parent element open and close when tabbed to, but by being mindful of using the right elements for the job, you’re still halfway there.
Links also accept the “Enter” key by default. So if you are focused on a link and hit enter, it will go to that link. In forms, if you are focused on any input field within a form and hit enter, unless there is underlying code telling it to not behave like normal (here’s a nice article about why you should rarely suppress this), it will automatically submit the form.
How proper HTML helps screen-readers
There are a large number of people in the world with visual impairments who rely on screen-readers, and there is no feature that you as a developer turn on for this to work. A screen-reader can read what’s on any website, but how much of that is coherent depends on what you as a developer have put there. This is why the aforementioned semantics are important, but there are also some basic HTML attributes that aid screen-readers. One that most people know about is the alt attribute on images (alternative text), and yet it is so neglected. It’s often left blank when it shouldn’t be, filled in when it shouldn’t be, or filled with information that doesn’t describe the image at all, rendering it useless. Let me hit on each of those points:
- If it is important that a particular image is conveyed to the user (a graph, a visual representation of something being described, etc.), then you absolutely want an alt attribute.
- If the image is completely unimportant, like a visual flourish, or an icon to represent text that comes immediately before or after it, then you do not want alt text. Why? You’re not only throwing a mess of unimportant information at the user, but in many cases you’re just making the screen-reader repeat the same word twice, which is confusing. This is also where ARIA roles come into play, but the most basic thing you can do in this situation is to intentionally specify a blank alt="" attribute. If you don’t put in any alt attribute at all, the screen-reader/browser will just guess, for example by reading the file name.
- If you have an image of the user and your alt text is “User”, or it’s the logo of your company and it just says “Logo” or “Company Name”, you are not helping anyone. At the very least be a little more descriptive: “Your profile picture”, “Company Name’s logo”, but even better, be really descriptive if it’s an important visual. “Image of a brown horse with black hair” helps you imagine something much more than “Pic of horse”.
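The three cases could be sketched like this. The file names are placeholders:

```html
<!-- Important image: describe what it actually shows -->
<img src="horse.jpg" alt="A brown horse with black hair">

<!-- Decorative icon next to text that already says the same thing:
     an intentionally empty alt tells screen-readers to skip it -->
<img src="icon-download.svg" alt=""> Download

<!-- Too vague to help anyone -->
<img src="profile.jpg" alt="User">
```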
Another common misuse of HTML I come across is not labeling input fields. All input fields should have an associated label element, and this is not accomplished by just having text near an input field. It may look fine to you visually and you might be able to deduce that they go together, but screen-readers won’t. Screen-readers expect that any input field that is selected should be able to tell them what that input field is for. There are two main ways to make sure your fields are labeled properly. The first is to simply wrap your input field in the <label> element.
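A minimal sketch of the wrapping approach, with a hypothetical email field:

```html
<!-- The input is nested inside the label, so the two are associated -->
<label>
  Email address
  <input type="email" name="email">
</label>
```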
I know some developers might not like that method because they have special styling on all <label>s that they don’t want on the <input> field, and that’s fair enough. You just have to get explicit, and this is where I see the HTML fall way short.
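The pattern that falls short typically looks something like this (field name again hypothetical):

```html
<!-- Not associated: the label has no for attribute
     and does not wrap the input -->
<label>Email address</label>
<input type="email" name="email" id="email">
```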
Simply placing a <label> next to an <input> does not associate the label with the input field. Proximity does not equal association. This is where the for attribute comes in.
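A sketch of the explicit version, using the same hypothetical field:

```html
<!-- Associated: the label's for value matches the input's id exactly -->
<label for="email">Email address</label>
<input type="email" name="email" id="email">
```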
The association works by matching up the for attribute on the label with the id of the field. If your for attribute and id attribute don’t match up, then it does not work (I see this happen a lot). Sometimes you might not want the label there visually, or you might have an extra field associated with that same label. No problem: you can either hide the label visually, or add an ARIA attribute to the input field that specifies the labeling, but never just leave a field label-less. Placeholders are not a substitute, because screen-readers do not reliably read them. I should also note that if you want your form elements to be fully accessible for all screen-readers, it’s best to cover your bases and have both the for attribute AND nest your input inside the <label>.
An added benefit of labeling inputs properly: they’re easier to select and click, which makes them more accessible! This is especially true of checkboxes and radio buttons, which have a very small click area, because you can click on the label text to select the element. Also note that for radio buttons, you should use the <fieldset> tag to group the radio inputs together, where the <legend> serves as the parent label while each input still has its own.
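Putting those pieces together, a radio group might be sketched as follows. The question, names, and ids are invented. Each label both wraps its input and points at it with for, to cover both association methods:

```html
<fieldset>
  <!-- The legend acts as the label for the whole group -->
  <legend>Preferred contact method</legend>

  <label for="contact-email">
    <input type="radio" name="contact" id="contact-email" value="email">
    Email
  </label>

  <label for="contact-phone">
    <input type="radio" name="contact" id="contact-phone" value="phone">
    Phone
  </label>
</fieldset>
```

Clicking the words "Email" or "Phone" selects the matching radio button, which is a much bigger target than the button itself.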
These are things that should be considered the basics of HTML: making use of the built-in attributes and structure to better serve a diverse user base.
You don’t have to work as hard
If nothing else, understanding HTML should be encouraged because it makes your work easier. It can reduce the need for both styling and scripting, and it makes the world wide web a happier place.
- HTML: HyperText Markup Language
- DOM: Document Object Model
- ARIA: Accessible Rich Internet Applications