Using Hexadecimal Entities to Display Special Characters, or How A&W broke your ebook

Any type designer worth their salt knows the difference between "this" and “this,” and why a dash is fine for 9-5, but when you want a little drama—there’s nothing like an em-dash. But do they know the raison d’être of HTML entities, or why, when Blimpy orders a Papa Burger in the fifth chapter, the file won’t validate?

When to use entities

Using entities in your code allows you to gain access to a world of Unicode characters outside of the seven-bit ASCII available when coding plain-text XHTML, including international characters and design characters like smart quotes and em- and en-dashes. Entities also help you make clear what the code might confuse: if you have an ampersand in your text, a browser or ereader may interpret it as code—the opening of an entity, ironically. To make it clear when an & is only an &, use the entity &#x26; in your HTML.
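If you'd like to hunt down stray ampersands before a validator does, a short script can flag them. This is only a rough Python sketch: the name `find_bare_ampersands` and the pattern are our own assumptions, and the pattern only checks that an ampersand is followed by something shaped like an entity, not that a named entity actually exists.

```python
import re

# An ampersand that is NOT followed by a complete entity:
# hexadecimal (&#x26;), decimal (&#38;) or named (&amp;).
BARE_AMPERSAND = re.compile(
    r"&(?!#[xX][0-9A-Fa-f]+;|#[0-9]+;|[A-Za-z][A-Za-z0-9]*;)"
)


def find_bare_ampersands(markup: str) -> list:
    """Return the character offsets of ampersands that do not open an entity."""
    return [match.start() for match in BARE_AMPERSAND.finditer(markup)]


# Hypothetical line from chapter five: the "&" in A&W is reported.
print(find_bare_ampersands("<p>Blimpy orders a Papa Burger at the A&W.</p>"))
```

An empty list means every ampersand in the string already opens something that looks like an entity.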

If you want unusual white space in your text, sprinkle some &#xA0; in your code (that’s the entity for a non-breaking space). Ereaders will know exactly          what          to          do.

(We dropped ten non-breaking spaces between each word.)

How to use entities

As the computer scientist Andy Tanenbaum once said, “The nice thing about standards is that you have so many to choose from.” There are a number of ways to encode characters, but we’re focusing here on hexadecimal codes as a stable method for ereading devices.

Hexadecimal entities take the form &#x__;, replacing the underscores with the unique hexadecimal number assigned to the Unicode character you’re after. Using them is a piece of cake. Just drop the correct entity into your HTML code where you want the character to appear in the text. The code for an ampersand is &#x26;, so the sentence “He sauntered up to the cashier at the A&W.” looks like this in your HTML file:

He sauntered up to the cashier at the A&#x26;W.
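If you'd rather not type entities by hand, the substitution can be scripted. Here is a minimal Python sketch; the function name `to_hex_entities` is our own, and it assumes you pass it a run of plain text with no tags and no existing entities (it would happily re-encode the ampersand at the start of an entity that's already there).

```python
def to_hex_entities(text: str) -> str:
    """Replace ampersands and non-ASCII characters with hexadecimal entities."""
    pieces = []
    for ch in text:
        if ch == "&" or ord(ch) > 0x7F:
            # {:X} writes the character's Unicode number in uppercase hexadecimal
            pieces.append("&#x{:X};".format(ord(ch)))
        else:
            # Seven-bit ASCII other than "&" passes through untouched
            pieces.append(ch)
    return "".join(pieces)


print(to_hex_entities("He sauntered up to the cashier at the A&W."))
# He sauntered up to the cashier at the A&#x26;W.
```

The same function turns a curly opening quote into &#x201C; and an em-dash into &#x2014;, since both sit outside seven-bit ASCII.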

Rather than defaulting to dashes and straight quotes, we’ve used entities throughout this article. Feel free to have a look at the source code to see entities in action.

A handy guide to entities for Canadian books

For reference, we’ve pulled together a chart of the Unicode entities we think you’ll use most often in your books.

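‘ &#x2018;    ’ &#x2019;
“ &#x201C;    ” &#x201D;
– &#x2013;    — &#x2014;
© &#xA9;    & &#x26;
« &#xAB;    » &#xBB;
À &#xC0;    Œ &#x152;    ê &#xEA;
Â &#xC2;    Ù &#xD9;    ë &#xEB;
Ä &#xC4;    Û &#xDB;    î &#xEE;
Ç &#xC7;    Ü &#xDC;    ï &#xEF;
È &#xC8;    Ÿ &#x178;    ô &#xF4;
É &#xC9;    à &#xE0;    œ &#x153;
Ê &#xCA;    â &#xE2;    ù &#xF9;
Ë &#xCB;    ä &#xE4;    û &#xFB;
Î &#xCE;    ç &#xE7;    ü &#xFC;
Ï &#xCF;    è &#xE8;    ÿ &#xFF;
Ô &#xD4;    é &#xE9;
Non-breaking space &#xA0;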

Find a complete listing of Unicode characters, and their hexadecimal numbers, at http://unicode.org/charts.

Don’t forget to test everywhere

The entities above are probably pretty safe bets, but remember when using entities: just because a character has a Unicode number doesn’t mean all ereaders will display it. The question of which characters to support is left to the discretion of device manufacturers, so if you’re using a character you’re uncertain of, don’t forget to test your ebook on every device you can access.
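To know what to look for on each device, it helps to have a list of every special character your book actually uses. The following Python sketch is one way to build that checklist; `characters_to_test` is a name we made up, and it assumes the text has already been decoded to real characters (if your file is full of entities, Python's `html.unescape` can turn them back into characters first).

```python
import unicodedata


def characters_to_test(text: str) -> list:
    """List each distinct non-ASCII character with its hex entity and Unicode name."""
    report = []
    for ch in sorted(set(text)):
        if ord(ch) > 0x7F:
            entity = "&#x{:X};".format(ord(ch))
            # Fall back to a placeholder for characters without an official name
            name = unicodedata.name(ch, "UNKNOWN")
            report.append((ch, entity, name))
    return report


# Hypothetical sample text; in practice, read in a chapter of your book.
for ch, entity, name in characters_to_test("Caf\u00e9 cr\u00e8me \u2014 \u0153uvre"):
    print(entity, name)
```

Each line of output is one character to find and eyeball on every ereader you can get your hands on.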
