Adding speakable markup to your website can let voice assistants deliver the content using text-to-speech. This article covers three different ways to use the speakable specification.
What is the speakable specification?
The structure for indicating that certain parts of a website are especially appropriate for text-to-speech comes from Schema.org. Though Schema.org was founded by Google, Microsoft, Yahoo and Yandex, its vocabularies are developed through an open process. Most of the major search engines, including Google, use these vocabularies to create richer experiences.
The Schema.org vocabulary defines over 600 types and 900 properties. But in this post we are looking specifically at the speakable property and the SpeakableSpecification type.
From the definition of the speakable property:
Indicates sections of a Web page that are particularly 'speakable' in the sense of being highlighted as being especially appropriate for text-to-speech conversion. Other sections of a page may also be usefully spoken in particular circumstances; the 'speakable' property serves to indicate the parts most likely to be generally useful for speech.
Right now "speakable" is the best standard for publishing web content designed for use by voice assistants like Alexa and Google Assistant. In fact, Google already publishes guidelines for how Google Home reads speakable news. For the rest of this post we'll assume you already understand the benefits of publishing voice-optimized content and just want to learn how to actually do it!
How do I make content speakable?
To mark content as speakable you need to add structured data to your web page that instructs search engines and applications where to look. There are multiple formats of structured data including JSON-LD, Microdata, and RDFa. We recommend using JSON-LD because it is preferred by Google.
Speakable is a little different from most other JSON-LD structured data. In most cases structured data is in a key-value form. For example, a Person type could have an email property. You would define this as a plain key-value pair, with email as the key and the address as the value.
But with speakable you instead point to content that lives elsewhere on the page. There are three different ways to define that location: the ID of an HTML element, a CSS selector, or an XPath expression.
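A minimal sketch of that key-value form in JSON-LD, using a made-up name and address purely for illustration:

```json
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Jane Doe",
  "email": "jane@example.com"
}
```

On a web page, an object like this would normally sit inside a script element with the type application/ld+json.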
Speakable with HTML element IDs
Using HTML element IDs is actually the simplest way to set up the structured data. For some reason it's rarely shown in examples.
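As a rough sketch, assuming the page contains elements with the hypothetical IDs intro and summary, the speakable property can reference them as fragment URLs. This follows what the Schema.org definition calls id-value URL references; the page name and IDs here are placeholders, not values from any real site:

```json
{
  "@context": "https://schema.org",
  "@type": "WebPage",
  "name": "Example article",
  "speakable": ["#intro", "#summary"]
}
```

With this form there is no SpeakableSpecification object at all. The values are simply URLs that resolve to identified sections of the same page.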
Speakable with CSS selectors
To handle more complex use cases, the SpeakableSpecification type defines a way to use CSS selectors to identify the speakable content. We find this method handy since many web developers already understand CSS selectors well.
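A minimal sketch of the CSS selector form, assuming the page marks its speakable content with the hypothetical classes speakable-headline and speakable-summary:

```json
{
  "@context": "https://schema.org",
  "@type": "WebPage",
  "name": "Example article",
  "speakable": {
    "@type": "SpeakableSpecification",
    "cssSelector": [".speakable-headline", ".speakable-summary"]
  }
}
```

Any valid selector should work in principle, but short class-based selectors like these tend to survive template changes better than long descendant chains.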
<head> <title>This is the speakable title</title> <meta name="description" content="This is the speakable description"> </head>
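The third option, XPath, can address a title and description like the ones in the markup above. The sketch below assumes the description text is stored in the meta element's content attribute, as valid HTML requires, and uses a placeholder page name:

```json
{
  "@context": "https://schema.org",
  "@type": "WebPage",
  "name": "Example article",
  "speakable": {
    "@type": "SpeakableSpecification",
    "xpath": [
      "/html/head/title",
      "/html/head/meta[@name='description']/@content"
    ]
  }
}
```

Schema.org names this property xpath. It is worth checking the exact spelling expected by the assistant or search engine you are targeting before publishing.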
The easiest way to add speakable content
This post covers the technical details of adding speakable content. If any of this looks daunting, we have you covered. Soundcheck has a WordPress plugin that lets you add speakable content easily. The plugin handles the structured data. You just have to click to add a speakable block and write 2-4 lines of content.