What you need to know about voice-based marketing

We investigate why search marketing is still finding its voice and how voice-based customer engagement is evolving

When Google first demonstrated its Duplex service in mid-2018, it stunned the world by making relatively complex bookings over the telephone using a system that sounded very close to how a human might sound.

It also revived interest in a technology that has become both increasingly prevalent in consumer devices and an increasing source of frustration – the voice-based user interface.

The launch of Apple’s voice-based assistant, Siri, in 2011 spawned a revolution that now sees similar technology embedded in almost every smartphone, while the installed base for smart speakers such as those connecting to Amazon Alexa or Google Home is expected to reach 100 million globally this year.

And yet the technology remains stubbornly unsophisticated in its application, able to respond only to basic requests in a limited subset of services.

So, is voice poised to revolutionise human-to-machine communication or become the latest in a long line of technologies whose promise exceeded their capabilities?

Getting into voice early

It is a question some marketers have begun to ask for themselves. Many were present for the local launch of the Amazon Alexa service in early 2018, including Village Entertainment, which offers voice-based services via Alexa enabling customers to enquire about movie sessions, locations and other information and then receive a link to their phones to make a booking.

“We intuitively see voice playing a bigger role going forward with our service offering widening,” says Village Entertainment general manager for marketing and sales, Mohit Bhargava. “From fully voice-enabled ticketing services, where guests can make a transaction with us using voice, through to other applications such as customer service in venue. Our focus and investment in voice will evolve in line with overall market penetration of voice technologies.”

Chief marketing officer at Finder, Malini Sietaram, also believes integrating voice search into the customer experience is an absolute must for marketers who want to make a meaningful connection with their audience.

“There’s a big opportunity for marketers to monetise through voice search and to tap into localised or ‘near me’ searches, like if a customer is trying to find their nearest retailer, gym or supermarket,” Sietaram tells CMO. “According to Google, these localised searches are skyrocketing in volume and I definitely think this will be an exciting space for marketers to play.”

In January 2019, the company will launch its Finder assistant, which will connect users with the content and financial products they’re searching for and enable partners to quickly tap into voice.

There is certainly a strong potential audience of Australians waiting to tap into voice services. Along with smartphone penetration, Telsyte estimates 3 million Australian households will have a smart speaker by 2022, equating to 30 per cent household penetration.

And as a channel, it has some interesting attributes that could make it increasingly attractive to brands. A recent study by Publicis Media demonstrated a significant memory effect and heightened physiological responses when people interacted with smart speakers. The study found voice delivered nearly twice the unaided brand recall of television and was on par with native mobile. Voice also stood out as one of the best experiences compared to TV and native mobile, having been found to be more engaging, fun, helpful, useful and informative, and less boring.

But for most Australian brands, it remains a peripheral issue.

“It is still something on the periphery even though there is much higher prevalence of access to voice platforms,” says principal consultant at software consulting firm ThoughtWorks, Ian Kelsall. “It is identified as an add-on at the end, and not necessarily considered deeply.”

ThoughtWorks is, however, working with several clients to explore the long-term prospects for voice and where it might provide a contextual benefit, including US-based Sonic Restaurants.

“On the way to the drive-through, is voice an appropriate place to perform that interaction of making your order ahead of you getting there?” Kelsall asks.

He suspects the reason for tepid enthusiasm is that to date the reality of voice interaction has rarely matched the excitement of the demonstrations.

“One of the challenges of voice is when a user asks a question there isn’t a lot of context for that voice AI to draw on unless the consumer is being specific about their question,” Kelsall says. “Most voice interfaces at the moment, although they are framed as a conversational UI, are not really conversations - you ask for something and it tells you an answer.”

The implications for search

That attribute does make voice technology ideal for use as a search mechanism. ComScore predicted in 2017 that by 2020, 50 per cent of all searches will be voice searches.

Head of evangelism for search at Microsoft, Christi Olson, says voice is providing a new modality to existing search behaviour, with searchers asking questions and looking for timely, relevant answers they can trust. However, she says voice queries tend to be longer and more conversational in nature.

“The average length of a text search query is between two to four words in length, whereas for voice search the average query is between three to seven words in length,” Olson says.

There are also core characteristics to the anatomy of voice search queries that make them more personal and conversational.

“First, voice queries tend to include question words,” Olson says. “These question words give us some insights in searcher intent and potentially insight into the purchase journey.

“Second, voice queries tend to be spoken in the first person, and they tend to include permissive modal verbs and verbs that reflect a digital assistant’s assistive nature.”

Olson gives examples such as “Hey Cortana, can you send Roy a text message telling him…” or “Hey Cortana, how do I make chicken parmesan?”. She believes the sophistication of what people will be able to request should evolve quickly as technologies come together.

“Where the evolution of search and conversational AI is converging is in the conversational AI technology enabling searchers to ask questions and not only get answers to their questions, but for voice skills and chatbots to be able to engage with the digital assistants to enable consumers to take action,” Olson says.

“Based on research from Bing’s The Consumer Adoption of Voice Assistance and Voice Technology research, there is a gap today in consumers’ ability to use voice technologies for complex tasks that involve multiple steps, such as making an appointment and potentially ordering food from their favourite restaurants. This is where voice skills and chatbots can assist. Cortana can engage with a voice skill or chatbot that’s been created on an open framework.”

Right now, however, there appears to be a strong and growing audience of consumers using their voice for search. The head of global marketing at SEMrush, Olga Andrienko, believes up to 30 per cent of searches are already conducted by voice.

“The commands through voice search have a precise intent and are a lot more action-driven,” Andrienko says. “Even if the query is informational, it has a clear intent in mind and the user expects an immediate response.

“Voice search is about questions, prepositions, and comparisons. If it’s not an informational query, people are likely to search for location-based info.”

This leads to a natural alignment between voice search and Google’s Featured Snippets (also known as position zero), which sit at the top of search rankings in those instances where Google has determined the source provides a definitive response to a query.

Andrienko cites a Backlinko study which claims that 40.7 per cent of voice search answers come from the featured snippet, and hence SEMrush’s SEO clients are putting more emphasis on getting into this feature.

She also believes voice will play a bigger role in Google’s overall offering, as demonstrated through its decision to voice-enable the API in Google Pay, and the introduction of audio ads in DoubleClick.

The back-end capability

A critical requirement for any brand that wants to play in the voice space is the implementation of the Speakable Schema Mark-up. This is the structured data within a website’s HTML source code that provides context and guidance to the search engines and digital assistants as to what the suitable spoken response is for content on their website.
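As a rough illustration of what that mark-up can look like, the sketch below uses Python to build a schema.org `speakable` block as JSON-LD and wrap it in a script tag for a page's HTML. The page name, URL and CSS selectors are hypothetical placeholders rather than anything drawn from the brands quoted here, and the helper functions are illustrative only. Whether a given assistant actually reads the marked sections aloud depends on that platform's own eligibility rules.

```python
import json


def build_speakable_jsonld(page_name, page_url, css_selectors):
    """Return a schema.org WebPage description with a speakable section.

    css_selectors lists the page elements considered suitable for
    text-to-speech (hypothetical class names in this sketch).
    """
    return {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": page_name,
        "url": page_url,
        "speakable": {
            "@type": "SpeakableSpecification",
            "cssSelector": list(css_selectors),
        },
    }


def to_script_tag(data):
    """Serialise the structured data as a JSON-LD script tag."""
    # Escape "</" so a stray closing tag inside a string cannot end the script early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    # Placeholder values, assumed purely for illustration.
    markup = build_speakable_jsonld(
        "Example recipe page",
        "https://www.example.com/recipes/chicken-parmesan",
        [".headline", ".summary"],
    )
    print(to_script_tag(markup))
```

The resulting tag would typically sit in the page's head, with the selectors pointing at short, self-contained passages such as a headline and summary that still make sense when heard rather than read.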

“A Google device will only read you the recipe if the website has implemented the structured data mark-up,” Andrienko says. “Also, audio SEO for optimising podcasts for search is about to pick up. For many businesses who monetise with audio content, that will double their audiences with proper optimisation.

“And what it all means in the future is brands might need to create a voice version for their websites, where everything could be easily heard and understood without the user looking at the screen at all.”

That could still be some way off, however, as there is a long way to go before even basic voice search becomes the norm.

While the implementation of speakable schema mark-up seems essential, a bigger challenge for many marketers may lie within their overall content proposition, as the natural affinity between voice search and the Featured Snippets box means brands will need to spend more effort ensuring they are represented within it.

“In order to get that zero rank you need to write a lot of content,” says founder and head of growth at the online marketing agency, King Kong, Sabri Suby. “You are never going to rank in that zero-rank position for a service page on your website - it needs to be a very detailed blog post that answers a very specific problem.

“People are getting more specific and using longer keyword strings. That means the content you are writing to be able to feed the information back through these interfaces needs to be more specific, because Google is looking to serve up the most accurate and relevant search results. So as a brand you need to be bullishly moving into content marketing and answering all of the specific questions that people have in your marketplace.”

Hence Suby says any brand that doesn’t have content as part of its strategy moving forward is likely to be left behind.

“The more you do that, the more land you will be able to grab during this time as people optimise their website for search engines but also for voice search,” Suby says. “And it is important right now, but in this next wave of voice it is going to be even more important.”

The key question then is how long brands can afford to wait before they start to invest for a possible voice-based future.

“We are really on the cusp of what’s to come and people don’t really understand what the meaning and the severity that is going to have for how people find and discover brands,” Suby says. “We are yet to see the impact that will have, but over the next three years is where we are going to see it take off.

“So it is not a conversation that is coming up all that often to be honest, but moving forward it is going to be an increasingly important conversation to have.”     

While use cases for voice remain relatively limited, OC&C Strategy Consultants has predicted US and UK voice commerce will grow from US$2 billion to more than US$40 billion by 2022.
