It’s obvious, isn’t it? Any voice application is bound to be a winner. We all love being spoken to in leisure or learning moments. What is the easiest way in which to absorb information? Have it spoken to you. From the audio book to the sat nav machine, voice works. As humans, we can project so much onto a voice. Its “colour” gives instant clues, and even the road directions to Southend-on-Sea can become injected with implied threat or promise. And hearing things is restful, even absorbing. Having a novel read in one ear can be superbly engrossing, and while there is always the risk of being alienated by the reader’s interpretation, chances are that the audio book will be the way we “see” that text, once we have heard it, for ever. I have an old record of T S Eliot reading The Waste Land which I can no longer play because I have no equipment that will play it. So I naturally became an early user of the App, which has 9 readings of the poem, including one by the poet himself. Most of them are far better, but because I heard his first, when I read the poem aloud myself, I find that I use the poet’s cadence and timing. In other words, voice imprints and can be unforgettable.

Which brings me to Siri. The Apple iPhone voice App has now had three months of shrill publicity (http://www.transhumanistic.com/2011/10/new-iphone%E2%80%99s-killer-app-%E2%80%93-voice-controlled-personal-assistant/) and (http://www.youtube.com/watch?v=3uo5CUgEYKI&noredirect=1).

Given its ability with natural language searching, which gives it a degree of “intelligence”, reviewers think this should be a winner, and I agree on one level. On another I have some reservations, and these are largely concerned with our apparent inability to position and market voice services effectively.

Twenty years ago a senior executive at Random House told me that I was wasting my time with “Multimedia”, which was what we were then working on for CD-ROM. All the market wanted, he said, were good audio readings to play in the car on long distance travel, and he introduced me to his bright young manager who was providing just that. That manager told me two things that have stuck with me: one was the now obvious reflection that publishers were rubbish at marketing anything at all, and this would never change since they believed that they could sell anything. The second was that voice markets appeared to him to be finite: you quickly reached the voice susceptible segment, then growth got very hard. It is a thought that comes back as even Barnes and Noble discover digital (http://www.publishersweekly.com/pw/by-topic/industry-news/bookselling/article/49567-barnes–noble-sees-bright-future-in-digital.html). And who would have thought that would happen!

My young friend of then is now the manager of an important media venture fund, so I will preserve his anonymity. And I do not want to argue that eBook or digital versioning is similarly finite. But I do want to suggest that voice is a vital component of the network and thus of digital service provision, that we grossly neglect its impact in product and service development, and that but for two unfortunate voice misuse environments we would be using a great deal more of it in more intelligent environments. I am told, for example, that voice search is now a really easy application to roll out in many service contexts. However, the reason given for its relatively modest showing is the first of these misuses: the prevalence of hugely annoying telephone voice menu systems, which daily have reasonable people howling in frustration. Having discovered a rare four-tier example this week in a hospital group, I am tempted to initiate an award scheme for organizations that employ human beings to answer the phone. The second is automated public service messaging in airports and elsewhere. In both cases the problem is not voice but marketing. I even encountered an airport lounge in my October travels which announced, every five minutes, that no flight departure announcements would be made and that passengers should consult the information screens!

For all of these reasons the future of voice is vital. Siri may point the direction towards intelligent guidance, but completely voice-directed computing has been feasible for a long time and must be a part of the five-year scenario. And you do not need to have a Babelfish in your ear to believe in voice/language text translation, which the network is begging for in countless sectors and which is increasingly feasible at a basic level. Slowly we will edit out poor voice practices, and it will become as rare for web environments to lack audio components as it is for them now to lack video activity. I have recently had the pleasure of working with a group in Dublin who are creating virtual environments to help students pass tests of proficiency in spoken languages. There is an early example at http://www.examspeak.com but there is much more to come. The network is the ideal environment for voice-based training, language learning and virtual voice service development. Eventually the digital communications revolution will come full circle and re-integrate voice as the critical element in networked communications that it always has been, and we shall wonder why this component took so long to fall into place.

And then, we shall call the health insurer through the network and hear his computer say, “Forget all those options and numbers – tell me how I can help”!


Comments

1 Comment so far

  1. Phil Cotter on November 23, 2011 23:20

    David
    Like you I have wondered why voice hasn’t played a bigger part in the development of the digital network. As an early adopter of Orange’s (now defunct) Wildfire voice service which provided true hands free dialling in the car, albeit with the occasional linguistic idiosyncrasy, I have been surprised that voice continues to be underrepresented.

    Voice-activated operation of mobile devices seems to be an obvious application, not least because of the physical limitations imposed by including a QWERTY keyboard. It also seems inconceivable that the most natural means of communication between humans would be displaced in the digital future by text & video.

    I also see one further benefit from wider adoption of voice applications. Devices will be able to assume a personality and in doing so perhaps enrich the user experience even further. Wildfire had a rather coquettish personality, and I missed her when she was retired. Perhaps voice is the key to developing truly personalised connectivity.