The spoken web: Can India get the next chapter of the internet right?

Here is a romantic idea: imagine an internet where there is no text, no pictures, nothing to click on, only sound. Where the barriers of language disappear and ordinary people – including those without English literacy, speaking Hindi or any Indian dialect into their mobile phones – can simply use their voices to unlock trustworthy information. This is the idea of the “spoken web,” and it has long had a particular appeal in India.

The term “spoken web” seems to have been coined at the Massachusetts Institute of Technology in the early 1990s, but it was here in India that it found the most admirers. Ten years ago, it was India-based researchers at IBM who first developed a working concept of the spoken web.

“People will talk to the web and the web will respond,” imagined Dr Manish Gupta of IBM in India, in 2009. His team created HSTP or Hyperspeech Transfer Protocol, similar to the HTTP of web pages but with the idealistic aim that ordinary Indians could use simple voice commands like “next” and “back” on their mobile phones to access trustworthy information and services.

Despite a noisy scene of satellite TV news, audio as a platform for information has a mixed history in India, with limits on radio news.

Potential of the spoken web

Back in 2009, I was a BBC radio journalist posted to India from the UK, based in the same bureau where my colleagues were, then as now, broadcasting audio over shortwave, which is popular with rural audiences.

Even back then, I remember seeing immense potential for audio journalism in the idea of a “spoken web”. In a country where the “next 500 million” internet users are speakers of vernacular languages, a spoken internet would have enormous power for the multilingual masses. Back then, I imagined the adoption of mobile phones, across India, would immediately bring with it a spoken internet in many tongues.

But it did not really happen.

Instead, mobile internet connectivity brought with it a range of other developments. The 2014 Lok Sabha elections were seen as the first “social media” poll. I covered that aspect too, looking at the positive aspects but also some of the darker edges, the trolling and cyber-bullying and manipulated hashtags that have become a feature of political life across the spectrum.

In the years since, messaging platforms like WhatsApp have taken off. Five years later, the Election Commission of India is so worried about inaccurate information spreading online that it is working directly with social networks; and sites such as Facebook are removing political news for fear that their algorithms are being manipulated.

But in the cacophony of the 2019 Lok Sabha poll all around us, the “spoken web” may yet have its moment.

The key will be to make sure it develops with values we can all stand by.

Now’s the time

Why now? The change, still embryonic but potentially a huge disruption to the way the internet works, has been the emergence of voice assistants, such as Siri, Alexa and Google Assistant, to access internet services. They are run by big tech firms and built into phones and smart speakers, with one in five UK homes, for example, already owning smart speakers.

Assistants are built on a unique technology known as Natural Language Processing, a capability based on machine learning (and therefore referred to by many as a form of Artificial Intelligence), which allows computers to understand not only what people say – in potentially any language – but also what they mean.

A new medium has been created, for two-way audio. And now, just like the early days of social media, online video, or many other changes, content makers have a new way to reach the public. For these Lok Sabha elections, you’ll hear briefings from various media companies being supplied to smart assistants.

(I’d be remiss if I didn’t mention that we at the BBC have one too. I’m proud to say the first interactive audio briefing we have done anywhere in the world is in Hindi and for this poll, available to users of the Google Assistant who simply say “Talk to BBC elections”.)

For us, this is just an early trial of what we think a “spoken web” service might look and feel like. The technology is still new and we hope to learn along with the audience.

But like all other parts of the internet, even at this early stage, there is much at stake. The original dreamers behind a “spoken web” saw it as a valued space which ordinary people could access and which improved their lives. Issues like trust, knowing who you as a user are talking to, and the provenance of information will be key. Given the transformative impact of the old text-based internet and of the social web on politics and society, we need to get this next chapter right.

Mukul Devichand is Executive Editor of BBC Voice + AI, a unit based in London that is developing a number of new services for smart assistants, including “BBC Kids” and more. Their latest offer is “Talk To BBC Elections,” aimed at Indian users of the Google Assistant in Hindi.

We welcome your comments at letters@scroll.in.
