Thread by @techpeace

Really enjoyed the @voicebotai office hours today with @bretkinsella, @einkoenig, and @chrismessina talking about voice on the web. Here are my thoughts on the discussion after six years of building for the #VoiceFirstWeb.

At Voxable, adding voice interaction to the Web is close to our hearts - our first project as an agency involved adding a voice interface to a web app. Some aspects of building voice apps on the Web have improved, but other things have gotten more difficult.

Chris mentioned the possibilities for incorporating voice into e-commerce experiences. In 2016, I gave this demo of the advantages of voice search in e-commerce admin interfaces, which demonstrates the value of voice for complex search:

That's just the Web Speech API I'm using in the demo. I'm demoing with Chrome, since, at the time, the Web Speech API's speech recognition capabilities were only available in Chrome. Half a decade later, and... it's still only Chrome: https://caniuse.com/speech-recognition
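
For readers who haven't used it, here is a minimal sketch of speech recognition with the Web Speech API, assuming a browser that exposes `SpeechRecognition` or the prefixed `webkitSpeechRecognition`. The helper names (`getRecognitionCtor`, `listenOnce`) and the small `RecognitionLike` interface are made up for illustration and are not taken from the demo.

```typescript
// Minimal shape of the parts of the Web Speech API used below.
interface RecognitionLike {
  lang: string;
  interimResults: boolean;
  onresult: ((event: any) => void) | null;
  onerror: ((event: any) => void) | null;
  start(): void;
}

type RecognitionCtor = new () => RecognitionLike;

// Hypothetical helper: returns the recognition constructor if the given
// global object exposes one (standard or webkit-prefixed), else null.
function getRecognitionCtor(scope: any): RecognitionCtor | null {
  return scope.SpeechRecognition ?? scope.webkitSpeechRecognition ?? null;
}

// Hypothetical helper: listens for one utterance and passes the top
// transcript to onTranscript. Returns false if recognition is unsupported.
function listenOnce(
  scope: any,
  onTranscript: (text: string) => void,
  onError?: (error: string) => void
): boolean {
  const Ctor = getRecognitionCtor(scope);
  if (Ctor === null) return false;

  const recognition = new Ctor();
  recognition.lang = "en-US"; // assumed language
  recognition.interimResults = false; // only report final results
  recognition.onresult = (event) => {
    // results[0] is the first result; [0] is its top alternative.
    onTranscript(event.results[0][0].transcript);
  };
  recognition.onerror = (event) => {
    if (onError) onError(event.error);
  };
  recognition.start();
  return true;
}
```

In a browser this would be called as `listenOnce(window, (text) => console.log(text))`. Checking for the constructor first matters because most browsers return `false` here rather than a working recognizer.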

At Project Voice 2020, Mozilla made some really exciting announcements regarding voice on Firefox. I wrote about it a year ago (this is the article Jan kindly mentioned towards the end of the call): https://www.voxable.io/blog/time-for-an-open-voice-web

It’s Time for an Open Voice Web

Our main takeaway from Project Voice: it's time to start building an Open Voice Web.

A year later, the team responsible for Firefox Voice, which would have brought speech recognition (and wakeword support - more on that later) into Firefox, was laid off. As expected, their projects have been discontinued: https://twitter.com/techpeace/status/1355671924286885893

Mozilla thankfully open-sourced all of their (awesome!) work, so hopefully, the community will be able to continue with these excellent projects. I built a demo with their in-browser wakeword library last year: https://twitter.com/techpeace/status/1298640734124478464

Bret published a great interview with @ianbicking, the head of the team at Mozilla that worked on Firefox Voice, about their work: https://voicebot.ai/2020/09/07/ian-bicking-talks-firefox-voice-and-observations-about-assistants-today-voicebot-podcast-ep-166/

Ian Bicking Talks Firefox Voice and Observations About Assistants Today – Voicebot Podcast Ep 166

Ian Bicking spent 10 years at Mozilla as a software developer, engineering manager, and research engineer. His last project there..

Also last year, @jovotech released Jovo for Web, with a series of excellent starter projects for building your own multimodal voice interactions on the Web: https://www.jovo.tech/news/2020-10-29-jovo-for-web-v3-2

Jovo for Web: Open Source, Customizable Voice & Chat for the Browser | Jovo

Jovo for Web allows you to build fully customizable voice and chat apps that work in the browser using the Jovo Framework and Vue.js

So, in 2021 - where are we? Speech recognition is still only available out of the box in the latest Chrome. It's powered under the hood by Google Cloud Speech. This gives Google outsized power over the future of the #VoiceFirstWeb.

It's possible to add in-browser wakeword support with the library Firefox was using to support "Hey Firefox" - this is an important aspect of user experience with voice, and we'll need something solid before the #VoiceFirstWeb takes off: https://github.com/castorini/honkling

castorini/honkling

Web app for keyword spotting using TensorflowJS. Contribute to castorini/honkling development by creating an account on GitHub.
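
Honkling spots keywords directly in audio with a TensorFlow.js model, and the sketch below does not reproduce its API. As a much cruder stand-in, it shows the gating idea behind a wakeword: ignore everything unless a wake phrase appears, then hand off what follows. It works on already-recognized text rather than audio, and `extractCommand` and the "hey firefox" default are illustrative assumptions.

```typescript
// Hypothetical helper: if the transcript contains the wake phrase as whole
// words, return the text that follows it; otherwise return null so the
// caller can ignore the utterance. English-only: non a-z/0-9 is stripped.
function extractCommand(
  transcript: string,
  wakePhrase: string = "hey firefox"
): string | null {
  // Lowercase, drop punctuation, and collapse runs of whitespace.
  const normalized = transcript
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();

  // Pad with spaces so "they firefox" does not match "hey firefox".
  const padded = ` ${normalized} `;
  const needle = ` ${wakePhrase} `;
  const index = padded.indexOf(needle);
  if (index === -1) return null;

  return padded.slice(index + needle.length).trim();
}
```

Matching on transcripts means the recognizer has to be listening the whole time, which is exactly what an in-browser audio keyword spotter avoids, so treat this only as an illustration of the control flow.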

Later in the discussion, @rogerkibbe had some great points about the opportunity for the #VoiceFirstWeb: we use a strange input modality (tapping, swiping, typing) with a great output modality (the Web). Adding a more natural input modality to the Web has huge potential value.

Alexa Champion @bondad had a question for Jan about what we can expect in the next version of Jovo - but rather than spoil the surprise, I'll leave you to follow @jovotech for those announcements. Rest assured that it all sounds awesome!

(Btw, if you haven't yet, you should check out @bondad and @katybow's amazing Art Museum Alexa skill, winner of the Grand Prize in the Alexa Conversations Skills Challenge.) https://www.notion.so/Alexa-open-Art-Museum-f841e66775644a99a240df9cbf1cb7e3

🗣 Alexa, open Art Museum

Art Museum lets you find art with your voice. It's powered by a public API from the Art Institute of Chicago, and Alexa Conversations, Amazon's next generation AI-driven dialog manager. It was...

We're still confident that the richest future for #VoiceFirst lies in open standards and platforms. Permission-less innovation offers the most room for growth. Building experiences without regard for whether or not the features you need are "1P or 3P" is a freeing experience.

If you agree, or if you're into making computers talk in general, consider following me or @voxable - and if you're a conversational designer, sign up for our beta! https://voxable.io

Voxable

The conversation design platform for teams that want to build better voice and chat apps.
