Really enjoyed the @voicebotai office hours today with @bretkinsella, @einkoenig, and @chrismessina talking about voice on the web. Here are my thoughts on the discussion after six years of building for the #VoiceFirstWeb 👇
At Voxable, adding voice interaction to the Web is close to our ❤️: our first project as an agency involved adding a voice interface to a web app. Some aspects of building voice apps on the Web have improved, but other things have gotten more difficult.
Chris mentioned the possibilities for incorporating voice into e-commerce experiences. In 2016, I gave this demo of the advantages of voice search in e-commerce admin interfaces, which demonstrates the value of voice for complex search:
That's just the Web Speech API I'm using in the demo. I'm demoing with Chrome, since, at the time, the Web Speech API's speech recognition capabilities were only available in Chrome. Half a decade later, and... it's still only Chrome: https://caniuse.com/speech-recognition
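For anyone curious what "just the Web Speech API" looks like, here's a rough sketch, not the code from the demo. It checks for the recognition constructor (Chrome exposes it under the prefixed name `webkitSpeechRecognition`) and hands the first transcript to a callback. The helper names `findSpeechRecognition` and `listenOnce`, the `RecognizerLike` type, and the `en-US` language setting are my own assumptions for illustration.

```typescript
// Minimal structural type covering only the recognizer members this sketch uses.
interface RecognizerLike {
  lang: string;
  interimResults: boolean;
  onresult: ((event: any) => void) | null;
  start(): void;
}

type RecognizerCtor = new () => RecognizerLike;

// Hypothetical helper: look up the constructor on a window-like object.
// Chrome exposes the prefixed name; the unprefixed one is checked first.
function findSpeechRecognition(scope: Record<string, unknown>): RecognizerCtor | null {
  const ctor = scope["SpeechRecognition"] ?? scope["webkitSpeechRecognition"];
  return typeof ctor === "function" ? (ctor as unknown as RecognizerCtor) : null;
}

// Hypothetical helper: start listening and pass the first transcript along.
// Returns false when the browser has no speech recognition support.
function listenOnce(
  scope: Record<string, unknown>,
  onTranscript: (text: string) => void
): boolean {
  const Ctor = findSpeechRecognition(scope);
  if (Ctor === null) return false;

  const recognizer = new Ctor();
  recognizer.lang = "en-US"; // assumed language for this sketch
  recognizer.interimResults = false; // only report final results
  recognizer.onresult = (event) => {
    // First alternative of the first result holds the recognized text.
    onTranscript(event.results[0][0].transcript);
  };
  recognizer.start();
  return true;
}
```

In a page, you'd pass `window` as the scope and fall back to a regular text input whenever `listenOnce` returns `false`, which today means any browser that doesn't ship speech recognition.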
At Project Voice 2020, Mozilla made some really exciting announcements regarding voice on Firefox. I wrote about it a year ago (this is the article Jan kindly mentioned towards the end of the call): https://www.voxable.io/blog/time-for-an-open-voice-web
A year later, the team responsible for Firefox Voice, which would have brought speech recognition (and wakeword support - more on that later) into Firefox, was laid off. As expected, their projects have been discontinued: https://twitter.com/techpeace/status/1355671924286885893
Mozilla thankfully open-sourced all of their (awesome!) work, so hopefully, the community will be able to continue with these excellent projects. I built a demo with their in-browser wakeword library last year: https://twitter.com/techpeace/status/1298640734124478464
So, in 2021 - where are we? Speech recognition is still only available out of the box in the latest Chrome. It's powered under the hood by Google Cloud Speech. This gives Google outsized power over the future of the #VoiceFirstWeb.
It's possible to add in-browser wakeword support with the library Firefox was using to support "Hey Firefox" - this is an important aspect of user experience with voice, and we'll need something solid before the #VoiceFirstWeb takes off: https://github.com/castorini/honkling
Later in the discussion, @rogerkibbe had some great points about the opportunity for the #VoiceFirstWeb: we use a strange input modality (tapping, swiping, typing) with a great output modality (the Web). Adding a more natural input modality to the Web has huge potential value.
Alexa Champion @bondad had a question for Jan about what we can expect in the next version of Jovo - but rather than spoil the surprise, I'll leave you to follow @jovotech for those announcements. Rest assured that it all sounds awesome!
We're still confident that the richest future for #VoiceFirst lies in open standards and platforms. Permissionless innovation offers the most room for growth. Building experiences without regard for whether the features you need are "1P or 3P" is a freeing experience.
If you agree, or if you're into making computers talk in general, consider following me or @voxable - and if you're a conversational designer, sign up for our beta! https://voxable.io 
You can follow @techpeace.