Suddenly, everybody seems to be talking to their computers, TVs and mobile gadgets.
Voice command software has been around for decades. The software currently sold by Nuance as Dragon NaturallySpeaking traces its origins to a 1982 prototype, more than three decades ago, and first shipped as a consumer product in 1997.
Since then, voice command features have emerged here and there, often ignored. For example, Apple’s OS X has built-in voice command and dictation capabilities, but they’re rarely used.
Voice command has been popular with a minority of technical and power users, as well as with some professionals (such as writers who dictate). I’ve used it myself on occasion. But for average users, voice command didn’t really register until Apple and Google built it into their mobile operating systems: first Siri in iOS in late 2011, then Google Now and Voice Search in Android in the summer of 2012. (Siri, Google Now and Google Voice Search all existed in earlier products, but relatively few people used them.)
Both products failed to live up to their initial promise. Siri, for example, started out with a lot of hype and attention, but suffered from server delays, outages and general unreliability. Many users who started using Siri in the early days after iOS integration gave up on it.
Likewise, Google Now and Voice Search are very good, but Google hadn’t done a good job of making them obviously available.
In the cases of both Siri and Google Now/Google Voice Search, using these powerful features wasn’t so convenient, obvious or necessary that a real majority of users would take advantage of them.
Suddenly, seemingly out of nowhere, voice command is getting true mainstream acceptance.
Here’s what’s new.
Microsoft
Microsoft released its long-awaited Xbox One on Nov. 22, the first major new Xbox since the Xbox 360 shipped exactly eight years earlier.
The new Xbox One has far better hardware performance and software and service options than the old Xbox. But one of the biggest changes is truly useful voice command. Now you can say “Xbox: on” to turn it on, or “Xbox: Skype Steve” to make a video call to Steve through your TV.
Importantly, the Xbox One is “always listening” for its voice prompt, which drives usage and discovery. For example, I’ve triggered it several times just by having conversations in the room about the Xbox One. Simply saying the word “Xbox” changes the screen to present voice command options.
One comical problem with the Xbox One’s voice command feature is that TV commercials advertising it trigger events on the Xboxes of people who already own one.
(Note that Xbox One commands are “locked” to the region, so they only work in some countries and command features vary from one language to the next.)
Still, the most interesting thing about Xbox One voice command is that it’s genuinely the fastest and easiest way to do a long list of actions and its use is encouraged by the console’s design. Also: A living room is probably the most comfortable place to use voice commands (which can be socially awkward in public places with a phone, or on a PC in the office).
Millions of people who have never used voice commands will start using them thanks to the Xbox One.
Microsoft’s console-gaming rival, Sony, shipped its PlayStation 4 earlier this month, and the console also supports voice commands. However, the range of things you can do with voice on the PS4 is very limited compared with Xbox One. You can “take screenshot,” “log in” or say “power” to shut the machine off. You can also launch a game by saying its title.
Sony promises more voice command support in the future.
Google
Google this week announced a free extension for its Chrome browser that brings an Xbox One-like “always listening” search command to Google Search for Chrome users on Windows, OS X and Chromebooks.
It doesn’t record or capture anything you say except the magic words “Ok, Google,” which tell the extension to listen for your search command.
After installing the extension (which I predict will ship natively in future versions of Chrome), users see a gray microphone inside the search box on the main Google Search page, with the words “Say ‘Ok Google’” next to it.
Chrome users will probably use this because Google Search will give them a conspicuous reminder every time they visit the site.
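For readers curious about what that kind of hot-word gating looks like, here is a minimal Python sketch. It is not Google’s implementation: the helper name `extract_query`, the normalization rules and the choice to work on already-transcribed text rather than audio are all assumptions made for illustration. It ignores any utterance that doesn’t begin with the hot phrase and hands back only the words that follow it.

```python
import re

# Assumed hot phrase, taken from the article's "Ok, Google" example.
HOT_PHRASE = ("ok", "google")


def extract_query(transcript):
    """Return the search query that follows the hot phrase, or None.

    Hypothetical helper: operates on transcribed text, not raw audio.
    """
    # Lowercase and keep only word-like tokens, so "Ok, Google" matches "ok google".
    words = re.findall(r"[a-z0-9']+", transcript.lower())
    if tuple(words[:len(HOT_PHRASE)]) != HOT_PHRASE:
        return None  # hot phrase absent: ignore the utterance entirely
    query = " ".join(words[len(HOT_PHRASE):])
    return query or None  # the hot phrase alone carries no query
```

Calling `extract_query("Ok, Google where is Big Ben")` yields the query text, while ordinary conversation that never mentions the hot phrase yields `None` and is dropped.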
Google also made Voice Search and Google Now more visible and inviting in the new version of Android, called KitKat.
Now, the main default screen has an “always listening” feature with a search box at the top. Saying “Ok, Google” launches the search.
Unlike the Moto X, which is truly always listening, even when the phone is in sleep mode, KitKat is in “always listening” mode only when the home screen or Google Now is on the screen.
Swiping from the home screen to the left in KitKat brings you to Google Now, which preemptively shows “cards” based in part on past searches.
Google Voice Search has also been improved with better contextual understanding. For example, you can say “Where is Big Ben,” and after it tells you, follow-up questions such as “Where is it” and “How old is it” are properly answered because Voice Search remembers the subject you’re asking about.
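The idea of remembering the subject can be illustrated with a toy Python sketch. This is only a guess at the general technique, not how Voice Search actually works: the class name `VoiceSession`, the short list of question openers and the pronoun set are all hypothetical. The session stores the subject of the last full question and substitutes it when a follow-up uses a pronoun.

```python
import re

# Hypothetical, deliberately tiny grammar: a question opener plus a subject.
QUESTION = re.compile(r"^(where is|how old is|how tall is)\s+(.+?)\??$", re.IGNORECASE)
PRONOUNS = {"it", "he", "she", "that"}  # assumed set of follow-up pronouns


class VoiceSession:
    """Remembers the last named subject so follow-up questions can refer to it."""

    def __init__(self):
        self.subject = None

    def resolve(self, question):
        """Return the question with any pronoun replaced by the remembered subject."""
        match = QUESTION.match(question.strip())
        if match is None:
            return question  # outside the toy grammar: leave untouched
        opener, rest = match.groups()
        if rest.lower() in PRONOUNS:
            if self.subject is None:
                return question  # nothing remembered yet
            return f"{opener} {self.subject}"
        self.subject = rest  # a new subject replaces the old one
        return f"{opener} {rest}"
```

After a session has seen “Where is Big Ben,” resolving “How old is it” produces “How old is Big Ben,” which can then be answered as an ordinary search.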
Comically, if you have a KitKat tablet, a KitKat phone and a laptop running the Chrome extension on Google Search, and you say the magic words, all three will execute the search at the same time from the same voice command.
The new Voice Search capabilities are rapidly spreading to users via gradual updates for specific phone models, and are also starting to ship in new Android phones. A new iOS app for it was also unveiled this week.
Android is the world’s most heavily used operating system. The increasing usage of voice commands via Android could add hundreds of millions of people to the ranks of those who control their phones and tablets with voice.
Google announced this week that its social mapping product, called Waze, has been upgraded with more voice commands.
The iOS and Android app already talks to you with turn-by-turn directions, and warns you of police, roadside emergencies and traffic slowdowns. The update lets you talk back.
For example, you can now use voice commands to search for addresses and locations and to report “events” (traffic, hazards and so on).
Waze also rolled out a feature that lets celebrity voices give you information. The first is comedian Kevin Hart.
Waze is increasingly integrated into Google Now and Google Maps, so we can expect its voice command features to follow it into those products.
Maps is a compelling place for voice command and interaction because it can be inconvenient, illegal and dangerous to type in or use touch input while driving.
I think Waze users are going to take advantage of these features, and their existence will get a lot more drivers talking to their phones.
Another Google project is driving voice commands: Google Glass. In fact, there’s no way to use it without seeing the prompt “OK, Glass” on the tiny screen. Once you say those words, you get a menu of the rest of the commands (such as “record a video” or “get directions”).
Google Glass is one of the most voice command-centric products ever made. Nearly all its users talk as a normal part of usage.
Glass is not a publicly available product yet, but an invitation-only “beta.” However, Google has recently invited any and all developers to buy the headset. That announcement came after Google let each current user invite three additional users.
The company is also expected to offer support for prescription glasses starting early next year.
Usage is growing. And as I said in this space recently, I believe Glass will become a mainstream product.
What’s interesting is that these are mostly mainstream products, and together they have you covered with voice command capability at home, at work, in the car and while walking around.
Voice command technology has been pretty good for a long time. But its existence barely registered for the great masses of ordinary users.
Now, however, talking to your computer, TV, phone and tablet is finally becoming a normal, expected part of everyday life. And it’s being driven mostly by these new products from Microsoft and Google.