Have you ever wondered how a box of ballpoint pens can cost you less now than when you were in grade school? It’s the cutthroat competition, stupid. And that’s also the reason why office supplies market leader Office Depot Inc. is leveraging every available technology to squeeze out pennies and speed growth.
With Staples Inc. snapping at its heels, the $10 billion retail titan has quickly embraced voice-enabled enterprise (VEE) applications as a way to reach more customers, make life a little easier, and cut costs. The service integrates telephony and speech technologies into a hosting platform that promises to reduce the use of expensive, staffed, customer-service call centers. VEE applications let customers interact directly with e-commerce databases like that of Office Depot, reducing transcription errors introduced by telephone sales clerks and virtually eliminating time spent on hold.
“We’ve tried to put together a model of letting customers shop us any way they want,” says Ken Jackowitz, vice president of business systems at Office Depot in Delray Beach, Fla. In addition to its concrete-block-and-mortar stores, Office Depot takes orders on its Web site and by fax, mail-order catalog, and telephone.
In November 1999, speech application service provider (ASP) NetByTel.com Inc. of Boca Raton, Fla., approached Office Depot with a proposal. NetByTel wanted to extend and streamline the company’s existing customer services offered via telephone by enabling callers to locate their nearest store or order a catalog by answering the prompts of an automated speech recognition (ASR) system.
Most corporations today are deploying VEE by contracting with speech ASPs. VEE applications typically require real-time telephony and speech recognition and synthesis services. These technologies carry steep learning curves; most companies can’t project a reasonable ROI if they take on the high up-front costs of deploying, operating, and frequently updating speech and telephony servers and software. At the same time, speech access to a company’s products and enterprise data promises to extend its markets and streamline operations.
At a Glance: Office Depot Inc.
The company: Office Depot Inc. of Delray Beach, Fla., is the leading chain of office supplies superstores, selling more than $10 billion of paperclips, report covers, and white boards every year.
The problem: Customers want to be able to place orders by telephone, but staffed call centers are expensive to run, especially for a retailer that sells many small, low-priced items.
The solution: A voice-enabled ordering application gives customers the convenience of buying by telephone, while enabling Office Depot to cut per-call costs by up to 88% compared to a labor-intensive call center.
The IT infrastructure: Office Depot chose speech ASP NetByTel Inc.’s Catalog Ordering Module and SpeechWorks Inc.’s SpeechWorks 6.0 speech-recognition software to connect calls directly to the company’s order-fulfillment database.
We’ve Heard Talk
For years there has been talk of accelerating improvements in the accuracy and ease of use of desktop speech-recognition products from companies like Lernout & Hauspie Speech Products N.V., Nuance Communications Inc., and SpeechWorks International Inc. to boost productivity and reduce repetitive stress injuries in tasks such as word processing. Now, those technologies, which enable PC end users to input and output text as spoken language, are trickling into the enterprise.
“We’re seeing a shift from desktop-based dictation systems to network systems that allow true mobility and device independence, because the recognition is in the network,” says Mark Plakias, vice president for voice and wireless commerce at analyst firm The Kelsey Group, headquartered in Princeton, N.J.
“VEE applications are well suited to specialized environments like warehousing, where you need a hands-free solution,” says Plakias, in New York. He also sees a market for horizontal road warrior apps, which can give mobile workers easy access to productivity tools and corporate data via cellular telephones and landlines. But for the foreseeable future, call centers are where the cash is in voice-enabled applications, according to John Dalton, an analyst with Forrester Research Inc. in Cambridge, Mass.
A handful of narrow vertical markets have already embraced VEE applications, according to Walter Tetschner, president of Tern Systems Inc., a Concord, Mass., research firm. Major airlines now let their customers retrieve flight-status information via voice-enabled systems, and online brokers are using speech as an interface for stock-quote applications, he says. Tetschner sees the overall market for speech-recognition technology growing to $1 billion by 2002.
Players in the enterprise speech services market include BeVocal Inc. of Santa Clara, Calif.; General Magic of Sunnyvale, Calif.; JustTalk Inc. of Ann Arbor, Mich.; IBM Corp. of Armonk, N.Y.; Interactive Telesis Inc. of San Diego, Calif.; NetByTel; PriceInteractive of Reston, Va.; Telera Inc. of Campbell, Calif.; Tellme Networks Inc. of Mountain View, Calif.; UCallNet Inc. of Santa Clara, Calif.; ViaFone.com Inc. of Redwood City, Calif.; and Webversa Inc. of Fairfax, Va.
In coming years, industry observers say voice-enabled applications will branch deeper into the enterprise. For example, salespeople will be able to leave their laptops at the home office and check inventory levels and place orders from their clients’ offices with a cell phone or voice-enabled handheld computer; purchasing managers will be able to participate in live B2B auctions while driving home from work.
Office Depot Takes Stock in Speech
All the potential payoffs of VEE applications notwithstanding, Jackowitz of Office Depot wasn’t an easy mark. “My immediate response was, ‘No thanks,’” recalls Jackowitz. He was concerned about diverting resources from the company’s intense focus on the OfficeDepot.com Web site. But his competitive instincts prevailed. “I saw the opportunity that we could be the first to offer speech-activated ordering,” he says, describing a more complex project that went live in September 2000.
What’s so great about giving customers the option to interact with an automated voice rather than a live customer rep? Customers like the fact that, at peak times, they aren’t put on hold to wait for a representative to become available. And Office Depot likes the cost, which Jackowitz says is up to 88% less than what the company pays for a staffed call center to handle a complex customer interaction. Office Depot pays NetByTel an undisclosed flat fee per call.
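To put the 88% figure in concrete terms, here is a small worked example. The $5.00 staffed-call cost is purely hypothetical, chosen for round numbers; the article gives neither Office Depot’s actual per-call cost nor NetByTel’s fee.

```python
# Hypothetical illustration of an "up to 88% less" per-call saving.
# The staffed cost below is an assumed figure, not one from Office Depot.

def automated_cost_cents(staffed_cost_cents: int, savings_percent: int) -> int:
    """Return the per-call cost after a percentage saving, in whole cents."""
    return staffed_cost_cents * (100 - savings_percent) // 100

STAFFED_COST_CENTS = 500   # assumed: $5.00 per staffed call
SAVINGS_PERCENT = 88       # upper bound quoted by Jackowitz

automated = automated_cost_cents(STAFFED_COST_CENTS, SAVINGS_PERCENT)
print(f"Staffed: ${STAFFED_COST_CENTS / 100:.2f}, automated: ${automated / 100:.2f}")
```

Under that assumption, a call that costs $5.00 with a live representative would cost 60 cents through the speech system. Because 88% is described as an upper bound, real savings on any given call could be smaller.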
Here’s how the system works. When a customer calls Office Depot’s toll-free number, he’s given a choice of placing an order with a live person or through the speech system. If the customer chooses the speech system, his call is routed to NetByTel where it is handled within seconds by the firm’s ordering module; even at busy times, customers aren’t put on hold.
Entering a call flow jointly designed by Office Depot, NetByTel, and speech software maker SpeechWorks of Boston, the caller is prompted for key information such as a customer number, item numbers, and quantities. The customer speaks his responses, and an automated voice, either recorded or synthesized, confirms the customer’s instructions by reciting the shipping address on file, the names of the ordered items, and so on, until the order is complete.
“The NetByTel system talks to our back end, including our warehouses and supply chain,” Jackowitz says. “We had designed a fairly open API for other applications,” he adds. This made implementation of the speech system relatively simple and quick, requiring two weeks of scripting work from an in-house programmer. The company did not have to purchase any additional hardware for the system.
The speech-recognition component, running SpeechWorks 6.0 on NetByTel’s servers, does occasionally commit speech-recognition errors, some of which are discovered when the system validates users’ responses. If the speech-recognition system doesn’t recognize its own error, the user must either repeat his responses or back up through the hierarchy of prompts and correct the order or inquiry. However, according to Jackowitz, no one has complained about the system.
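One way validation can catch recognition errors is to check each recognized value against known data and re-prompt on a miss. The sketch below assumes a hypothetical set of valid item numbers, a retry limit of three, and a fallback to a live representative; none of these details come from the article.

```python
# Hypothetical valid item numbers; a real system would query the catalog.
VALID_ITEMS = {"4455", "7788"}
MAX_ATTEMPTS = 3  # assumed retry limit before handing off to a live representative

def collect_item_number(recognized_attempts):
    """Return the first recognized item number that passes validation.

    `recognized_attempts` simulates what the recognizer heard on each try.
    Returns None if every allowed attempt fails, signalling a handoff.
    """
    for attempt, heard in enumerate(recognized_attempts, start=1):
        if attempt > MAX_ATTEMPTS:
            break
        if heard in VALID_ITEMS:
            return heard
        print(f"Sorry, I didn't find item {heard}. Please repeat the item number.")
    return None

print(collect_item_number(["4456", "4455"]))                  # caught, then corrected
print(collect_item_number(["0000", "1111", "2222", "4455"]))  # gives up after three tries
```

A check like this only catches errors that produce an invalid value. If the recognizer mishears one valid item number as another, validation passes, which is why the spoken confirmation and the ability to back up through the prompts still matter.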
Jackowitz hopes to refine and expand the company’s suite of NetByTel-based VEE applications, which now includes a system that allows Office Depot’s truck drivers to communicate with warehouses by enabling their existing cell phones with ASR. “NetByTel has stepped up to the scaling challenge,” he says. Jackowitz would also like to see NetByTel offer additional reporting functionality, so he can better understand how customers are using the system.
Speech Counts at MyBeanCounter
Unlike Jackowitz, when Steve Kursh needed to choose a speech ASP for his new venture, he was placing a mission-critical bet. Kursh and his partner James Donovan founded MyBeanCounter.com Inc. to help mobile employees who need to track their time and expenses more efficiently and with less pain. The two felt people are happier interacting with a human voice–even if it’s recorded or synthesized–than they are grappling with a complex spreadsheet or straining their eyes with a handheld PC. And professionals are always looking for ways to crunch more tasks into their day without expanding their working hours.
Kursh searched far and wide for an ASP to host the MyBeanCounter voice system. He spoke with a number of vendors, including Exodus Communications Inc. of Santa Clara, Calif., Telera, and Syntellect Inc. of Phoenix. MyBeanCounter finally settled on PriceInteractive in the spring of 2000 to help build the application and to host it. “PriceInteractive takes care of the back end from soup to nuts, and we do our stuff with the business,” says Kursh, CEO of the Wellesley Hills, Mass., firm. “In theory, this kind of functionality and performance is wonderful, but PriceInteractive was the only one that ‘got it,’ that understood how speech software should interact with people.”
When a company signs up for MyBeanCounter’s services, its employees gain access to a voice-enabled expense-tracking application via a toll-free telephone number. “Our system takes advantage of dead time, like when you’re driving back from seeing a customer,” says Kursh. In this scenario, an end user might use his cell phone to call into MyBeanCounter’s voice system and record an expensed lunch, billable hours, and the like, all by answering a series of voice prompts. Kursh’s system also has a useful customization feature: End users can personalize voice prompts to present lists of their own clients and projects, for example.
The voice-enabled application incorporates off-the-shelf speech-recognition and telephony software, plus custom-designed call flows. When end users call into MyBeanCounter’s toll-free number to track their time, their voice input is routed to an interactive voice response (IVR) server running on UNIX or Windows NT at one of PriceInteractive’s two operations centers. There, SpeechWorks 6.0 recognition software translates voice to data that is then stored in the user’s account in a SQL database hosted by PriceInteractive.
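The last step of that pipeline, writing recognized values into a user’s account, can be sketched as below. Python’s built-in `sqlite3` module is a stand-in here, since the article does not name the SQL product; the `entries` table, its columns, and the `store_entry` function are hypothetical.

```python
import sqlite3

# sqlite3 is a stand-in; the article does not name the SQL product or schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE entries (
           user_id TEXT NOT NULL,
           kind    TEXT NOT NULL,   -- 'hours' or 'expense'
           client  TEXT NOT NULL,
           amount  REAL NOT NULL
       )"""
)

def store_entry(conn, user_id, recognized):
    """Insert one recognized voice entry (a dict of slot values) for a user."""
    conn.execute(
        "INSERT INTO entries (user_id, kind, client, amount) VALUES (?, ?, ?, ?)",
        (user_id, recognized["kind"], recognized["client"], float(recognized["amount"])),
    )
    conn.commit()

# Simulated recognizer output from two calls by the same user.
store_entry(conn, "u42", {"kind": "hours", "client": "Acme", "amount": "2.5"})
store_entry(conn, "u42", {"kind": "expense", "client": "Acme", "amount": "18.40"})

total_hours = conn.execute(
    "SELECT SUM(amount) FROM entries WHERE user_id = ? AND kind = 'hours'", ("u42",)
).fetchone()[0]
print(total_hours)
```

Passing values as query parameters, rather than building the SQL string from recognized text, keeps a misrecognized or unexpected phrase from altering the statement itself.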
Kursh describes the development of the telephony and speech-based application as a three-way partnership. “We designed the call flows and business requirements, and set up the speech application using SpeechWorks’ development kit,” he says. “We worked with SpeechWorks to set up the initial systems, and now PriceInteractive is doing more on the development end, to refine and expand the system.”
MyBeanCounter’s business relationship with PriceInteractive is short on up-front fees. “We pay them a monthly fee to host, plus phone minutes,” says Kursh. Those rates vary according to volume. MyBeanCounter and PriceInteractive decline to give specific pricing or comment on the number of system users.
Keeping Tabs on Student Workers
Time tracking was also on the mind of Lorraine Capobianco when she decided in 1999 to implement an enterprise speech system at Western Connecticut State University. For years, the university has used timesheets to track the hours of its hundreds of student workers. “The timesheets are located in the main offices,” says Capobianco, CIO at WCSU, in Danbury, Conn. But the jobs are scattered around campus, so many students make the trip to the main office just once at the end of each two-week pay period to record their actual work hours on timesheets. Because the timesheets are on paper and student access to them is limited, accuracy is a problem, Capobianco says.
Seeking a way to upgrade accuracy, Capobianco considered a 1999 proposal from IBM, with whom the university has a long-term relationship, to implement a self-service system that would enable student workers to enter their hours accurately, using an enterprise voice system and the ubiquitous telephone. Capobianco chose IBM’s WebSphere Voice Server software. “Some other voice vendors’ recognition accuracy might be one percentage point better, but WebSphere’s stability makes it the top choice overall,” says John Kulhawik, director of Information Systems at WCSU.
The university’s voice-enabled payroll application, which went into pilot in August 2000, links student users to an Oracle time-keeping database by way of WebSphere Application Server, with IBM ViaVoice Pro for speech recognition and ViaVoice Text-to-Speech Runtime version 5 for synthesis of speech prompts. These apps are running on Windows NT and IBM Netfinity servers at the Danbury campus.
Because the payroll application is telephone-based, student workers can reach it whether their jobs are at the gym, in the cafeteria, or in the library stacks. At the beginning and end of each shift, the student punches in or out by calling into the voice application and following the prompts. Automatic time stamps help to keep workers honest about their hours.
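The punch-in, punch-out idea can be sketched in a few lines. This is an illustration under assumptions, not the university’s implementation: the `TimeClock` class is hypothetical, and it treats each call as a toggle, whereas the real application walks the caller through prompts.

```python
from datetime import datetime, timedelta

class TimeClock:
    """Records punch-in and punch-out calls using server-side time stamps."""

    def __init__(self):
        self.open_shifts = {}   # student_id -> punch-in time
        self.completed = []     # (student_id, start, end) tuples

    def punch(self, student_id, now=None):
        """Toggle a student's shift; `now` is injectable so the example is repeatable."""
        now = now or datetime.now()
        if student_id in self.open_shifts:
            start = self.open_shifts.pop(student_id)
            self.completed.append((student_id, start, now))
            return "out"
        self.open_shifts[student_id] = now
        return "in"

    def hours_worked(self, student_id):
        """Sum the length of all completed shifts for one student, in hours."""
        total = sum(
            (end - start for sid, start, end in self.completed if sid == student_id),
            timedelta(),
        )
        return total.total_seconds() / 3600

clock = TimeClock()
clock.punch("s123", datetime(2000, 9, 5, 9, 0))    # start of shift
clock.punch("s123", datetime(2000, 9, 5, 12, 30))  # end of shift
print(clock.hours_worked("s123"))
```

Because the time stamp comes from the system at the moment of the call, not from what the student reports later, the recorded hours do not depend on memory or on a trip to the main office at the end of the pay period.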
Capobianco and Kulhawik are pleased with IBM’s WebSphere Studio version 3 tools. “VoiceXML really does ease the development process,” says Kulhawik. In fact, the university’s information technology department has used students to do some of the development work. “We’ll be able to use a lot of the same logic to port the voice [student] payroll application to the Web,” Kulhawik adds. This will enable supervisors to review student workers’ time records online, for example.
About 40 student workers are now using the pilot voice application; the system is scheduled to roll out to 300 students when it goes live in spring 2001. That sounds like a timely graduation gift for the IT folks and the payroll department.
John Rossheim writes about speech technologies, travel, and free-agent careers. He can be reached at email@example.com.