Speech to text has become one of the most robust technological advancements in AI. The field has moved from inaccurate transcriptions in a limited set of languages to faster, more precise speech-to-text in many, and tech giants across the globe are racing to build the next-generation engine. One notable contender is IBM Watson Speech to Text. Through an intricate pipeline of encoders, neural networks, and decoders, IBM Watson Speech to Text positions itself as a next-generation leader in language AI.

In this article, we discuss the model behind Watson Speech to Text, along with its user benefits, use cases, and language availability.

User Benefits of Next-Generation Speech to Text:

  • Higher accuracy out of the box. The next-generation engine is 19% more accurate for US English telephony data and as high as 57% more accurate for other languages.
  • Spend less time customizing. Not only is the next-generation engine more accurate, but it can transcribe words never seen in training. That means the model is more flexible, so you can spend less time customizing.
  • Get results faster. The next-generation engine analyzes audio with higher throughput, so you'll receive your transcriptions more quickly.
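As a sketch of how a next-generation model might be selected and used with the IBM Watson Python SDK (`pip install ibm-watson`): the model IDs below follow IBM's next-generation naming pattern (e.g. `en-US_Telephony`), but treat the exact names, credentials, and URL as illustrative assumptions rather than a definitive reference.

```python
def next_gen_model(language: str, telephony: bool = True) -> str:
    """Build an assumed next-generation model ID such as 'en-US_Telephony'.

    This helper only constructs the string; it does not validate the model
    against the live service catalog.
    """
    return f"{language}_{'Telephony' if telephony else 'Multimedia'}"


def transcribe(path: str, api_key: str, service_url: str,
               model: str = "en-US_Telephony") -> str:
    """Send a WAV file to Watson Speech to Text and return the transcript.

    Requires valid IBM Cloud credentials; the imports are deferred so the
    pure helper above can be used without the SDK installed.
    """
    from ibm_watson import SpeechToTextV1
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

    service = SpeechToTextV1(authenticator=IAMAuthenticator(api_key))
    service.set_service_url(service_url)
    with open(path, "rb") as audio:
        result = service.recognize(
            audio=audio,
            content_type="audio/wav",
            model=model,
        ).get_result()
    # Concatenate the best alternative from each result segment.
    return " ".join(
        seg["alternatives"][0]["transcript"] for seg in result["results"]
    )
```

For example, `next_gen_model("fr-CA", telephony=False)` yields `"fr-CA_Multimedia"`, which you would pass as the `model` argument when transcribing broadcast-style audio.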

Use Cases:

You can use the next-generation engine for any use case you currently tackle with Watson Speech to Text. Here are some examples.

Virtual Assistant on the Phone

Imagine eliminating hold times and improving customer satisfaction at the same time. The Watson Assistant phone integration enables you to do just that. You can provide live support to your customers with the pre-built Speech to Text integration within Watson Assistant, and hand off to agents as needed.

Analyze Customer Calls

With Watson Speech to Text, you can transcribe customer phone calls to uncover patterns and conduct root cause analysis. After you transcribe your audio, you can use Watson Natural Language Understanding or Watson Discovery to analyze those transcriptions.
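A hedged sketch of that second step: feeding transcripts into Watson Natural Language Understanding to pull out keywords, then tallying them across calls to spot patterns. The `analyze_transcript` function uses classes from the `ibm-watson` SDK and needs real credentials; `top_keywords` is a hypothetical helper of our own for aggregating the results.

```python
from collections import Counter


def analyze_transcript(transcript: str, api_key: str, service_url: str) -> dict:
    """Run a transcript through Watson NLU for keywords and sentiment."""
    from ibm_watson import NaturalLanguageUnderstandingV1
    from ibm_watson.natural_language_understanding_v1 import (
        Features, KeywordsOptions, SentimentOptions,
    )
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

    nlu = NaturalLanguageUnderstandingV1(
        version="2021-08-01",  # API version date; pick a current one
        authenticator=IAMAuthenticator(api_key),
    )
    nlu.set_service_url(service_url)
    return nlu.analyze(
        text=transcript,
        features=Features(
            keywords=KeywordsOptions(limit=5),
            sentiment=SentimentOptions(),
        ),
    ).get_result()


def top_keywords(analyses: list, limit: int = 3) -> list:
    """Tally keyword text across many NLU results for root cause analysis."""
    counts = Counter()
    for analysis in analyses:
        for kw in analysis.get("keywords", []):
            counts[kw["text"].lower()] += 1
    return [text for text, _ in counts.most_common(limit)]
```

Running `top_keywords` over a batch of analyzed calls surfaces the most frequent topics (say, "refund" or "billing"), which is the kind of pattern this use case is after.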

Support Agents

You can provide real-time information to improve agent efficiency and focus. Use Watson Speech to Text to transcribe live calls, and then use Watson Discovery to automatically surface relevant information so your agent can focus on the customer rather than on the search.

Language Availability and Accuracy Improvements

The following languages are supported:

U.S. English,* British English,* Australian English,* French,* Canadian French, German,* Italian, and Spanish

The asterisk (*) indicates that low latency mode is supported for the language. You should use low latency mode (if available) when you need the shortest possible interval between submitting your audio and receiving your transcription. Low latency mode might be the best option for Watson Assistant phone integration use cases. Test both options to determine what works best for your solution.
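In the Watson Python SDK, low latency mode is exposed as a `low_latency` flag on `recognize` (available for next-generation models, to the best of our knowledge). The sketch below gates it on the asterisked languages above; the locale codes and the `service` object are assumptions for illustration.

```python
# Locales the article marks with * as supporting low latency mode.
# The BCP-47 codes are our assumed mapping of those language names.
LOW_LATENCY_LOCALES = {"en-US", "en-GB", "en-AU", "fr-FR", "de-DE"}


def supports_low_latency(locale: str) -> bool:
    """Return True if the article lists low latency support for this locale."""
    return locale in LOW_LATENCY_LOCALES


def recognize_fast(service, audio_file, locale: str = "en-US") -> dict:
    """Transcribe with low latency when the locale supports it.

    `service` is assumed to be an authenticated SpeechToTextV1 instance.
    """
    return service.recognize(
        audio=audio_file,
        content_type="audio/wav",
        model=f"{locale}_Telephony",
        low_latency=supports_low_latency(locale),
    ).get_result()
```

For a Watson Assistant phone integration, you would typically call this with a telephony model and let the flag fall back to `False` for languages like Canadian French that do not yet support low latency mode.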

All current languages (see the full list here) will be available on the next-generation engine in 2021; most will be available by the end of June.
