Building with Watson: Advanced audio transcription with Speech to Text

阿新 • Published: 2019-01-16

IBM Watson Senior Offering Manager Bhavik Shah discusses the Speech to Text service and the host of recent improvements and new features designed to make it more powerful than ever. He covers the latest enhancements, including language model customization and diarization.

Watson Speech to Text converts spoken audio into written text. An app that uses it can, for example, transcribe contact-center calls to identify what is being discussed, decide when to escalate a call, and follow content from multiple speakers. You can also build voice-controlled applications and customize the model to improve accuracy for the language and content you care about most, such as product names, sensitive subjects, or names of individuals.

The service offers three programming interfaces for transcribing speech to text:

The WebSocket interface provides a single version of the recognize method for transcribing audio.
The HTTP REST interface provides HTTP POST versions of the recognize method that transcribe audio with or without establishing a session with the service.
The asynchronous HTTP interface provides a non-blocking POST recognitions method for transcribing audio.
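
To make the difference between the three interfaces concrete, here is a minimal sketch in Python that only prepares requests and never sends them. The service URL and API key are placeholders, the helper names are hypothetical, and the basic-auth "apikey" scheme is an assumption based on how IBM Cloud credentials are commonly used; check your own service credentials before relying on it.

```python
import base64
import urllib.parse
import urllib.request

# Assumption: placeholder values; substitute the URL and key from your
# own service credentials.
SERVICE_URL = "https://example.invalid/speech-to-text/api"
API_KEY = "your-api-key"


def build_recognize_request(audio: bytes, content_type: str = "audio/flac",
                            asynchronous: bool = False) -> urllib.request.Request:
    """Prepare (but do not send) a POST request that carries the audio."""
    # Synchronous calls use the recognize method; the asynchronous
    # interface uses the recognitions method instead.
    path = "/v1/recognitions" if asynchronous else "/v1/recognize"
    token = base64.b64encode(f"apikey:{API_KEY}".encode()).decode()
    return urllib.request.Request(
        SERVICE_URL + path,
        data=audio,
        method="POST",
        headers={"Content-Type": content_type,
                 "Authorization": f"Basic {token}"},
    )


def websocket_url() -> str:
    """Derive the WebSocket recognize endpoint from the HTTPS service URL."""
    parts = urllib.parse.urlsplit(SERVICE_URL)
    return urllib.parse.urlunsplit(
        ("wss", parts.netloc, parts.path + "/v1/recognize", "", ""))
```

Passing a prepared request to urllib.request.urlopen would actually submit the audio. The WebSocket interface needs a WebSocket client library and its own authentication step, which this sketch leaves out.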

The language model customization interface lets you improve the accuracy of speech recognition for domains with industry-specific jargon such as medicine or information technology. Once you’ve customized the model, you can use it with your applications to provide customized speech recognition.
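
As a rough illustration of what customizing a model involves, the sketch below builds the JSON bodies you might send when creating a custom language model and adding domain terms to it. The helper names and the sample terms are hypothetical, and the field names (name, base_model_name, words, sounds_like, display_as) are assumptions drawn from the customization interface as commonly documented; verify them against the current API reference.

```python
import json


def new_model_body(name: str, base_model: str = "en-US_BroadbandModel") -> str:
    """JSON body describing a new custom language model (assumed fields)."""
    return json.dumps({"name": name, "base_model_name": base_model})


def custom_words_body(terms: dict[str, list[str]]) -> str:
    """JSON body listing domain terms and how each one may be pronounced."""
    words = [
        {"word": term, "sounds_like": pronunciations, "display_as": term}
        for term, pronunciations in terms.items()
    ]
    return json.dumps({"words": words})


# Hypothetical medical-domain terms, for illustration only.
body = custom_words_body({
    "NCPDP": ["N. C. P. D. P."],
    "tachycardia": ["tacky cardia"],
})
```

After the words are added, the custom model has to be trained before an application can reference it in recognition requests.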

Diarization (also known as speaker diarization) is the process of partitioning an input audio stream into separate segments according to the speaker's identity. With Watson, diarization can run in real time, so your app can use it on live conversations.
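
To show how an app might consume diarization output, here is a minimal Python sketch that groups transcribed words into speaker turns. It assumes the response carries word timestamps under results and a parallel speaker_labels list keyed by each word's start time; the sample data is made up and the function name is hypothetical.

```python
from itertools import groupby

# Made-up response fragment in the assumed shape of a transcription
# with speaker labels; all values are illustrative only.
SAMPLE = {
    "results": [
        {"alternatives": [{
            "transcript": "hello there hi ",
            "timestamps": [["hello", 0.0, 0.4],
                           ["there", 0.4, 0.8],
                           ["hi", 1.2, 1.5]],
        }]}
    ],
    "speaker_labels": [
        {"from": 0.0, "to": 0.4, "speaker": 0, "confidence": 0.6},
        {"from": 0.4, "to": 0.8, "speaker": 0, "confidence": 0.6},
        {"from": 1.2, "to": 1.5, "speaker": 1, "confidence": 0.5},
    ],
}


def turns_by_speaker(response: dict) -> list[tuple[int, str]]:
    """Collapse consecutive words from the same speaker into one turn."""
    # Map each word's start time to the speaker assigned to it.
    speaker_at = {label["from"]: label["speaker"]
                  for label in response["speaker_labels"]}
    words = [
        (speaker_at[start], word)
        for result in response["results"]
        for word, start, _end in result["alternatives"][0]["timestamps"]
        if start in speaker_at  # skip words that have no label yet
    ]
    return [(speaker, " ".join(word for _, word in group))
            for speaker, group in groupby(words, key=lambda pair: pair[0])]


turns = turns_by_speaker(SAMPLE)
```

Matching on the exact start time works here because both lists come from the same parsed response. In a live stream, labels can arrive after the words they describe, so a real app would need to buffer words until their labels show up.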
