Quick start guide

This project relies on fluteline, an easy-to-use, thread-based pipeline library (checking out its docs is highly recommended). It lets you build speech-to-text pipelines from modular components. First, instantiate the nodes that you want to use. Assuming source and destination are two such nodes, connect them with source.connect(destination). Then start each node with its .start method. When processing is done, turn off the nodes with their .stop method.
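To make the connect, start, stop lifecycle described above concrete, here is a toy sketch built only on Python's standard library. ToyNode, its input and output attributes, and the _STOP sentinel are hypothetical stand-ins written for this illustration. They are not fluteline's classes, and fluteline's real behaviour (for example, how pending items are handled on stop) may differ.

```python
import queue
import threading

_STOP = object()  # hypothetical sentinel that tells a node's thread to exit


class ToyNode:
    """Toy stand-in for a pipeline node (NOT fluteline's implementation)."""

    def __init__(self, work):
        self.work = work            # function applied to every incoming item
        self.input = queue.Queue()  # items waiting to be processed
        self.output = None          # downstream queue, set by connect()
        self._thread = threading.Thread(target=self._run)

    def connect(self, other):
        # Whatever this node produces becomes the other node's input.
        self.output = other.input

    def start(self):
        self._thread.start()

    def stop(self):
        # Queue the sentinel behind any pending items, then wait for the thread.
        self.input.put(_STOP)
        self._thread.join()

    def _run(self):
        while True:
            item = self.input.get()
            if item is _STOP:
                break
            result = self.work(item)
            if self.output is not None:
                self.output.put(result)


received = []
source = ToyNode(str.upper)             # transforms each item
destination = ToyNode(received.append)  # end-of-chain consumer

source.connect(destination)
destination.start()
source.start()

for word in ['hello', 'world']:
    source.input.put(word)

# Stop upstream first so everything it produced reaches the downstream node.
source.stop()
destination.stop()
print(received)
```

Each node runs in its own thread and communicates only through queues, which is the general idea behind a thread-based pipeline.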

The transcriber

class watson_streaming.Transcriber(settings, credentials_file=None, apikey=None, hostname=None)

A fluteline consumer-producer.

Send audio samples to it (1 channel, 44100 Hz, 16-bit, little-endian) and it will spit out the results from Watson.

Parameters:
  • settings (dict) – IBM Watson settings. Consult the official IBM Watson docs for more information.
  • credentials_file (string) – Path to your IBM Watson credentials. Alternatively, provide an apikey and hostname.
  • apikey (string) – API key for the IBM Watson service.
  • hostname (string) – IBM Watson hostname.
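Because the Transcriber expects 1-channel, 16-bit, 44100 Hz audio, it can help to inspect a .wav file before feeding it into a pipeline. The following is a minimal sketch using only the standard library's wave module. The names describe_wav, format_problems and EXPECTED are hypothetical helpers introduced here and are not part of watson_streaming.

```python
import wave

# Format the Transcriber documentation asks for (sample width is in bytes).
EXPECTED = {'channels': 1, 'sample_width': 2, 'frame_rate': 44100}


def describe_wav(path):
    """Read the header of a .wav file and return its basic properties."""
    with wave.open(path, 'rb') as f:
        return {
            'channels': f.getnchannels(),
            'sample_width': f.getsampwidth(),
            'frame_rate': f.getframerate(),
            # One frame holds one sample per channel, so this is in seconds.
            'duration': f.getnframes() / f.getframerate(),
        }


def format_problems(info):
    """Return a list of human-readable mismatches against EXPECTED."""
    problems = []
    for key, expected in EXPECTED.items():
        if info[key] != expected:
            problems.append(
                '{}: expected {}, got {}'.format(key, expected, info[key]))
    return problems
```

Calling format_problems(describe_wav('speech.wav')) (with a path of your own) returns an empty list when the file matches the expected format. Endianness is not checked: the wave module reads RIFF files, whose PCM samples are little-endian.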

Utilities

Convenient fluteline producers and consumers to use with the main watson_streaming.Transcriber.

class watson_streaming.utilities.FileAudioGen(audio_file)

Producer that spits out audio samples from a file.

Parameters:
  • audio_file (string) – Path to a .wav file.

class watson_streaming.utilities.MicAudioGen(*args, **kwargs)

Producer that spits out audio samples from your microphone.

class watson_streaming.utilities.Printer(*args, **kwargs)

End-of-chain consumer to print the transcript received from IBM Watson.
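If you want to do more than print the transcript, you can process the results yourself. The sketch below assumes that each item leaving the Transcriber is a dict shaped like the JSON messages of the IBM Watson Speech to Text WebSocket interface (a results list whose entries carry alternatives and a final flag). That assumption is not confirmed by this page, and extract_transcripts is a hypothetical helper, not part of watson_streaming.

```python
def extract_transcripts(message):
    """Return (transcript, is_final) pairs found in one Watson message.

    Messages without a 'results' key (such as state notifications)
    yield an empty list.
    """
    pairs = []
    for result in message.get('results', []):
        alternatives = result.get('alternatives', [])
        if alternatives:
            # The first alternative is the most likely hypothesis.
            text = alternatives[0]['transcript'].strip()
            pairs.append((text, result.get('final', False)))
    return pairs


# Hand-written sample messages for illustration, not real service output.
interim = {
    'result_index': 0,
    'results': [
        {'final': False, 'alternatives': [{'transcript': 'hello wor '}]},
    ],
}
final = {
    'result_index': 0,
    'results': [
        {'final': True, 'alternatives': [{'transcript': 'hello world '}]},
    ],
}

for message in (interim, final):
    for text, is_final in extract_transcripts(message):
        print('final' if is_final else 'interim', text)
```

With 'interim_results': True in the settings, the service sends hypotheses that are later refined, so checking the final flag lets you act only on finished results.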

Examples

The two examples below can help you understand how to use the library for your needs. The first one transcribes audio from the microphone using watson_streaming.utilities.MicAudioGen. The second example is similar, but transcribes audio from a file instead, using watson_streaming.utilities.FileAudioGen.

'''
Speech to text transcription, from your mike, in real-time, using IBM Watson.
'''

import argparse
import time

import fluteline

import watson_streaming
import watson_streaming.utilities


def parse_arguments():
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument('credentials', help='path to credentials.json')
    return parser.parse_args()


def main():
    args = parse_arguments()
    settings = {
        'inactivity_timeout': -1,  # Don't kill me after 30 seconds
        'interim_results': True,
    }

    nodes = [
        watson_streaming.utilities.MicAudioGen(),
        watson_streaming.Transcriber(settings, args.credentials),
        watson_streaming.utilities.Printer(),
    ]

    fluteline.connect(nodes)
    fluteline.start(nodes)

    try:
        while True:
            time.sleep(10)
    except KeyboardInterrupt:
        pass
    finally:
        fluteline.stop(nodes)


if __name__ == '__main__':
    main()

'''
Speech to text transcription, from an audio file, in real-time, using
IBM Watson.
'''

import argparse
import contextlib
import time
import wave

import fluteline

import watson_streaming
import watson_streaming.utilities


def parse_arguments():
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument('credentials', help='path to credentials.json')
    parser.add_argument('audio_file', help='path to .wav audio file')
    return parser.parse_args()


def main():
    args = parse_arguments()
    settings = {
        'interim_results': True,
    }

    nodes = [
        watson_streaming.utilities.FileAudioGen(args.audio_file),
        watson_streaming.Transcriber(settings, args.credentials),
        watson_streaming.utilities.Printer(),
    ]

    fluteline.connect(nodes)
    fluteline.start(nodes)

    try:
        with contextlib.closing(wave.open(args.audio_file)) as f:
            # A frame already holds one sample per channel, so the duration
            # in seconds is simply frames / frame rate.
            wav_length = f.getnframes() / float(f.getframerate())
        # Sleep till the end of the file + some seconds slack
        time.sleep(wav_length + 5)
    finally:
        fluteline.stop(nodes)


if __name__ == '__main__':
    main()