Using Multiple TTS Streams On The Emacspeak Audio Desktop
1 Executive Summary
Emacspeak now uses multiple text-to-speech streams — as an example, this enables spoken notifications that do not interrupt ongoing spoken output. To make such notifications more perceivable, Emacspeak places notifications to the right of the user by leveraging Linux-ALSA features that allow one to scale the amplitude of the left and right audio channels.
2 Background
Until now, Emacspeak has used a single instance of a Text-To-Speech (TTS) engine to produce all spoken feedback. An unfortunate consequence is that any spoken announcement necessarily interrupts ongoing speech; as an example, an incoming instant-message (e.g., Jabber notification) can interrupt what you're currently reading.
Emacs itself produces a large number of asynchronous messages
depending on the number of processes running within Emacs; at present,
all Emacs generated messages are equal though there are ongoing
plans to improve this situation going forward, e.g., using package
alert
. With Emacspeak now able to use multiple TTS streams, arrival
of package alert
within Emacs should facilitate smarter handling of
different categories of messages over time.
Playing multiple TTS streams simultaneously can make it hard to
understand the resulting output; Emacspeak leverages underlying ALSA
functionality to send notifications to a virtual ALSA device that
places the auditory output mostly on the right channel. See the
following paragraphs on setup/configuration. I'm presently using this
on Linux with the linux-outloud
voice — you need to have a copy of
this TTS engine installed and working — see Voxin for details on
obtaining that engine. Note: the Emacspeak espeak
server does not
use raw ALSA for its output — consequently, notifications produced
by espeak
play on both left and right channels, making it
impossible to understand. The mac
server may be able to support
this functionality using something Mac-specific — patches welcome.
3 Emacspeak Setup
- Emacspeak now adds user-option
emacspeak-tts-use-notify-stream
. If this is set tot
in the user's initialization file before Emacspeak is loaded, Emacspeak checks to see if the user's selected TTS engine supports multiple instances, and if so launches a second instance of the TTS engine for use as a Notification TTS Stream. See mytvr/emacs-startup.el
in the Emacspeak Git Repository for an example setup. - The Notification TTS Stream can be restarted via command
dtk-notify-initialize
bound toC-e d C-n
. You should ordinarily not need to invoke this command. - The Notification TTS Stream can be shut-down using command
dtk-notify-shutdown
bound toC-e d C-s
. When the /Notification TTS Stream is not available, Emacspeak defaults to using a single TTS stream for all spoken output — i.e., no change. - At present, emacspeak tries to use a separate Notification TTS Stream when the selected TTS engine is a software TTS running locally.
- File
servers/linux-outloud/notify-asoundrc
contains the.asoundrc
that I am using on my thinkpad. To have Emacspeak place the Notification TTS Stream mostly on the right, the contents of that file (suitably modified for your sound card) need to be placed in file$HOME/.asoundrc
. Warning: Handle with care — a broken.asoundrc
can kill all audio output. - The
.asoundrc
scales left and right amplitude to place the output mostly on the right — to change this behavior, you can edit the Transformation Table for virtual devicetts_mono
in the.asoundrc
file. - This set-up has not been tested with
pulseaudio
.
4 Summary
Share and enjoy —