
Let Linux speak (LINUX JOURNAL Jan. 1997)



The January 1997 edition of LINUX JOURNAL contains an 
interesting article about setting up a speech 
server to make Linux speak, written by David Sugar.


[advertising on]
Linux Journal is a fast-developing monthly 
magazine which offers help for Linux beginners, 
tips & tricks for the daily ride on your Linux 
computer, introductions to new software available 
for Linux, and articles which shed light on the 
guts of the OS. So with LINUX JOURNAL, you really 
are sitting in the front row of the operating
system theater.

LINUX JOURNAL can be reached at http://ssc.com/lj.
[advertising off]


In "Let Linux Speak" David Sugar describes a tool 
which enables your Linux computer to speak. It all 
begins when David finds an ad for a speech synthesizer 
in one of those electronic magazines.

It is a low-cost serial-based text-to-speech synthesizer 
using the SP0256-AL2 chip, probably the chip used in 
the Mattel "Speak & Spell" toy. The cost of this chip 
is about 50 USD. He orders one, and after a few weeks his 
chip arrives.

The board slots into a PC slot only to draw power 
(you could also use a separate power unit) and connects 
to a serial port through an RS-232 connector. 

The board has its own built-in speaker and uses an RCA 
jack for the sound input. After a little tweaking David 
runs

     echo "Hello, my name is Rochester!" > /dev/ttyS2

and he gets a, as he describes it, "harsh sounding 
cybernetic" response. The first problem is that the 
text-to-speech algorithm handles words only. Numbers are 
spoken as a sequence of digits, so 91 becomes "nine 
one" instead of "ninety one". He solves this problem 
with a lookup table which translates numbers the right 
way.
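
A lookup table of this kind might look roughly like the 
following Python sketch. The article summary does not show 
David's actual table or code, so the names ONES, TENS, 
number_to_words and expand_numbers are hypothetical, and 
the range is limited to 0..999 to keep the example short.

```python
import re

# Lookup tables for the irregular small numbers and the tens.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 999 using the lookup tables."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only covers 0..999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + (" " + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


def expand_numbers(text: str) -> str:
    """Replace each standalone run of one to three digits with words."""
    return re.sub(r"\b\d{1,3}\b",
                  lambda m: number_to_words(int(m.group())), text)
```

With this sketch, expand_numbers("Load is 91 percent") 
yields "Load is ninety one percent". Longer digit runs 
are left untouched.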

The second problem is that the device acts as a 
text-to-phonetic speech device. You cannot use control 
chars or escape sequences to influence the production 
of speech. This can be resolved by using alternate 
spellings to generate different phonetic choices.

Since extensive table substitution is now needed, 
David decides to write a driver as a front end for 
the device. This driver should be able to read normal 
text, including numbers, which should be pronounced as 
numbers and not as digits. David also wants it to speak 
numeric constructs such as currency amounts, date and 
time fields, percentages and telephone numbers. 
Internet stuff like Internet addresses should also be 
pronounced correctly.
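
A front end of this kind could recognize a few numeric 
constructs with regular expressions before the text reaches 
the device. The sketch below is an assumption about how 
that might be done, not David's driver: the function name 
expand_constructs and the exact patterns (US-style dollar 
amounts, a percent sign, seven-digit phone numbers) are 
chosen only for illustration. The remaining numbers stay as 
digits for a later number-to-words step.

```python
import re


def expand_constructs(text: str) -> str:
    """Rewrite a few numeric constructs into speakable phrases."""
    # Currency with cents: "$5.25" -> "5 dollars and 25 cents"
    text = re.sub(
        r"\$(\d+)\.(\d{2})\b",
        lambda m: f"{m.group(1)} dollars and {int(m.group(2))} cents",
        text)
    # Whole-dollar currency: "$50" -> "50 dollars"
    text = re.sub(r"\$(\d+)\b", r"\1 dollars", text)
    # Percentages: "91%" -> "91 percent"
    text = re.sub(r"(\d+)%", r"\1 percent", text)
    # Telephone numbers are read digit by digit: "271-5546" -> "2 7 1, 5 5 4 6"
    text = re.sub(
        r"\b(\d{3})-(\d{4})\b",
        lambda m: " ".join(m.group(1)) + ", " + " ".join(m.group(2)),
        text)
    return text
```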

He builds a server which sits on a TCP socket accepting
connections from user applications. The server 
pronounces any text it receives and can also spell out 
words and single digits when in a special escape mode. 
The server accepts only one connection at a time and 
holds it until the client closes it, so speech from 
multiple sources cannot be garbled together.
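
The one-connection-at-a-time behaviour can be sketched in 
Python as below. This is not David's server: the function 
names handle_client and serve, the line-based protocol, and 
the speak callback (which would write to the serial device) 
are all assumptions made for illustration.

```python
import socket


def handle_client(conn: socket.socket, speak) -> None:
    """Read text from one client until it closes, speaking each line."""
    with conn, conn.makefile("r", encoding="ascii", errors="replace") as lines:
        for line in lines:
            line = line.strip()
            if line:
                speak(line)


def serve(host: str, port: int, speak) -> None:
    """Accept clients strictly one after another, so speech never overlaps."""
    with socket.create_server((host, port)) as listener:
        while True:
            conn, _addr = listener.accept()
            # Blocks until this client disconnects; other clients wait
            # in the listen queue in the meantime.
            handle_client(conn, speak)
```

Because serve() handles each client to completion before 
calling accept() again, a second application simply waits 
until the first one closes its connection.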

Once speech is up and running, David adds other system 
services. He monitors his BBS: users logging in and out 
are announced by the device, and the sysop page can now 
be spoken. Really hot is the new "down" feature, which 
can be used as a replacement for the "shutdown" command. 
Reminiscent of Star Trek, down provides a system shutdown 
warning with ten seconds to override, counting off each 
second before execution. Cancellation is possible by 
pressing the <Break> key. 
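
The countdown logic of such a "down" command might be 
sketched as follows. The function name countdown and its 
parameters are hypothetical: speak stands for whatever 
sends text to the speech server, and cancelled is a 
callback that reports whether the user has asked to abort 
(for example by pressing <Break>).

```python
import time


def countdown(speak, cancelled, seconds: int = 10, sleep=time.sleep) -> bool:
    """Count down aloud; return True if the shutdown should proceed."""
    speak(f"System shutdown in {seconds} seconds")
    for remaining in range(seconds, 0, -1):
        if cancelled():
            speak("Shutdown cancelled")
            return False
        speak(str(remaining))
        sleep(1)
    # Give the user one last chance after the final second.
    if cancelled():
        speak("Shutdown cancelled")
        return False
    return True
```

A caller would run the real shutdown only when countdown() 
returns True.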

He adds spoken announcements of incoming email and 
hourly monitoring of disk usage. Since a speaking Linux 
box might be annoying at night, he adds a muting schedule 
to silence the server during sleeping hours.
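
A muting schedule comes down to checking whether the 
current time falls inside a quiet window, which may wrap 
past midnight. The sketch below assumes a window from 
23:00 to 07:00; those hours and the name is_muted are 
illustrative and not taken from the article.

```python
from datetime import time


def is_muted(now: time, start: time = time(23, 0),
             end: time = time(7, 0)) -> bool:
    """Return True if `now` lies in the quiet window [start, end)."""
    if start <= end:
        # Window within a single day, e.g. 12:00 to 14:00.
        return start <= now < end
    # Window wraps past midnight, e.g. 23:00 to 07:00.
    return now >= start or now < end
```

The server would call is_muted() with the current time of 
day before speaking and drop the text when it returns True.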

David thinks that the speech device has proven to be 
quite useful. The pronunciation dictionary can be expanded 
as needed to cover a wider range of words. The server 
might be usable as a screen reader for people who are 
blind and want to run their own Linux box.


PS:
The SP0256-AL2 text-to-speech chip described here 
may be purchased through B.G. Micro, P.O. Box 280298, 
Dallas, TX 75228, (214) 271-5546. The Computalker lists 
for around USD 50.- as PC card or USD 80.- stand alone 
with power adapter. Chips are available separately 
and the Computalker should be available in kit form.

PPS:
David Sugar, developer of WorldVU, a Bulletin Board 
System (BBS) for Linux, is currently employed as 
director of software engineering for Fortran Corp. 
and uses Linux for commercial telephony development. 
He maintains his own Internet server under Linux and 
may be reached for comment via http://www.tycho.com   

regards 
Hans



