[SpeechIO-99] flow control / festival-talk not working ? (fwd)




Alan W Black is one of the main guys working on festival.
__________________________________________________________________
PGP fingerprint = 03 5B 9B A0 16 33 91 2F  A5 77 BC EE 43 71 98 D4
            darxus@op.net / http://www.op.net/~darxus
                         Far Beyond Reason

---------- Forwarded message ----------
Date: Wed, 11 Aug 1999 18:28:10 -0400 (EDT)
From: awb@drum.speech.cs.cmu.edu
Reply-To: Alan W Black <awb@cs.cmu.edu>
To: Darxus <darxus@op.net>
Subject: flow control / festival-talk not working ?


 Darxus writes on 11 August 1999:
 > 
 > I posted this to the festival-talk list 2 days ago, and it does not appear
 > to have worked.  I'd really like to be able to do the flow control thing.

I didn't see it on festival-talk, so it didn't make it.  I'll chase that
up, but I'll answer you here.

 > To: festival-talk@cstr.ed.ac.ukf
                                ^^^
That'll be the problem: it should be uk, not ukf.

 > Subject: speechd / flow control / etc
 > 
 > 
 > Since 11/27/98, I, and a couple other people have been working on speechd,
 > which creates a file called /dev/speech.  Any text written to this file
 > will be spoken aloud using either festival or rsynth.
 > 
 > Web page is http://SpeechIO.undef.net
 > 
 > I'd love to know what you guys think of this thing.  Like, is it a
 > horrible abomination that does everything wrong w/ festival ?  I know
 > nothing about speech synthesis, so this is entirely possible.

This is a good idea.  I think the idea of a speech daemon is good,
and I expect others will want to use TTS like that soon.

 > It was created to be an extremely easy way to implement speech synthesis
 > support in applications.  A number of people have told me they love the
 > idea.  I've created an ircII (Internet Relay Chat) script that makes use
 > of it.  I've modified a slashdot ticker to support speech (just added
 > like, 5 lines of code).  And someone else did support for AOL's Instant
 > Messenger program.

Cool, I like to see Festival integrated into such things; it's exactly
what it's for.

 > speechd is written in perl & opens up a TCP socket connection to the
 > festival server once when it starts.  So it only forks once, for all the
 > programs that write to /dev/speech.  
 > 
 > The /dev/speech file is a FIFO.  
 > 
 > I would like to be able to do some flow control, but nothing that I read
 > off of the socket connection is sent after the speech is finished.
 > 
 > A blind person came into the #SpeechIO IRC channel (on EFNet) yesterday,
 > and I was talking to him about screenreaders.  Apparently he's got a
 > really nice one for DOS, but there isn't anything acceptable for Linux.
 > I'd like to help him out, but I need to be able to do flow control.

Well, there is at least one screenreader for Linux and
a project dedicated to Linux for the blind (http://leb.net/blinux/),
but they could probably do with some help.  Ultimately what we want to
do is port Emacspeak to use Festival and thus have a complete free
screenreader for the blind.  *Many* blind people use Emacspeak,
but it currently requires a separate hardware box that runs the
DECtalk synthesizer.  I've been in communication with the author
of Emacspeak (who is blind) and we are negotiating it, but
it needs some lower-level changes to Festival to make it work
well.

 > Right now, speechd is reading everything in from /dev/speech, and writing
 > it to festival as soon as it gets it.  I want to be able to read a line of
 > text from /dev/speech, write it to festival, wait for a response from
 > festival saying that it has completed speaking what I've submitted, and then
 > repeat (read the next line).  

This is reasonable, and the server/client interface is probably the
way to go, as it's much easier to control when things happen then.  If
you simply send the line to Festival and have Festival reading from
stdin, it won't see the end of the utterance, as it actually needs the
next token before it starts synthesizing.  You can fake this
by introducing a special word, e.g. ___end_of_utterance__, whose
pronunciation is defined to be no phonemes (simply
add it to the lexicon), and inserting that in the stream to Festival
between each line.
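
The interleaving step described above could be sketched roughly as
follows.  This is an illustration in Python, not code from speechd (which
is written in Perl) or from Festival.  The function name with_end_markers
is hypothetical, the marker spelling is copied from the message, and the
sketch assumes the marker word has already been added to the lexicon with
an empty pronunciation, which is not shown here.

```python
from typing import Iterable, Iterator

# Marker word copied from the message above.  Assumption: it is already in
# the lexicon with no phonemes, so it is never heard.
END_MARKER = "___end_of_utterance__"


def with_end_markers(lines: Iterable[str], marker: str = END_MARKER) -> Iterator[str]:
    """Yield each non-empty line followed by the marker token.

    The trailing marker supplies the "next token" that the synthesizer
    is described as needing before it starts on the preceding line.
    """
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines; there is nothing to speak
        yield text
        yield marker


if __name__ == "__main__":
    for chunk in with_end_markers(["Hello world.\n", "\n", "Second line.\n"]):
        print(chunk)
```

Each yielded chunk would then be written to the stream that Festival is
reading, in order, so that every line is immediately followed by the
silent marker.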

 > 
 > Satyanarayana: I believe you wanted an example of a working festival
 > client.  speechd counts as one.  I could probably re-implement (at least
 > some of) festival_client, if anybody would be interested in me doing that.

There's also a trivial C one in the distribution
(festival/examples/festival_client.c), as well as a Perl one in there too.
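
For readers without the distribution to hand, a very small client might
look something like the sketch below.  It is not the bundled C or Perl
client.  It assumes a Festival server was started with "festival --server"
on the same machine and is listening on the default port 1314; the
function names and the latin-1 encoding are assumptions made for
illustration.  It does not read or interpret the server's reply, so it
provides no flow control.

```python
import socket

FESTIVAL_HOST = "localhost"  # assumption: server runs on the same machine
FESTIVAL_PORT = 1314         # Festival's default server port


def say_text_command(text: str) -> str:
    """Build a Scheme (SayText "...") expression for the given text.

    Backslashes and double quotes are escaped so the text stays inside
    the Scheme string literal.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'(SayText "{escaped}")\n'


def send_to_festival(text: str,
                     host: str = FESTIVAL_HOST,
                     port: int = FESTIVAL_PORT) -> None:
    """Open a TCP connection and send one SayText command.

    The reply is deliberately ignored; a real client would keep the
    connection open and read the server's response.
    """
    with socket.create_connection((host, port)) as conn:
        # Assumption: the server accepts latin-1 bytes; unencodable
        # characters are replaced rather than raising an error.
        conn.sendall(say_text_command(text).encode("latin-1", errors="replace"))
```

With a server running, calling send_to_festival("Hello there.") should be
enough to try it out.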

 > One other thing -- I would suggest releasing a windows binary, even if you
 > aren't going to support it.  

We considered it, but we already get a lot of mail.  At present it's
usually interesting, as the people who run Festival are usually doing
interesting things with it.  If we released a Windows binary, lots of
people would download it just for TTS, and it probably wouldn't work
due to weird setups in their Windows environment, and then they would
mail us, even if we say there is no support.  None of the support
team use Windows regularly, and just telling people "We don't know why
you got a GPF" is actually quite a lot of work.  So the lack of a
Windows binary is deliberately restricting the sorts of people who are
using the system.

However, as I've just moved to Carnegie Mellon University in the US,
where I think we might have more time and expertise to support
it, we might release a Windows binary soon (or if I can find some other
group willing to distribute it; Oregon Graduate Institute did
release an older version for some time).

Alan

Alan W Black                                email: awb@cs.cmu.edu
Language Technologies Institute             http://www.cs.cmu.edu/~awb/
Carnegie Mellon University                  tel: +1-412-268-6299  
5000 Forbes Ave, Pittsburgh PA, 15213, USA. fax: +1-412-268-6298