>I think windows 98 is the 1st microsoft product capable of playing
>multiple sounds simultaneously, and esd is the 1st common program that
>allows multiple simultaneous sounds under linux.  So multiple simultaneous
>sounds of any kind hasn't gotten a whole lot of attention.

Well, if people are not yet into syncing .mp3 and .wav files, then how
about sampling a voice file and extracting a set of phonemes?  Let me
think.  A few years ago, maybe a decade or so, there was work on
something called CELP.  I think it was a speech coding scheme used in
early digital cell phones, or something like that.  If I remember
right, it stood for something like Code Excited Linear Prediction.

I was thinking that a program might be developed that could break a
song down into two parts.  The music could be encoded as MIDI, and a
speech recognizer could build a codebook of voice sounds.  At low
bandwidth one could then transmit the MIDI music component plus a
stream of codebook entries, from which the vocals could be
resynthesized as text to speech enhanced with duration and pitch.
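
To make the codebook idea a bit more concrete, here is a rough sketch
in Python.  It is plain nearest-match lookup against a codebook, not
real CELP (which also involves a linear prediction filter and a search
over excitation signals).  The codebook contents, the two-number
feature vectors, and the names encode/decode are all made up for
illustration.

```python
from math import dist

# Hypothetical toy codebook: each entry stands in for one voice sound,
# described by a made-up 2-number feature vector.
CODEBOOK = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]


def nearest_index(codebook, features):
    """Return the index of the codebook entry closest to `features`."""
    return min(range(len(codebook)), key=lambda i: dist(codebook[i], features))


def encode(codebook, frames):
    """Turn (features, pitch_hz, duration_ms) frames into small symbols.

    Only the codebook index is sent in place of the features; pitch and
    duration ride along so the receiver can shape the resynthesis.
    """
    return [(nearest_index(codebook, feats), pitch, dur)
            for feats, pitch, dur in frames]


def decode(codebook, symbols):
    """Look each index back up; the result is only an approximation."""
    return [(codebook[i], pitch, dur) for i, pitch, dur in symbols]


if __name__ == "__main__":
    frames = [((0.1, -0.2), 110, 80), ((0.9, 1.2), 120, 60)]
    symbols = encode(CODEBOOK, frames)
    print(symbols)
    print(decode(CODEBOOK, symbols))
```

The saving comes from sending one small index per frame instead of the
sound itself, as long as both ends hold the same codebook.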

That would give a low bandwidth way of transmitting speech to a text
to speech synthesizer.  Then, by running the text through a language
translator, one could create a matched pair of speech streams that
could be adjusted into time synchronization.
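
One simple way the time adjustment might work, sketched in Python: if
each sentence has a known spoken duration in both languages, stretch
or squeeze both versions toward their average.  The function name
sync_factors and the choice of the average as the target are my
assumptions, not an established method.

```python
def sync_factors(durations_a, durations_b):
    """Plan per-sentence time stretching for two matched speech streams.

    durations_a, durations_b: seconds each sentence takes in language A
    and language B, in the same sentence order.

    Returns a list of (target, factor_a, factor_b).  Multiplying a
    sentence's duration by its factor gives the shared target length,
    so a factor above 1 slows that sentence down and a factor below 1
    speeds it up.
    """
    if len(durations_a) != len(durations_b):
        raise ValueError("both streams need the same number of sentences")
    plan = []
    for a, b in zip(durations_a, durations_b):
        target = (a + b) / 2.0  # assumed target: meet in the middle
        plan.append((target, target / a, target / b))
    return plan


if __name__ == "__main__":
    # Sentence 1 takes 2 s in one language and 4 s in the other.
    for target, fa, fb in sync_factors([2.0, 3.0], [4.0, 3.0]):
        print(target, fa, fb)
```

Since the synthesizer already takes duration as an input, these
factors could be applied to the durations before resynthesis rather
than to the finished audio.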

From that it would be possible to create a streaming news service with
stereo two language audio and something like two language, time
synchronized subtitles.  In other words, the reader could see the
highlighted, or bouncing ball, interlinear text at the bottom of the
screen, and have something like an automobile accelerator pedal to
speed up and slow down both the visual and the stereo bilingual
output.  The reader would hear the two languages, one in each ear,
and have a joystick or steering wheel to shift the volume to the
right channel, the left channel, or near equal audio.
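
The two controls could be mapped roughly like this in Python.  The
input ranges (pedal from 0 to 1, wheel from -1 to 1), the speed limits,
and the function names are all assumptions for the sake of a sketch.

```python
def clamp(value, low, high):
    """Keep value within [low, high]."""
    return max(low, min(high, value))


def playback_rate(pedal, slowest=0.5, fastest=2.0):
    """Map pedal travel (0.0 released, 1.0 floored) to a speed multiplier.

    The same multiplier would drive both the subtitle highlighting and
    the audio, so the two stay in step.
    """
    pedal = clamp(pedal, 0.0, 1.0)
    return slowest + pedal * (fastest - slowest)


def balance_gains(steer):
    """Map wheel position (-1.0 full left, 1.0 full right) to channel gains.

    Returns (left_gain, right_gain).  Centered, both channels play at
    full volume; turning toward one side fades the other side out.
    """
    steer = clamp(steer, -1.0, 1.0)
    left = min(1.0, 1.0 - steer)
    right = min(1.0, 1.0 + steer)
    return left, right


if __name__ == "__main__":
    print(playback_rate(0.5))
    print(balance_gains(0.0))
    print(balance_gains(1.0))
```

With one language per channel, turning the wheel all the way to one
side leaves only that side's language audible.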