

Re: Voice processing technology

From: bloch%mandrill@ucsd.edu (Steve Bloch)
Date: 5 Nov 89 06:39:51 GMT
Subject: Re: Voice processing technology
Keywords: FFT, IIR, DASP
Newsgroups: rec.music.gaffa
Organization: University of California, San Diego
References: <3801@ur-cc.UUCP> <4331@blake.acs.washington.edu>
Reply-To: bloch%mandrill.UUCP@ucsd.edu (Steve Bloch)
Sender: nobody%sdcsvax@ucsd.edu
Summary: FFT or IIR



boris@prodigal.psych.rochester.edu writes:
>Ever since I heard Laurie Anderson I've been wondering... how exactly
>do they do this filtering to change the sound of someone's voice, or
>to make it sound like 3 people singing at once...?

Jon Drukman writes:
>Well, there's two kinds of boxes.  The 'vocoder' ...
>
>The harmonizer is a device which takes your voice and electronically
>alters the frequencies in it (how, I'm not exactly sure) to produce a
>harmony line with it. 

Donley describes a sample-and-chop approach and a frequency-dividing
approach.
>I would NOT expect that harmonizers do complete spectral analysis
>on samples in real time -- even the Fairlight doesn't do that!
Of course, the special-purpose FFT chips are getting faster every day.
Let's see... to do it in real time, assuming mono input and say a
24 kHz sampling rate, you need to do a 1024-point FFT in about 43 msec
(the time it takes 1024 samples to arrive).  That's within the
capabilities of current hardware, I think.  'Course, you'd only get a
frequency resolution of about 23 Hz, which wouldn't be good enough to
produce a clean harmony in the 500-2000 Hz range (at 500 Hz that's
more than a quarter-tone).  If you can do a 2048-point FFT in
real time (here you have about 85 msec to do it), the resolution
becomes about 12 Hz.  Check the newsgroup comp.dsp for more accurate
answers.
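
A quick sketch of that arithmetic, for anyone who wants to try other
frame sizes.  The function names are made up for illustration, and the
quarter-tone comparison assumes equal temperament; the only facts used
are that an N-point frame spans N/fs seconds and its bins sit fs/N Hz
apart.

```python
def fft_frame_stats(sample_rate_hz, n_points):
    """Return (frame duration in msec, bin spacing in Hz) for one FFT frame."""
    frame_msec = 1000.0 * n_points / sample_rate_hz
    bin_hz = sample_rate_hz / n_points
    return frame_msec, bin_hz


def quarter_tone_hz(freq_hz):
    """Width in Hz of one quarter-tone step above freq_hz (equal temperament)."""
    return freq_hz * (2.0 ** (1.0 / 24.0) - 1.0)


for n in (1024, 2048):
    msec, bin_hz = fft_frame_stats(24000, n)
    # Express the bin spacing in quarter-tones at the low end of the range.
    steps = bin_hz / quarter_tone_hz(500.0)
    print(f"{n}-point FFT: {msec:.1f} msec per frame, {bin_hz:.1f} Hz per bin, "
          f"{steps:.2f} quarter-tones at 500 Hz")
```

At 24 kHz this gives roughly 42.7 msec and 23.4 Hz for 1024 points, and
85.3 msec and 11.7 Hz for 2048 points.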

>Any other possibilities?

Back to Jon:
>With the advent of digital sampling technology, all this stuff is now
>a piece of cake, and you can buy cheap boxes to do it.  I have a
>cartridge for my computer which when coupled with appropriate software
>can transform the pitch of any incoming signal.  If you put a didgeridoo
>into it playing only one note (since that's all they can play) and
>then played a melody on a MIDI keyboard, it would 'play' the melody
>with a didgeridoo sound.
If I understand this right, it's just straight play-back-faster-or-
slower, which also squeezes or stretches the envelope (the ADSR
timing), noticeably so if you take it very far (like, more than half
an octave or so).
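
A minimal sketch of the play-back-faster idea, assuming simple linear
interpolation between stored samples (real samplers may well do
something better; `resample` is a made-up name).  It shows why the
envelope changes: the shifted note gets shorter or longer along with
its pitch.

```python
import math


def resample(samples, ratio):
    """Read `samples` at `ratio` times normal speed, interpolating linearly."""
    out = []
    pos = 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        # Blend the two stored samples on either side of the read position.
        out.append((1.0 - frac) * samples[i] + frac * samples[i + 1])
        pos += ratio
    return out


fs = 8000
tone = [math.sin(2 * math.pi * 200 * n / fs) for n in range(fs)]  # 1 sec at 200 Hz
up_fifth = resample(tone, 1.5)  # pitch goes up by a ratio of 3:2
print(len(tone), len(up_fifth))
```

Raising the pitch by a fifth leaves only about two-thirds as many
samples, so the attack and decay go by that much quicker too.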

But if all you want to do is echo a voice at a particular pitch (or
several pitches), and you don't want to change what pitch it is too
often, you can do it with an IIR filter, using very short digital
feedback to build resonances at whatever pitches strike your fancy.
I've always assumed that was how Laurie did it, as it's
computationally very easy (to resonate two pitches, you need a
four-pole filter, which only requires about four adds and four
multiplies per sample, and an 8086 can do that).
The only problem is that DESIGNING the IIR filter, figuring out the
coefficients to suit the pitches you want to resonate, takes some
work, and a grasp of complex analysis doesn't hurt.  You don't want
to do it in real-time.
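
For the curious, here is one textbook way to get such coefficients, by
placing a conjugate pair of poles at each pitch and multiplying the two
denominators together.  This is a sketch under my own assumptions (the
pitches, the pole radius of 0.999, and all the function names are
illustrative), not a claim about how any particular box does it, and
it makes no attempt to normalize the gain.

```python
import math


def resonator_denominator(freq_hz, sample_rate_hz, radius):
    """Denominator [1, a1, a2] for poles at radius * e^(+/- j*theta)."""
    theta = 2.0 * math.pi * freq_hz / sample_rate_hz
    return [1.0, -2.0 * radius * math.cos(theta), radius * radius]


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def all_pole_filter(x, a):
    """Run y[n] = x[n] - a[1]*y[n-1] - ... - a[k]*y[n-k], with a[0] == 1."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]  # one multiply per feedback coefficient
        y.append(acc)
    return y


fs = 24000
# Cascade two two-pole resonators (440 Hz and 660 Hz) into one four-pole filter.
a = poly_mul(resonator_denominator(440.0, fs, 0.999),
             resonator_denominator(660.0, fs, 0.999))
impulse = [1.0] + [0.0] * 999
ringing = all_pole_filter(impulse, a)
print(len(a) - 1, "poles; coefficients:", [round(c, 4) for c in a])
```

The design step is the trigonometry and polynomial multiply up top; the
per-sample loop only touches the four feedback coefficients.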

By the way, you notice that whenever Laurie has her voice echoed on a
particular fixed harmony it "rings" for a while?  That's a direct
effect of the IIR ("infinite impulse response" means that technically
it rings forever, but it may drop below audibility in less than a
second).  How long it rings depends on how precise you want your
pitches to be; if you want absolutely perfect tuning, it WILL ring
forever, without attenuating.
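
That trade-off can be put in rough numbers.  The sketch below assumes a
two-pole resonator whose poles sit at radius r inside the unit circle,
so its ringing decays as r^n; the 60 dB figure for "below audibility"
and the bandwidth rule of thumb (good only for r close to 1) are my
assumptions, and the function names are made up.

```python
import math


def ring_time_sec(radius, sample_rate_hz, drop_db=60.0):
    """Seconds for the r**n envelope to fall by drop_db decibels."""
    if radius <= 0.0:
        raise ValueError("radius must be positive")
    if radius >= 1.0:
        return math.inf  # poles on the unit circle: it never dies away
    samples = (-drop_db / 20.0) * math.log(10.0) / math.log(radius)
    return samples / sample_rate_hz


def approx_bandwidth_hz(radius, sample_rate_hz):
    """Rule-of-thumb 3 dB width of the resonance, for radius near 1."""
    return (1.0 - radius) * sample_rate_hz / math.pi


for r in (0.99, 0.999, 0.9999, 1.0):
    print(r, ring_time_sec(r, 24000), approx_bandwidth_hz(r, 24000))
```

Pushing r toward 1 narrows the resonance (sharper tuning) and lengthens
the ring; at exactly 1 the width goes to zero and the ring time to
infinity.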

Boris writes again:
>Or suggest a good book to read? [or words to that effect]
How about _Digital_Audio_Signal_Processing_(an_Anthology)_, edited by
John Strawn, Wm. Kaufmann 1985?  Or you could type "g comp.dsp".

"Writers are a funny breed -- I should know." -- Jane Siberry

bloch%cs@ucsd.edu