Speech Synthesis: Vowels and Diphthongs
Introduction
Speech synthesis employs a source-filter model of production.
Although originally applied to speech by Gunnar Fant, the
source-filter model extends to sound production in general.
It consists of two phases:
- The source (also excitation) phase introduces energy into the sound-production system. For vowel sounds this phase is represented by the vibration of the vocal folds at the glottis, which is powered by the lungs.
- The filter (also resonance) phase transfers energy to particular frequencies, or frequency regions, at the expense of others.
For vowel sounds this phase is represented by the cavities of the throat and mouth, along with the opening at the lips.
Each of these components of the vocal tract encourages the production of standing waves in specific frequency regions, producing
a bell-shaped peak in the frequency spectrum. These peaks are known as formants.
The opposite of a formant is an antiformant or antiresonance, which
appears as a trough in the frequency spectrum.
For background reading on vowels I direct you to James Kirby's PowerPoint, Phonetic analysis of vowels.
Orchestra
Figure 1 (a): Instrument #101: Buzz1 realizes the excitation phase for vowels, diphthongs, glides, and liquids.
Figure 1 (b): Instrument #119: RMS1 captures the power envelope from Buzz1 output.
Figure 1 (c): Instrument #121: Mouth1 realizes the resonance phase for vowels, diphthongs, glides, and liquids.
Figure 1 (d): Instrument #199: Rebalance1 restores the power envelope from Buzz1 to the audio output from Mouth1 and buffers the result out to file.
A quartet of instruments is sufficient to synthesize vocal sounds not just for the vowels and diphthongs explored on the present
page, but also for the glides and liquids explored in the next iteration. This quartet of instruments consists of:
- Instrument #101: Buzz1, which employs the design shown in Figure 1 (a).
- Instrument #119: RMS1, which employs the design shown in Figure 1 (b) (replicating the design developed for Instrument #5: RMS of NoiseOrch.xml).
- Instrument #121: Mouth1, which employs the design shown in Figure 1 (c).
- Instrument #199: Rebalance1, which employs the design shown in Figure 1 (d) (replicating the design developed for Instrument #21: Output of NoiseOrch.xml).
Issues that arise when applying filtering to vowel sounds are worked through in
Emulating Oral Resonances, and the insights gained there have been incorporated into
Instrument #121: Mouth1. The main thing to understand about Instrument #121: Mouth1
is that it employs three band-pass resonators to produce low (Unit #3), middle (Unit #4),
and high (Unit #5) formant peaks. Furthermore,
- Unit #3 looks to Contour #3: Formant1 for the low formant frequency.
- Unit #4 looks to Contour #4: Formant2 for the middle formant frequency.
- Unit #5 looks to Contour #5: Formant3 for the high formant frequency.
(The correspondence between unit numbers and contour numbers is coincidental.)
IPA | Description | Examples | F1 | F2 | F3 |
i | close front unrounded vowel (long E) | see, heat | 277 | 2208 | 3079 |
y | close front rounded | yes | 277 | 1937 | 2232 |
ɨ/ʉ | close central rounded | | 277 | 1520 | 2310 |
ɯ | close back unrounded | | 277 | 1218 | 2500 |
V1 | | | 227 | 845 | 2460 |
u | close back rounded (long OO) | boot | 277 | 553 | 2420 |
ɪ | near-close near-front unrounded (short I) | hit, sitting | 344 | 2170 | 2660 |
ʏ | near-close near-front rounded | | 344 | 1770 | 2230 |
ɨ† | close central unrounded | | 344 | 1507 | 2390 |
ɯ† | close back unrounded | | 344 | 1228 | 2500 |
ʊ | near-close near-back (short OO) | put, could | 344 | 635 | 2413 |
e | close-mid front unrounded | met, bed | 414 | 2065 | 2570 |
ø | close-mid front rounded | | 414 | 1608 | 2250 |
ə | mid central (schwa; short U) | | 414 | 1516 | 2500 |
ɤ | close-mid back unrounded | book | 414 | 1238 | 2500 |
o | close-mid back rounded (long O) | tow | 414 | 721 | 2406 |
'e' | | | 487 | 1928 | 2580 |
'ə'/'θ' | | | 487 | 1492 | 2505 |
'ɣ' | | | 487 | 1248 | 2500 |
V2 | | | 487 | 845 | 2460 |
'o' | | | 487 | 815 | 2393 |
ε | open-mid front unrounded (short E) | pet | 565 | 1819 | 2528 |
œ | open-mid front rounded | | 565 | 1520 | 2500 |
ɜ | open-mid central unrounded | | 565 | 1462 | 2500 |
ʌ | open-mid back unrounded (short U) | cup, luck | 565 | 1258 | 2500 |
ɔ | open-mid back rounded | pot | 565 | 915 | 2373 |
æ | near-open front unrounded (short A) | cat, black | 648 | 1712 | 2490 |
ɐ | near-open central | cut | 648 | 1405 | 2500 |
ɒ† | | | 648 | 1023 | 2500 |
a† | | | 735 | 1498 | 2537 |
ɑ | open back unrounded (short O) | hot, rock | 735 | 1278 | 2500 |
ɒ | open back rounded | | 735 | 1141 | 2280 |
a | open front unrounded | arm, father | 800 | 1228 | 2500 |
Table 1: Vowel formants in Hz (after Boë, Vallée, Schwartz and Abry, 2002).
Vowels V1 and V2 are unrecognized by the International Phonetic Alphabet (IPA); they fill in the “hole [between] unrounded and rounded back vowels”.
I have no idea why some IPA codes are enclosed in single quotes.
Descriptions from Wikipedia and pronunciationcoach.wordpress.com.
Pronunciation examples from webdesign.about.com and antimoon.com.
Vowels
After searching the internet for tables of formant frequencies, I settled on adapting the one by Boë, Vallée, Schwartz and Abry
presented in Table 1.
This table is particularly suitable for the current purpose because it fills in “holes” not officially recognized
by the IPA. A phonetician might regard such unofficial values as a fault, but our purpose here is to explore the diversity of
vocal sounds.
Understand that the formant values presented in Table 1 represent a male speaker; the values will be proportionately higher for
women and children. Table II on page 183 of Peterson and Barney compares vowel formants
for men, women, and children. I analyzed Peterson and Barney's data and obtained an average ratio of 116% between formant values for women
(numerator) and corresponding formant values for men (denominator). For children the average ratio against men was around 135%.
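The ratios just quoted suggest a simple way to adapt Table 1 to other speakers. The following Python sketch is only an illustration: it assumes the same ratio applies uniformly to F1, F2, and F3 (the averages above say nothing about individual formants), and the names MALE_FORMANTS, SHIFT, and shifted_formants are invented here.

```python
# Male formant values (Hz) copied from Table 1 for three vowels.
MALE_FORMANTS = {
    "i": (277, 2208, 3079),
    "a": (800, 1228, 2500),
    "u": (277, 553, 2420),
}

# Average ratios quoted in the text. Applying one ratio to all three
# formants is a simplifying assumption, not a measured result.
SHIFT = {"man": 1.00, "woman": 1.16, "child": 1.35}


def shifted_formants(vowel, speaker):
    """Scale the male formants of `vowel` by the average ratio for `speaker`."""
    ratio = SHIFT[speaker]
    return tuple(round(freq * ratio) for freq in MALE_FORMANTS[vowel])


if __name__ == "__main__":
    for speaker in SHIFT:
        print(speaker, shifted_formants("i", speaker))
```

The scaled values could then replace the Table 1 entries wherever formant frequencies are specified for a non-male voice.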
The IPA chart reproduced in Figure 2 places vowels in context:
- The central position in the chart is occupied by the neutral vowel ə, known as the “schwa”.
- The vertical dimension of the chart is based on F1. It ranges along a subjective scale from “Close” to “Open”. According to Cox, “The frequency of F1 appears to be related to lip rounding, i.e. low F1 = lip round.”
- The horizontal dimension of the chart is based on F2. It ranges along a subjective scale from “Front” through “Central” to “Back”.
Figure 3 (a): Frequency response for vowel a (as in father) with formants at 800, 1228, and 2500 Hz.
Figure 3 (b): Frequency response for vowel æ (as in cat) with formants at 648, 1712, and 2490 Hz.
Figure 3 (c): Frequency response for vowel i (as in see) with formants at 277, 2208, and 3079 Hz.
Figure 3 (d): Frequency response for vowel o (as in tow) with formants at 414, 721, and 2406 Hz.
Figure 3 (e): Frequency response for vowel u (as in boot) with formants at 277, 553, and 2420 Hz.
The frequency-response graphs in Figures 3 (a), (b), (c),
(d), and (e) were created by sweeping a sine tone linearly from 110 to 3500 Hz
and processing the result through three cascaded resonators using the indicated formant peaks.
A power envelope was then extracted using the RMS unit and the logarithm of the result was plotted.
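Curves of the kind shown in Figures 3 (a) through (e) can be approximated without running a sweep. The Python sketch below is not the orchestra used on this page: instead of sweeping a sine tone and measuring RMS, it evaluates the steady-state magnitude response of three cascaded two-pole resonators directly. The two-pole resonator form, the 100 Hz bandwidth, and all function names are assumptions made for illustration; only the sample rate (from the notelist's set rate 44100) and the formant values come from this page.

```python
import cmath
import math

SAMPLE_RATE = 44100.0  # matches "set rate 44100" in the notelist
BANDWIDTH = 100.0      # Hz; assumed, the page does not state resonator bandwidths


def resonator_gain(freq, center, bandwidth=BANDWIDTH, rate=SAMPLE_RATE):
    """Magnitude response at `freq` (Hz) of a two-pole resonator tuned to `center`."""
    radius = math.exp(-math.pi * bandwidth / rate)  # pole radius from bandwidth
    theta = 2.0 * math.pi * center / rate           # pole angle from center frequency
    z_inv = cmath.exp(-1j * 2.0 * math.pi * freq / rate)  # z^-1 on the unit circle
    denom = 1.0 - 2.0 * radius * math.cos(theta) * z_inv + radius * radius * z_inv * z_inv
    return 1.0 / abs(denom)


def cascade_db(freq, formants):
    """Gain in decibels of the resonators in series (individual gains multiply)."""
    gain = 1.0
    for center in formants:
        gain *= resonator_gain(freq, center)
    return 20.0 * math.log10(gain)


def response_curve(formants, start=110.0, stop=3500.0, step=10.0):
    """Sample the cascade response over the same 110-3500 Hz range as the figures."""
    points = []
    freq = start
    while freq <= stop:
        points.append((freq, cascade_db(freq, formants)))
        freq += step
    return points


if __name__ == "__main__":
    # Vowel a (as in father), formants from Table 1.
    for freq, level in response_curve((800, 1228, 2500), step=100.0):
        print(f"{freq:7.1f} Hz  {level:7.2f} dB")
```

Plotting the second column against the first on a linear frequency axis should give a three-peaked curve. Absolute levels will differ from the figures because the gain is unnormalized and the bandwidth is a guess.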
IPA | Description | Example |
eɪ | long A | cake |
aɪ | long I | kite |
oʊ | long O | rope |
ju | long U | cue |
æʊ | OU (AW) | house |
ɔɪ | OY | boy |
Table 2: Familiar English diphthongs.
Diphthongs
A diphthong happens when two vowel sounds occur consecutively within a single syllable, the first gliding into the second.
The transition from the source vowel to the target vowel is gradual, taking around 300 msec.
One site, pronunciationcoach.wordpress.com, takes the position that all
sounds designated in English as “long vowels” are actually diphthongs. On that basis, the same site contends
that two further diphthongs, AW and OY, should also be recognized as vowels.
Table 2 transcribes these six sounds (four long vowels plus AW and OY) into the International Phonetic Alphabet, courtesy of
learn-foreign-language-phonetics.com.
Spectrograms of diphthongs in progress are provided by James Kirby's PowerPoint, “Spectral features of vowels; spectrograms”.
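The gradual transition from source vowel to target vowel can be pictured as a glide of all three formants at once. The Python sketch below assumes the glide is linear in frequency over 300 msec; real speech need not be linear, and the function name formant_glide is invented here. The formant values for e and ɪ are taken from Table 1.

```python
def formant_glide(source, target, elapsed, duration=0.3):
    """Return (F1, F2, F3) in Hz `elapsed` seconds into a diphthong transition.

    Assumes straight-line motion from `source` to `target` over `duration`
    seconds; times outside the transition clamp to the nearer end point.
    """
    fraction = min(max(elapsed / duration, 0.0), 1.0)
    return tuple(s + (t - s) * fraction for s, t in zip(source, target))


if __name__ == "__main__":
    e_vowel = (414, 2065, 2570)  # e, from Table 1
    i_vowel = (344, 2170, 2660)  # ɪ, from Table 1
    for msec in (0, 75, 150, 225, 300):
        print(msec, formant_glide(e_vowel, i_vowel, msec / 1000.0))
```

Halfway through the transition each formant sits midway between its source and target values, which is the kind of motion spectrograms of diphthongs display as sloping formant tracks.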
orch /Users/charlesames/Scratch/SpeechOrch.xml
orch /Users/charlesames/Documents/Sound/SpeechOrch.xml
set rate 44100
set bits 16
set norm 1
name Daisy2
// Dai-
ramp 1 1 0.00 0.20 6000 6000 // Amplitude
ramp 1 3 0.00 0.20 414 414 // Formant 1
ramp 1 4 0.00 0.20 2065 2065 // Formant 2
ramp 1 5 0.00 0.20 2570 2570 // Formant 3
ramp 1 1 0.20 0.30 6000 6000 // Amplitude
ramp 1 3 0.20 0.30 414 344 // Formant 1
ramp 1 4 0.20 0.30 2065 2170 // Formant 2
ramp 1 5 0.20 0.30 2570 2660 // Formant 3
ramp 1 1 0.50 2.50 6000 6000 // Amplitude
ramp 1 3 0.50 2.50 344 344 // Formant 1
ramp 1 4 0.50 2.50 2170 2170 // Formant 2
ramp 1 5 0.50 2.50 2660 2660 // Formant 3
ramp 1 2 0.00 2.98 293.7 293.7 // D4
ramp 1 2 2.98 0.02 293.7 246.9 // D4-B3
note 1 1 101 0 0.00 3.00 1 0.03 // Buzz1
// -sy
ramp 1 2 3.00 3.00 246.9 246.9 // B3
ramp 1 1 3.00 2.00 6000 6000 // Amplitude
ramp 1 3 3.00 2.00 277 277 // Formant 1
ramp 1 4 3.00 2.00 2208 2208 // Formant 2
ramp 1 5 3.00 2.00 3079 3079 // Formant 3
note 2 1 101 1 3.00 2.00 1 0 // Buzz1
note 3 1 119 0 0.00 5.00 0.1 // RMS1
note 4 1 121 0 0.00 5.00 // Mouth1
note 5 1 199 0 0.00 5.00 // Rebalance1
// (rest)
ramp 1 1 5.00 1.00 6000 6000 // Amplitude
ramp 1 3 5.00 1.00 277 414 // Formant 1
ramp 1 4 5.00 1.00 2208 2065 // Formant 2
ramp 1 5 5.00 1.00 3079 2570 // Formant 3
// Dai-
ramp 1 2 6.00 2.98 196 196 // G3
ramp 1 2 8.98 0.02 196 146.8 // G3-D3
ramp 1 1 6.00 0.20 6000 6000 // Amplitude
ramp 1 3 6.00 0.20 414 414 // Formant 1
ramp 1 4 6.00 0.20 2065 2065 // Formant 2
ramp 1 5 6.00 0.20 2570 2570 // Formant 3
ramp 1 1 6.20 0.30 6000 6000 // Amplitude
ramp 1 3 6.20 0.30 414 344 // Formant 1
ramp 1 4 6.20 0.30 2065 2170 // Formant 2
ramp 1 5 6.20 0.30 2570 2660 // Formant 3
ramp 1 1 6.50 2.50 6000 6000 // Amplitude
ramp 1 3 6.50 2.50 344 344 // Formant 1
ramp 1 4 6.50 2.50 2170 2170 // Formant 2
ramp 1 5 6.50 2.50 2660 2660 // Formant 3
note 6 1 101 0 6.00 3.00 1 0.03 // Buzz1
// -sy
ramp 1 2 9.00 3.00 146.8 146.8 // D3
ramp 1 1 9.00 2.00 6000 6000 // Amplitude
ramp 1 3 9.00 2.00 277 277 // Formant 1
ramp 1 4 9.00 2.00 2208 2208 // Formant 2
ramp 1 5 9.00 2.00 3079 3079 // Formant 3
note 7 1 101 6 9.00 2.00 1 0 // Buzz1
note 8 1 119 0 6.00 5.00 0.1 // RMS1
note 9 1 121 0 6.00 5.00 // Mouth1
note 10 1 199 0 6.00 5.00 // Rebalance1
// (rest)
ramp 1 1 11.00 1.00 6000 6000 // Amplitude
ramp 1 3 11.00 1.00 277 344 // Formant 1
ramp 1 4 11.00 1.00 2208 2170 // Formant 2
ramp 1 5 11.00 1.00 3079 2660 // Formant 3
// Give
ramp 1 2 12.00 1.00 164.8 164.8 // E3
ramp 1 1 12.00 1.00 6000 6000 // Amplitude
ramp 1 3 12.00 1.00 344 344 // Formant 1
ramp 1 4 12.00 1.00 2170 2170 // Formant 2
ramp 1 5 12.00 1.00 2660 2660 // Formant 3
note 11 1 101 0 12.00 0.90 1 0.03 // Buzz1
note 12 1 119 0 12.00 0.90 0.1 // RMS1
note 13 1 121 0 12.00 0.90 // Mouth1
note 14 1 199 0 12.00 0.90 // Rebalance1
// me
ramp 1 2 13.00 1.00 185 185 // F#3
ramp 1 1 13.00 1.00 6000 6000 // Amplitude
ramp 1 3 13.00 1.00 277 277 // Formant 1
ramp 1 4 13.00 1.00 2208 2208 // Formant 2
ramp 1 5 13.00 1.00 3079 3079 // Formant 3
note 15 1 101 0 13.00 0.90 1 0.03 // Buzz1
note 16 1 119 0 13.00 0.90 // RMS1
note 17 1 121 0 13.00 0.90 // Mouth1
note 18 1 199 0 13.00 0.90 // Rebalance1
// your
ramp 1 2 14.00 1.00 196 196 // G3
ramp 1 1 14.00 1.00 6000 6000 // Amplitude
ramp 1 3 14.00 1.00 414 414 // Formant 1
ramp 1 4 14.00 1.00 1516 1516 // Formant 2
ramp 1 5 14.00 1.00 2500 2500 // Formant 3
note 19 1 101 0 14.00 0.90 1 0.03 // Buzz1
note 20 1 119 0 14.00 0.90 0.1 // RMS1
note 21 1 121 0 14.00 0.90 // Mouth1
note 22 1 199 0 14.00 0.90 // Rebalance1
// an-
ramp 1 2 15.00 1.98 164.8 164.8 // E3
ramp 1 2 16.98 0.02 164.8 196 // E3-G3
ramp 1 1 15.00 2.00 6000 6000 // Amplitude
ramp 1 3 15.00 2.00 648 648 // Formant 1
ramp 1 4 15.00 2.00 1712 1712 // Formant 2
ramp 1 5 15.00 2.00 2490 2490 // Formant 3
note 23 1 101 0 15.00 2.00 1 0.03 // Buzz1
// -swer
ramp 1 2 17.00 1.00 196 196 // G3
ramp 1 1 17.00 1.00 6000 6000 // Amplitude
ramp 1 3 17.00 1.00 414 414 // Formant 1
ramp 1 4 17.00 1.00 1516 1516 // Formant 2
ramp 1 5 17.00 1.00 2500 2500 // Formant 3
note 24 1 101 23 17.00 0.90 1 0 // Buzz1
note 25 1 119 0 15.00 2.90 // RMS1
note 26 1 121 0 15.00 2.90 // Mouth1
note 27 1 199 0 15.00 2.90 // Rebalance1
// do
ramp 1 2 18.00 6.00 146.8 146.8 // D3
ramp 1 1 18.00 3.00 6000 6000 // Amplitude
ramp 1 3 18.00 3.00 277 277 // Formant 1
ramp 1 4 18.00 3.00 553 553 // Formant 2
ramp 1 5 18.00 3.00 2420 2420 // Formant 3
note 28 1 101 0 18.00 3.00 1 0.03 // Buzz1
note 29 1 119 0 18.00 3.00 0.1 // RMS1
note 30 1 121 0 18.00 3.00 // Mouth1
note 31 1 199 0 18.00 3.00 // Rebalance1
// (rest)
ramp 1 1 21.00 3.00 6000 6000 // Amplitude
ramp 1 3 21.00 3.00 277 800 // Formant 1
ramp 1 4 21.00 3.00 553 1228 // Formant 2
ramp 1 5 21.00 3.00 2420 2500 // Formant 3
// I'm
ramp 1 2 24.00 3.00 220 220 // A3
ramp 1 1 24.00 0.30 6000 6000 // Amplitude
ramp 1 3 24.00 0.30 800 800 // Formant 1
ramp 1 4 24.00 0.30 1228 1228 // Formant 2
ramp 1 5 24.00 0.30 2500 2500 // Formant 3
ramp 1 1 24.30 0.30 6000 6000 // Amplitude
ramp 1 3 24.30 0.30 800 344 // Formant 1
ramp 1 4 24.30 0.30 1228 2170 // Formant 2
ramp 1 5 24.30 0.30 2500 2660 // Formant 3
ramp 1 1 24.60 2.40 6000 6000 // Amplitude
ramp 1 3 24.60 2.40 344 344 // Formant 1
ramp 1 4 24.60 2.40 2170 2170 // Formant 2
ramp 1 5 24.60 2.40 2660 2660 // Formant 3
note 32 1 101 0 24.00 2.90 1 0.03 // Buzz1
note 33 1 119 0 24.00 2.90 // RMS1
note 34 1 121 0 24.00 2.90 // Mouth1
note 35 1 199 0 24.00 2.90 // Rebalance1
// half
ramp 1 2 27.00 3.00 293.7 293.7 // D4
ramp 1 1 27.00 3.00 6000 6000 // Amplitude
ramp 1 3 27.00 3.00 800 800 // Formant 1
ramp 1 4 27.00 3.00 1228 1228 // Formant 2
ramp 1 5 27.00 3.00 2500 2500 // Formant 3
note 36 1 101 0 27.00 3.00 1 0.03 // Buzz1
note 37 1 119 0 27.00 3.00 0.1 // RMS1
note 38 1 121 0 27.00 3.00 // Mouth1
note 39 1 199 0 27.00 3.00 // Rebalance1
// cra-
ramp 1 2 30.00 1.90 246.9 246.9 // B3
ramp 1 2 31.90 0.10 246.9 220 // B3-A3
ramp 1 2 32.00 0.90 220 220 // A3
ramp 1 2 32.90 0.10 220 196 // A3-G3
ramp 1 1 30.00 1.85 6000 6000 // Amplitude
ramp 1 3 30.00 1.85 414 414 // Formant 1
ramp 1 4 30.00 1.85 2065 2065 // Formant 2
ramp 1 5 30.00 1.85 2570 2570 // Formant 3
ramp 1 1 31.85 0.15 6000 6000 // Amplitude
ramp 1 3 31.85 0.15 414 344 // Formant 1
ramp 1 4 31.85 0.15 2065 2170 // Formant 2
ramp 1 5 31.85 0.15 2570 2660 // Formant 3
ramp 1 1 32.00 0.85 6000 6000 // Amplitude
ramp 1 3 32.00 0.85 344 344 // Formant 1
ramp 1 4 32.00 0.85 2170 2170 // Formant 2
ramp 1 5 32.00 0.85 2660 2660 // Formant 3
ramp 1 1 32.85 0.15 6000 6000 // Amplitude
ramp 1 3 32.85 0.15 344 277 // Formant 1
ramp 1 4 32.85 0.15 2170 2208 // Formant 2
ramp 1 5 32.85 0.15 2660 3079 // Formant 3
note 40 1 101 0 30.00 3.00 1 0.03 // Buzz1
// -zy
ramp 1 2 33.00 2.00 196 196 // G3
ramp 1 1 33.00 2.00 6000 6000 // Amplitude
ramp 1 3 33.00 2.00 277 277 // Formant 1
ramp 1 4 33.00 2.00 2208 2208 // Formant 2
ramp 1 5 33.00 2.00 3079 3079 // Formant 3
note 41 1 101 40 33.00 1.90 1 0 // Buzz1
note 42 1 119 0 30.00 4.90 0.1 // RMS1
note 43 1 121 0 30.00 4.90 // Mouth1
note 44 1 199 0 30.00 4.90 // Rebalance1
// and
ramp 1 2 35.00 1.00 185 185 // F#3
ramp 1 1 35.00 1.00 6000 6000 // Amplitude
ramp 1 3 35.00 1.00 414 414 // Formant 1
ramp 1 4 35.00 1.00 1516 1516 // Formant 2
ramp 1 5 35.00 1.00 2500 2500 // Formant 3
note 45 1 101 0 35.00 0.90 1 0.03 // Buzz1
note 46 1 119 0 35.00 0.90 // RMS1
note 47 1 121 0 35.00 0.90 // Mouth1
note 48 1 199 0 35.00 0.90 // Rebalance1
// all
ramp 1 2 36.00 1.00 164.8 164.8 // E3
ramp 1 1 36.00 1.00 6000 6000 // Amplitude
ramp 1 3 36.00 1.00 800 800 // Formant 1
ramp 1 4 36.00 1.00 1228 1228 // Formant 2
ramp 1 5 36.00 1.00 2500 2500 // Formant 3
note 49 1 101 0 36.00 0.90 1 0.03 // Buzz1
note 50 1 119 0 36.00 0.90 0.1 // RMS1
note 51 1 121 0 36.00 0.90 // Mouth1
note 52 1 199 0 36.00 0.90 // Rebalance1
// for
ramp 1 2 37.00 1.00 185 185 // F#3
ramp 1 1 37.00 1.00 6000 6000 // Amplitude
ramp 1 3 37.00 1.00 414 414 // Formant 1
ramp 1 4 37.00 1.00 721 721 // Formant 2
ramp 1 5 37.00 1.00 2406 2406 // Formant 3
note 53 1 101 0 37.00 0.90 1 0.1 // Buzz1
note 54 1 119 0 37.00 0.90 0.1 // RMS1
note 55 1 121 0 37.00 0.90 // Mouth1
note 56 1 199 0 37.00 0.90 // Rebalance1
// the
ramp 1 2 38.00 1.00 196 196 // G3
ramp 1 1 38.00 1.00 6000 6000 // Amplitude
ramp 1 3 38.00 1.00 414 414 // Formant 1
ramp 1 4 38.00 1.00 1516 1516 // Formant 2
ramp 1 5 38.00 1.00 2500 2500 // Formant 3
note 57 1 101 0 38.00 0.90 1 0.03 // Buzz1
note 58 1 119 0 38.00 0.90 0.1 // RMS1
note 59 1 121 0 38.00 0.90 // Mouth1
note 60 1 199 0 38.00 0.90 // Rebalance1
// love
ramp 1 2 39.00 2.00 220 220 // A3
ramp 1 1 39.00 2.00 6000 6000 // Amplitude
ramp 1 3 39.00 2.00 414 414 // Formant 1
ramp 1 4 39.00 2.00 1516 1516 // Formant 2
ramp 1 5 39.00 2.00 2500 2500 // Formant 3
note 61 1 101 0 39.00 1.90 1 0.03 // Buzz1
note 62 1 119 0 39.00 1.90 0.1 // RMS1
note 63 1 121 0 39.00 1.90 // Mouth1
note 64 1 199 0 39.00 1.90 // Rebalance1
// of
ramp 1 2 41.00 1.00 246.9 246.9 // B3
ramp 1 1 41.00 0.45 6000 6000 // Amplitude
ramp 1 3 41.00 0.45 414 414 // Formant 1
ramp 1 4 41.00 0.45 1516 1516 // Formant 2
ramp 1 5 41.00 0.45 2500 2500 // Formant 3
ramp 1 1 41.45 0.05 6000 6000 // Amplitude
ramp 1 3 41.45 0.05 414 414 // Formant 1
ramp 1 4 41.45 0.05 1516 721 // Formant 2
ramp 1 5 41.45 0.05 2500 2406 // Formant 3
ramp 1 1 41.50 0.50 6000 6000 // Amplitude
ramp 1 3 41.50 0.50 414 414 // Formant 1
ramp 1 4 41.50 0.50 721 721 // Formant 2
ramp 1 5 41.50 0.50 2406 2406 // Formant 3
note 65 1 101 0 41.00 0.90 1 0.03 // Buzz1
note 66 1 119 0 41.00 0.90 0.1 // RMS1
note 67 1 121 0 41.00 0.90 // Mouth1
note 68 1 199 0 41.00 0.90 // Rebalance1
// you
ramp 1 2 42.00 5.00 220 220 // A3
ramp 1 1 42.00 3.00 6000 6000 // Amplitude
ramp 1 3 42.00 3.00 277 277 // Formant 1
ramp 1 4 42.00 3.00 553 553 // Formant 2
ramp 1 5 42.00 3.00 2420 2420 // Formant 3
note 69 1 101 0 42.00 2.90 1 0.03 // Buzz1
note 70 1 119 0 42.00 2.90 0.1 // RMS1
note 71 1 121 0 42.00 2.90 // Mouth1
note 72 1 199 0 42.00 2.90 // Rebalance1
// (rest)
ramp 1 1 45.00 2.00 6000 6000 // Amplitude
ramp 1 3 45.00 2.00 277 344 // Formant 1
ramp 1 4 45.00 2.00 553 2170 // Formant 2
ramp 1 5 45.00 2.00 2420 2660 // Formant 3
// It
ramp 1 2 47.00 1.00 246.9 246.9 // B3
ramp 1 1 47.00 1.00 6000 6000 // Amplitude
ramp 1 3 47.00 1.00 344 344 // Formant 1
ramp 1 4 47.00 1.00 2170 2170 // Formant 2
ramp 1 5 47.00 1.00 2660 2660 // Formant 3
note 73 1 101 0 47.00 0.50 1 0.03 // Buzz1
note 74 1 119 0 47.00 0.90 0.1 // RMS1
note 75 1 121 0 47.00 0.90 // Mouth1
note 76 1 199 0 47.00 0.90 // Rebalance1
// won't
ramp 1 2 48.00 1.00 261.6 261.6 // C4
ramp 1 1 48.00 0.20 6000 6000 // Amplitude
ramp 1 3 48.00 0.20 414 414 // Formant 1
ramp 1 4 48.00 0.20 721 721 // Formant 2
ramp 1 5 48.00 0.20 2406 2406 // Formant 3
ramp 1 1 48.20 0.30 6000 6000 // Amplitude
ramp 1 3 48.20 0.30 414 344 // Formant 1
ramp 1 4 48.20 0.30 721 635 // Formant 2
ramp 1 5 48.20 0.30 2406 2413 // Formant 3
ramp 1 1 48.50 0.50 6000 6000 // Amplitude
ramp 1 3 48.50 0.50 344 344 // Formant 1
ramp 1 4 48.50 0.50 635 635 // Formant 2
ramp 1 5 48.50 0.50 2413 2413 // Formant 3
note 77 1 101 0 48.00 0.90 1 0.03 // Buzz1
note 78 1 119 0 48.00 0.90 // RMS1
note 79 1 121 0 48.00 0.90 // Mouth1
note 80 1 199 0 48.00 0.90 // Rebalance1
// be
ramp 1 2 49.00 1.00 246.9 246.9 // B3
ramp 1 1 49.00 1.00 6000 6000 // Amplitude
ramp 1 3 49.00 1.00 344 344 // Formant 1
ramp 1 4 49.00 1.00 2170 2170 // Formant 2
ramp 1 5 49.00 1.00 2660 2660 // Formant 3
note 81 1 101 0 49.00 0.90 1 0.03 // Buzz1
note 82 1 119 0 49.00 0.90 0.1 // RMS1
note 83 1 121 0 49.00 0.90 // Mouth1
note 84 1 199 0 49.00 0.90 // Rebalance1
// a
ramp 1 2 50.00 1.00 220 220 // A3
ramp 1 1 50.00 1.00 6000 6000 // Amplitude
ramp 1 3 50.00 1.00 414 414 // Formant 1
ramp 1 4 50.00 1.00 1516 1516 // Formant 2
ramp 1 5 50.00 1.00 2500 2500 // Formant 3
note 85 1 101 0 50.00 0.90 1 0.03 // Buzz1
note 86 1 119 0 50.00 0.90 0.1 // RMS1
note 87 1 121 0 50.00 0.90 // Mouth1
note 88 1 199 0 50.00 0.90 // Rebalance1
// sty-
ramp 1 2 51.00 1.98 293.7 293.7 // D4
ramp 1 2 52.98 0.02 293.7 246.9 // D4-B3
ramp 1 1 51.00 0.30 6000 6000 // Amplitude
ramp 1 3 51.00 0.30 800 800 // Formant 1
ramp 1 4 51.00 0.30 1228 1228 // Formant 2
ramp 1 5 51.00 0.30 2500 2500 // Formant 3
ramp 1 1 51.30 0.30 6000 6000 // Amplitude
ramp 1 3 51.30 0.30 800 277 // Formant 1
ramp 1 4 51.30 0.30 1228 2208 // Formant 2
ramp 1 5 51.30 0.30 2500 3079 // Formant 3
ramp 1 1 51.60 1.40 6000 6000 // Amplitude
ramp 1 3 51.60 1.40 277 277 // Formant 1
ramp 1 4 51.60 1.40 2208 2208 // Formant 2
ramp 1 5 51.60 1.40 3079 3079 // Formant 3
note 89 1 101 0 51.00 2.00 1 0.03 // Buzz1
// -lish
ramp 1 2 53.00 1.00 246.9 246.9 // B3
ramp 1 1 53.00 1.00 6000 6000 // Amplitude
ramp 1 3 53.00 1.00 344 344 // Formant 1
ramp 1 4 53.00 1.00 2170 2170 // Formant 2
ramp 1 5 53.00 1.00 2660 2660 // Formant 3
note 90 1 101 89 53.00 0.90 1 0 // Buzz1
note 91 1 119 0 51.00 2.90 0.1 // RMS1
note 92 1 121 0 51.00 2.90 // Mouth1
note 93 1 199 0 51.00 2.90 // Rebalance1
// mar-
ramp 1 2 54.00 0.98 220 220 // A3
ramp 1 2 54.98 0.02 220 196 // A3-G3
ramp 1 1 54.00 1.00 6000 6000 // Amplitude
ramp 1 3 54.00 1.00 414 414 // Formant 1
ramp 1 4 54.00 1.00 2065 2065 // Formant 2
ramp 1 5 54.00 1.00 2570 2570 // Formant 3
note 94 1 101 0 54.00 1.00 1 0.03 // Buzz1
// -riage
ramp 1 2 55.00 4.00 196 196 // G3
ramp 1 1 55.00 2.00 6000 6000 // Amplitude
ramp 1 3 55.00 2.00 344 344 // Formant 1
ramp 1 4 55.00 2.00 2170 2170 // Formant 2
ramp 1 5 55.00 2.00 2660 2660 // Formant 3
note 95 1 101 94 55.00 1.90 1 0 // Buzz1
note 96 1 119 0 54.00 2.90 // RMS1
note 97 1 121 0 54.00 2.90 // Mouth1
note 98 1 199 0 54.00 2.90 // Rebalance1
// (rest)
ramp 1 1 57.00 2.00 6000 6000 // Amplitude
ramp 1 3 57.00 2.00 344 800 // Formant 1
ramp 1 4 57.00 2.00 2170 1228 // Formant 2
ramp 1 5 57.00 2.00 2660 2500 // Formant 3
// I
ramp 1 2 59.00 1.00 220 220 // A3
ramp 1 1 59.00 0.30 6000 6000 // Amplitude
ramp 1 3 59.00 0.30 800 800 // Formant 1
ramp 1 4 59.00 0.30 1228 1228 // Formant 2
ramp 1 5 59.00 0.30 2500 2500 // Formant 3
ramp 1 1 59.30 0.30 6000 6000 // Amplitude
ramp 1 3 59.30 0.30 800 344 // Formant 1
ramp 1 4 59.30 0.30 1228 2170 // Formant 2
ramp 1 5 59.30 0.30 2500 2660 // Formant 3
ramp 1 1 59.60 0.40 6000 6000 // Amplitude
ramp 1 3 59.60 0.40 344 344 // Formant 1
ramp 1 4 59.60 0.40 2170 2170 // Formant 2
ramp 1 5 59.60 0.40 2660 2660 // Formant 3
note 99 1 101 0 59.00 0.90 1 0.03 // Buzz1
note 100 1 119 0 59.00 0.90 0.1 // RMS1
note 101 1 121 0 59.00 0.90 // Mouth1
note 102 1 199 0 59.00 0.90 // Rebalance1
// can't
ramp 1 2 60.00 2.00 246.9 246.9 // B3
ramp 1 1 60.00 2.00 6000 6000 // Amplitude
ramp 1 3 60.00 2.00 648 648 // Formant 1
ramp 1 4 60.00 2.00 1712 1712 // Formant 2
ramp 1 5 60.00 2.00 2490 2490 // Formant 3
note 103 1 101 0 60.00 1.90 1 0.03 // Buzz1
note 104 1 119 0 60.00 1.90 // RMS1
note 105 1 121 0 60.00 1.90 // Mouth1
note 106 1 199 0 60.00 1.90 // Rebalance1
// af-
ramp 1 2 62.00 0.98 196 196 // G3
ramp 1 2 62.98 0.02 196 164.8 // G3-E3
ramp 1 1 62.00 0.95 6000 6000 // Amplitude
ramp 1 3 62.00 0.95 414 414 // Formant 1
ramp 1 4 62.00 0.95 1516 1516 // Formant 2
ramp 1 5 62.00 0.95 2500 2500 // Formant 3
ramp 1 1 62.95 0.05 6000 6000 // Amplitude
ramp 1 3 62.95 0.05 414 565 // Formant 1
ramp 1 4 62.95 0.05 1516 915 // Formant 2
ramp 1 5 62.95 0.05 2500 2373 // Formant 3
note 107 1 101 0 62.00 1.00 1 0.03 // Buzz1
// -ford
ramp 1 2 63.00 2.00 164.8 164.8 // E3
ramp 1 1 63.00 2.00 6000 6000 // Amplitude
ramp 1 3 63.00 2.00 565 565 // Formant 1
ramp 1 4 63.00 2.00 915 915 // Formant 2
ramp 1 5 63.00 2.00 2373 2373 // Formant 3
note 108 1 101 107 63.00 1.90 1 0 // Buzz1
note 109 1 119 0 62.00 2.90 0.1 // RMS1
note 110 1 121 0 62.00 2.90 // Mouth1
note 111 1 199 0 62.00 2.90 // Rebalance1
// a
ramp 1 2 65.00 1.00 196 196 // G3
ramp 1 1 65.00 1.00 6000 6000 // Amplitude
ramp 1 3 65.00 1.00 414 414 // Formant 1
ramp 1 4 65.00 1.00 1516 1516 // Formant 2
ramp 1 5 65.00 1.00 2500 2500 // Formant 3
note 112 1 101 0 65.00 0.90 1 0.03 // Buzz1
note 113 1 119 0 65.00 0.90 0.1 // RMS1
note 114 1 121 0 65.00 0.90 // Mouth1
note 115 1 199 0 65.00 0.90 // Rebalance1
// car-
ramp 1 2 66.00 0.98 164.8 164.8 // E3
ramp 1 2 66.98 0.02 164.8 146.8 // E3-D3
ramp 1 1 66.00 1.00 6000 6000 // Amplitude
ramp 1 3 66.00 1.00 414 414 // Formant 1
ramp 1 4 66.00 1.00 2065 2065 // Formant 2
ramp 1 5 66.00 1.00 2570 2570 // Formant 3
note 116 1 101 0 66.00 1.00 1 0.03 // Buzz1
// -riage
ramp 1 2 67.00 4.00 146.8 146.8 // D3
ramp 1 1 67.00 2.00 6000 6000 // Amplitude
ramp 1 3 67.00 2.00 344 344 // Formant 1
ramp 1 4 67.00 2.00 2170 2170 // Formant 2
ramp 1 5 67.00 2.00 2660 2660 // Formant 3
note 117 1 101 116 67.00 2.00 1 0 // Buzz1
note 118 1 119 0 66.00 3.00 0.1 // RMS1
note 119 1 121 0 66.00 3.00 // Mouth1
note 120 1 199 0 66.00 3.00 // Rebalance1
// (rest)
ramp 1 1 69.00 2.00 6000 6000 // Amplitude
ramp 1 3 69.00 2.00 344 414 // Formant 1
ramp 1 4 69.00 2.00 2170 1516 // Formant 2
ramp 1 5 69.00 2.00 2660 2500 // Formant 3
// But
ramp 1 2 71.00 1.00 146.8 146.8 // D3
ramp 1 1 71.00 1.00 6000 6000 // Amplitude
ramp 1 3 71.00 1.00 414 414 // Formant 1
ramp 1 4 71.00 1.00 1516 1516 // Formant 2
ramp 1 5 71.00 1.00 2500 2500 // Formant 3
note 121 1 101 0 71.00 0.90 1 0.03 // Buzz1
note 122 1 119 0 71.00 0.90 0.1 // RMS1
note 123 1 121 0 71.00 0.90 // Mouth1
note 124 1 199 0 71.00 0.90 // Rebalance1
// you'll
ramp 1 2 72.00 2.00 196 196 // G3
ramp 1 1 72.00 2.00 6000 6000 // Amplitude
ramp 1 3 72.00 2.00 277 277 // Formant 1
ramp 1 4 72.00 2.00 553 553 // Formant 2
ramp 1 5 72.00 2.00 2420 2420 // Formant 3
note 125 1 101 0 72.00 1.90 1 0.03 // Buzz1
note 126 1 119 0 72.00 1.90 0.1 // RMS1
note 127 1 121 0 72.00 1.90 // Mouth1
note 128 1 199 0 72.00 1.90 // Rebalance1
// look
ramp 1 2 74.00 1.00 246.9 246.9 // B3
ramp 1 1 74.00 1.00 6000 6000 // Amplitude
ramp 1 3 74.00 1.00 344 344 // Formant 1
ramp 1 4 74.00 1.00 635 635 // Formant 2
ramp 1 5 74.00 1.00 2413 2413 // Formant 3
note 129 1 101 0 74.00 0.90 1 0.03 // Buzz1
note 130 1 119 0 74.00 0.90 0.1 // RMS1
note 131 1 121 0 74.00 0.90 // Mouth1
note 132 1 199 0 74.00 0.90 // Rebalance1
// sweet
ramp 1 2 75.00 2.00 220 220 // A3
ramp 1 1 75.00 2.00 6000 6000 // Amplitude
ramp 1 3 75.00 2.00 344 344 // Formant 1
ramp 1 4 75.00 2.00 2170 2170 // Formant 2
ramp 1 5 75.00 2.00 2660 2660 // Formant 3
note 133 1 101 0 75.00 1.90 1 0.03 // Buzz1
note 134 1 119 0 75.00 1.90 0.1 // RMS1
note 135 1 121 0 75.00 1.90 // Mouth1
note 136 1 199 0 75.00 1.90 // Rebalance1
// u-
ramp 1 2 77.00 0.98 146.8 146.8 // D3
ramp 1 2 77.98 0.02 146.8 196 // D3-G3
ramp 1 1 77.00 0.90 6000 6000 // Amplitude
ramp 1 3 77.00 0.90 414 414 // Formant 1
ramp 1 4 77.00 0.90 1516 1516 // Formant 2
ramp 1 5 77.00 0.90 2500 2500 // Formant 3
ramp 1 1 77.90 0.10 6000 6000 // Amplitude
ramp 1 3 77.90 0.10 414 565 // Formant 1
ramp 1 4 77.90 0.10 1516 915 // Formant 2
ramp 1 5 77.90 0.10 2500 2373 // Formant 3
note 137 1 101 0 77.00 1.00 1 0.03 // Buzz1
// -pon
ramp 1 2 78.00 2.00 196 196 // G3
ramp 1 1 78.00 2.00 6000 6000 // Amplitude
ramp 1 3 78.00 2.00 565 565 // Formant 1
ramp 1 4 78.00 2.00 915 915 // Formant 2
ramp 1 5 78.00 2.00 2373 2373 // Formant 3
note 138 1 101 137 78.00 1.90 1 0 // Buzz1
note 139 1 119 0 77.00 2.90 // RMS1
note 140 1 121 0 77.00 2.90 // Mouth1
note 141 1 199 0 77.00 2.90 // Rebalance1
// the
ramp 1 2 80.00 1.00 246.9 246.9 // B3
ramp 1 1 80.00 1.00 6000 6000 // Amplitude
ramp 1 3 80.00 1.00 414 414 // Formant 1
ramp 1 4 80.00 1.00 1516 1516 // Formant 2
ramp 1 5 80.00 1.00 2500 2500 // Formant 3
note 142 1 101 0 80.00 0.90 1 0.03 // Buzz1
note 143 1 119 0 80.00 0.90 0.1 // RMS1
note 144 1 121 0 80.00 0.90 // Mouth1
note 145 1 199 0 80.00 0.90 // Rebalance1
// seat
ramp 1 2 81.00 1.00 220 220 // A3
ramp 1 1 81.00 1.00 6000 6000 // Amplitude
ramp 1 3 81.00 1.00 344 344 // Formant 1
ramp 1 4 81.00 1.00 2170 2170 // Formant 2
ramp 1 5 81.00 1.00 2660 2660 // Formant 3
note 146 1 101 0 81.00 0.90 1 0.03 // Buzz1
note 147 1 119 0 81.00 0.90 0.1 // RMS1
note 148 1 121 0 81.00 0.90 // Mouth1
note 149 1 199 0 81.00 0.90 // Rebalance1
// of
ramp 1 2 82.00 1.00 246.9 246.9 // B3
ramp 1 1 82.00 1.00 6000 6000 // Amplitude
ramp 1 3 82.00 1.00 414 414 // Formant 1
ramp 1 4 82.00 1.00 1516 1516 // Formant 2
ramp 1 5 82.00 1.00 2500 2500 // Formant 3
note 150 1 101 0 82.00 0.90 1 0.03 // Buzz1
note 151 1 119 0 82.00 0.90 0.1 // RMS1
note 152 1 121 0 82.00 0.90 // Mouth1
note 153 1 199 0 82.00 0.90 // Rebalance1
// a
ramp 1 2 83.00 1.00 261.6 261.6 // C4
ramp 1 1 83.00 1.00 6000 6000 // Amplitude
ramp 1 3 83.00 1.00 414 414 // Formant 1
ramp 1 4 83.00 1.00 1516 1516 // Formant 2
ramp 1 5 83.00 1.00 2500 2500 // Formant 3
note 154 1 101 0 83.00 0.90 1 0.03 // Buzz1
note 155 1 119 0 83.00 0.90 0.1 // RMS1
note 156 1 121 0 83.00 0.90 // Mouth1
note 157 1 199 0 83.00 0.90 // Rebalance1
// bi-
ramp 1 2 84.00 0.98 293.7 293.7 // D4
ramp 1 2 84.98 0.02 293.7 246.9 // D4-B3
ramp 1 1 84.00 0.30 6000 6000 // Amplitude
ramp 1 3 84.00 0.30 800 800 // Formant 1
ramp 1 4 84.00 0.30 1228 1228 // Formant 2
ramp 1 5 84.00 0.30 2500 2500 // Formant 3
ramp 1 1 84.30 0.30 6000 6000 // Amplitude
ramp 1 3 84.30 0.30 800 344 // Formant 1
ramp 1 4 84.30 0.30 1228 2170 // Formant 2
ramp 1 5 84.30 0.30 2500 2660 // Formant 3
ramp 1 1 84.60 0.40 6000 6000 // Amplitude
ramp 1 3 84.60 0.40 344 344 // Formant 1
ramp 1 4 84.60 0.40 2170 2170 // Formant 2
ramp 1 5 84.60 0.40 2660 2660 // Formant 3
note 158 1 101 0 84.00 1.00 1 0.03 // Buzz1
// -cy-
ramp 1 2 85.00 0.98 246.9 246.9 // B3
ramp 1 2 85.98 0.02 246.9 196 // B3-G3
ramp 1 1 85.00 0.95 6000 6000 // Amplitude
ramp 1 3 85.00 0.95 344 344 // Formant 1
ramp 1 4 85.00 0.95 2170 2170 // Formant 2
ramp 1 5 85.00 0.95 2660 2660 // Formant 3
ramp 1 1 85.95 0.05 6000 6000 // Amplitude
ramp 1 3 85.95 0.05 344 414 // Formant 1
ramp 1 4 85.95 0.05 2170 1516 // Formant 2
ramp 1 5 85.95 0.05 2660 2500 // Formant 3
note 159 1 101 158 85.00 1.00 1 0 // Buzz1
// -cle
ramp 1 2 86.00 1.00 196 196 // G3
ramp 1 1 86.00 1.00 6000 6000 // Amplitude
ramp 1 3 86.00 1.00 414 414 // Formant 1
ramp 1 4 86.00 1.00 1516 1516 // Formant 2
ramp 1 5 86.00 1.00 2500 2500 // Formant 3
note 160 1 101 159 86.00 0.90 1 0 // Buzz1
note 161 1 119 0 84.00 2.90 0.1 // RMS1
note 162 1 121 0 84.00 2.90 // Mouth1
note 163 1 199 0 84.00 2.90 // Rebalance1
// built
ramp 1 2 87.00 2.00 220 220 // A3
ramp 1 1 87.00 2.00 6000 6000 // Amplitude
ramp 1 3 87.00 2.00 344 344 // Formant 1
ramp 1 4 87.00 2.00 2170 2170 // Formant 2
ramp 1 5 87.00 2.00 2660 2660 // Formant 3
note 164 1 101 0 87.00 1.90 1 0.03 // Buzz1
note 165 1 119 0 87.00 1.90 0.1 // RMS1
note 166 1 121 0 87.00 1.90 // Mouth1
note 167 1 199 0 87.00 1.90 // Rebalance1
// for
ramp 1 2 89.00 1.00 146.8 146.8 // D3
ramp 1 1 89.00 1.00 6000 6000 // Amplitude
ramp 1 3 89.00 1.00 414 414 // Formant 1
ramp 1 4 89.00 1.00 1516 1516 // Formant 2
ramp 1 5 89.00 1.00 2500 2500 // Formant 3
note 168 1 101 0 89.00 0.90 1 0.03 // Buzz1
note 169 1 119 0 89.00 0.90 0.1 // RMS1
note 170 1 121 0 89.00 0.90 // Mouth1
note 171 1 199 0 89.00 0.90 // Rebalance1
// two
ramp 1 2 90.00 6.00 196 196 // G3
ramp 1 1 90.00 6.00 6000 6000 // Amplitude
ramp 1 3 90.00 6.00 277 277 // Formant 1
ramp 1 4 90.00 6.00 553 553 // Formant 2
ramp 1 5 90.00 6.00 2420 2420 // Formant 3
note 172 1 101 0 90.00 3.00 1 0.03 // Buzz1
note 173 1 119 0 90.00 3.00 0.1 // RMS1
note 174 1 121 0 90.00 3.00 // Mouth1
note 175 1 199 0 90.00 3.00 // Rebalance1
// (rest)
end 96.0
Listing 4: Notelist for “Daisy Bell”, Iteration #2. Vowels and diphthongs are highlighted in green.
To hear a realization, click here.
“Daisy Bell” Iteration #2
Listing 4 presents the second-iteration synthesis of “Daisy Bell”.
Notice that the notes are exactly the same as in Listing 2, as are the ramp statements for the “Frequency” contour (contour #2 of voice #1). Likewise the “Amplitude” contour (contour #1 of voice #1) behaves as before, holding constant at 6000 throughout the duration of the song.
What's new here is the active role of the three vocal-resonance contours, “Formant 1” (contour #3 of voice #1), “Formant 2” (contour #4 of voice #1), and “Formant 3” (contour #5 of voice #1).
These indications are color-coded in green.
Notice that all formant data passes to the instruments via ramp statements rather than via note parameters.
Encoding the formant data for a simple vowel requires a block of three ramp statements, all occupying the same time segment (that is, all sharing the same start time and duration). See, for example, the three ramps starting at time 3.00 and holding for 2 seconds. These produce the vowel i, which Listing 3 specifies as the appropriate vowel for the second syllable of “daisy”.
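A block like this is regular enough to generate mechanically. The Python sketch below is a hypothetical helper, not part of the author's tools. It assumes the ramp fields are voice, contour, start time, duration, origin, and goal, which is inferred from Listing 4 rather than from any documentation, and it ignores the column alignment used in the listing.

```python
def vowel_block(start, duration, formants, voice=1, first_contour=3):
    """Build the three ramp statements that hold one vowel steady.

    `formants` is (F1, F2, F3) in Hz. Contours 3, 4, and 5 carry
    Formant 1, 2, and 3, as described for Instrument #121: Mouth1.
    The field order is an assumption read off Listing 4.
    """
    lines = []
    for index, freq in enumerate(formants):
        lines.append(
            f"ramp {voice} {first_contour + index} "
            f"{start:.2f} {duration:.2f} {freq} {freq} // Formant {index + 1}"
        )
    return lines


if __name__ == "__main__":
    # The vowel i (277, 2208, 3079 Hz) starting at time 3.00 and held for 2 seconds.
    for line in vowel_block(3.0, 2.0, (277, 2208, 3079)):
        print(line)
```

Origin and goal are equal on every line because a simple vowel holds its formants steady for the whole segment.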
Encoding the formant data for a diphthong requires three blocks, each encompassing three ramp statements.
See, for example, the three blocks starting at time 0.00. The first block starts at time 0.00 and holds for 200 msec.;
the second block starts at time 0.20 and transitions for 300 msec.; the third block starts at time 0.50 and holds for 2.5 seconds.
These three blocks produce the diphthong eɪ, which Listing 3 specifies for the first syllable of “daisy”.
The outer blocks are steady-state, while the inner block effects the diphthong proper.
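The hold, transition, hold pattern can be sketched the same way. The Python helper below is hypothetical and rests on the same assumption about ramp fields (voice, contour, start, duration, origin, goal) inferred from Listing 4. The 200 msec hold and 300 msec transition defaults mirror the example just described and are not fixed rules.

```python
def ramp_block(start, duration, origin, goal, voice=1, first_contour=3):
    """Three ramp statements moving (F1, F2, F3) from `origin` to `goal`."""
    return [
        f"ramp {voice} {first_contour + index} "
        f"{start:.2f} {duration:.2f} {begin} {end} // Formant {index + 1}"
        for index, (begin, end) in enumerate(zip(origin, goal))
    ]


def diphthong_blocks(start, total, source, target, hold=0.2, glide=0.3):
    """Return three blocks: hold the source, glide to the target, hold the target."""
    glide_start = start + hold
    settle_start = glide_start + glide
    settle = total - hold - glide  # whatever time remains after the glide
    return [
        ramp_block(start, hold, source, source),
        ramp_block(glide_start, glide, source, target),
        ramp_block(settle_start, settle, target, target),
    ]


if __name__ == "__main__":
    # The diphthong eɪ over a 3-second syllable, formants from Table 1.
    for block in diphthong_blocks(0.0, 3.0, (414, 2065, 2570), (344, 2170, 2660)):
        print("\n".join(block))
        print()
```

Only the middle block has differing origin and goal values, which matches the observation that the outer blocks are steady-state.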
Next topic: Glides and Liquids
© Charles Ames |
Page created: 2014-02-20 |
Last updated: 2017-06-12 |