Speech Synthesis:
Vowels and Diphthongs

Introduction

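Speech synthesis employs a source-filter model of production. Although originally applied to speech by Gunnar Fant, the source-filter model extends to sound production in general. It consists of two phases: an excitation phase, in which a source generates a harmonically rich signal, and a resonance phase, in which a filter shapes the spectrum of that signal.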

For background reading on vowels I direct you to James Kirby's PowerPoint, Phonetic analysis of vowels.

Orchestra

    
Figure 1 (a): Instrument #101: Buzz1 realizes the excitation phase for vowels, diphthongs, glides, and liquids.
Figure 1 (b): Instrument #119: RMS1 captures the power envelope from Buzz1 output.
Figure 1 (c): Instrument #121: Mouth1 realizes the resonance phase for vowels, diphthongs, glides, and liquids.
Figure 1 (d): Instrument #199: Rebalance1 restores the power envelope from Buzz1 to the audio output from Mouth1 and buffers the result out to file.

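A quartet of instruments is sufficient to synthesize vocal sounds not just for the vowels and diphthongs explored on the present page, but also for the glides and liquids explored in the next iteration. This quartet consists of Instrument #101: Buzz1 (excitation), Instrument #119: RMS1 (power-envelope capture), Instrument #121: Mouth1 (resonance), and Instrument #199: Rebalance1 (power-envelope restoration and output to file).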

Issues that arise when filtering is applied to vowel sounds are worked through in Emulating Oral Resonances, and the insights gained there have been incorporated into Instrument #121: Mouth1. The main thing to understand about Instrument #121: Mouth1 is that it employs three band-pass resonators to produce low (Unit #3), middle (Unit #4), and high (Unit #5) formant peaks. Furthermore, the center frequencies of these three resonators are supplied by the “Formant 1” (contour #3), “Formant 2” (contour #4), and “Formant 3” (contour #5) contours, respectively.

(The correspondence between unit numbers and contour numbers is coincidental.)
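The envelope capture and restoration attributed to RMS1 and Rebalance1 in Figure 1 can be sketched roughly as follows. This is a minimal illustration only: the block-wise RMS measurement, the 441-sample window, and the function names `rms_envelope` and `rebalance` are all assumptions made here, not the internals of the actual instruments.

```python
import math


def rms_envelope(signal, window):
    """RMS level of each consecutive window of samples."""
    levels = []
    for start in range(0, len(signal), window):
        chunk = signal[start:start + window]
        levels.append(math.sqrt(sum(s * s for s in chunk) / len(chunk)))
    return levels


def rebalance(filtered, reference, window=441):
    """Scale each window of `filtered` so its RMS matches `reference`.

    Both signals are assumed to have the same length.
    """
    target = rms_envelope(reference, window)
    actual = rms_envelope(filtered, window)
    out = []
    for index, start in enumerate(range(0, len(filtered), window)):
        # Guard against silent windows, where no gain can restore the level.
        gain = target[index] / actual[index] if actual[index] > 0.0 else 0.0
        out.extend(s * gain for s in filtered[start:start + window])
    return out
```

Here `reference` plays the role of the Buzz1 output and `filtered` the role of the Mouth1 output; a production design would more likely smooth the envelope continuously rather than step it once per window.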

IPA | Description | Examples | F1 (Hz) | F2 (Hz) | F3 (Hz)
i | close front unrounded vowel (long E) | see, heat | 277 | 2208 | 3079
y | close front rounded | yes | 277 | 1937 | 2232
ɨ/ʉ | close central rounded | | 277 | 1520 | 2310
ɯ | close back unrounded | | 277 | 1218 | 2500
V1 | | | 227 | 845 | 2460
u | close back rounded (long OO) | boot | 277 | 553 | 2420
ɪ | near-close near-front unrounded (short I) | hit, sitting | 344 | 2170 | 2660
ʏ | near-close near-front rounded | | 344 | 1770 | 2230
ɨ† | close central unrounded | | 344 | 1507 | 2390
ɯ† | close back unrounded | | 344 | 1228 | 2500
ʊ | near-close near-back (short OO) | put, could | 344 | 635 | 2413
e | close-mid front unrounded | met, bed | 414 | 2065 | 2570
ø | close-mid front rounded | | 414 | 1608 | 2250
ə | mid central (schwa; short U) | | 414 | 1516 | 2500
ɤ | close-mid back unrounded | book | 414 | 1238 | 2500
o | close-mid back rounded (long O) | tow | 414 | 721 | 2406
'e' | | | 487 | 1928 | 2580
'ə'/'θ' | | | 487 | 1492 | 2505
'ɣ' | | | 487 | 1248 | 2500
V2 | | | 487 | 845 | 2460
'o' | | | 487 | 815 | 2393
ε | open-mid front unrounded (short E) | pet | 565 | 1819 | 2528
œ | open-mid front rounded | | 565 | 1520 | 2500
ɜ | open-mid central unrounded | | 565 | 1462 | 2500
ʌ | open-mid back unrounded (short U) | cup, luck | 565 | 1258 | 2500
ɔ | open-mid back rounded | pot | 565 | 915 | 2373
æ | near-open front unrounded (short A) | cat, black | 648 | 1712 | 2490
ɐ | near-open central | cut | 648 | 1405 | 2500
ɒ† | | | 648 | 1023 | 2500
a† | | | 735 | 1498 | 2537
ɑ | open back unrounded (short O) | hot, rock | 735 | 1278 | 2500
ɒ | open back rounded | | 735 | 1141 | 2280
a | open front unrounded | arm, father | 800 | 1228 | 2500

Table 1: Vowel formants (after Boë, Vallée, Schwartz and Abry, 2002). Vowels V1 and V2 are unrecognized by the International Phonetic Alphabet (IPA); they fill in the “hole [between] unrounded and rounded back vowels”. I have no idea why some IPA codes are enclosed in single quotes. Descriptions from Wikipedia and pronunciationcoach.wordpress.com. Pronunciation examples from webdesign.about.com and antimoon.com.

Vowels

After searching the internet for tables of formant frequencies, I settled on adapting the one by Boë, Vallée, Schwartz and Abry presented in Table 1. This table is particularly suitable for the current purpose because it fills in “holes” not officially recognized by the IPA. A phonetician might regard such unofficial values as a fault, but our purpose here is to explore the diversity of vocal sounds.

Understand that the formant values presented in Table 1 represent a male speaker; the values will be proportionately higher for women and children. Table II on page 183 of Peterson and Barney compares vowel formants for men, women, and children. I analyzed Peterson and Barney's data and found that formant values for women average about 116% of the corresponding formant values for men. For children the average was around 135% of the values for men.
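As a rough illustration of those ratios, the sketch below scales a few male formant triples from Table 1 by 1.16 and 1.35. Applying one uniform ratio to every formant is a simplifying assumption (the ratios quoted above are averages), and the names `scale_formants`, `WOMEN_RATIO`, and `CHILDREN_RATIO` are invented here for the example.

```python
# Male formants (Hz) for three vowels, copied from Table 1.
MALE_FORMANTS = {
    "i": (277, 2208, 3079),
    "a": (800, 1228, 2500),
    "u": (277, 553, 2420),
}

# Average ratios quoted in the text (derived from Peterson and Barney).
WOMEN_RATIO = 1.16
CHILDREN_RATIO = 1.35


def scale_formants(formants, ratio):
    """Multiply every formant by the same ratio and round to whole Hz."""
    return tuple(round(f * ratio) for f in formants)


if __name__ == "__main__":
    for vowel, formants in MALE_FORMANTS.items():
        print(vowel,
              scale_formants(formants, WOMEN_RATIO),
              scale_formants(formants, CHILDREN_RATIO))
```

The scaled triples could be substituted for the Table 1 values in formant ramps to approximate a higher voice, though a real voice would not shift every formant by exactly the same amount.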

The IPA chart reproduced in Figure 2 places vowels in context:


Figure 2: International Phonetic Association chart of vowels.

Figure 3 (a): Frequency response for vowel a (as in father) with formants at 800, 1228, and 2500 Hz.
Figure 3 (b): Frequency response for vowel æ (as in cat) with formants at 648, 1712, and 2490 Hz.
Figure 3 (c): Frequency response for vowel i (as in see) with formants at 277, 2208, and 3079 Hz.
Figure 3 (d): Frequency response for vowel o (as in tow) with formants at 414, 721, and 2406 Hz.
Figure 3 (e): Frequency response for vowel u (as in boot) with formants at 277, 553, and 2420 Hz.

The frequency-response graphs in Figures 3 (a), (b), (c), (d), and (e) were created by sweeping a sine tone linearly from 110 to 3500 Hz and processing the result through three cascaded resonators using the indicated formant peaks. A power envelope was then extracted using the RMS unit and the logarithm of the result was plotted.
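The measurement procedure can be approximated in a few lines of Python. The sketch below makes several assumptions that are not taken from the original instruments: it uses a textbook two-pole resonator, a fixed 100 Hz bandwidth, steady sine tones at stepped frequencies in place of a continuous sweep, and a decibel scale for the logarithm.

```python
import math

SAMPLE_RATE = 44100  # matches "set rate 44100" in the notelist


def resonate(signal, center_hz, bandwidth_hz):
    """Two-pole band-pass resonator (textbook form, not necessarily Mouth1's)."""
    r = math.exp(-math.pi * bandwidth_hz / SAMPLE_RATE)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * center_hz / SAMPLE_RATE)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def response_db(test_hz, formants, bandwidth_hz=100.0, seconds=0.1):
    """Log RMS level of a sine at test_hz after the cascaded resonators."""
    n = int(seconds * SAMPLE_RATE)
    signal = [math.sin(2.0 * math.pi * test_hz * i / SAMPLE_RATE)
              for i in range(n)]
    for center_hz in formants:
        signal = resonate(signal, center_hz, bandwidth_hz)
    tail = signal[n // 2:]  # skip the start-up transient
    rms = math.sqrt(sum(s * s for s in tail) / len(tail))
    return 20.0 * math.log10(rms)


if __name__ == "__main__":
    vowel_a = (800, 1228, 2500)  # formants from Figure 3 (a)
    for hz in range(110, 3501, 226):
        print(hz, round(response_db(hz, vowel_a), 1))
```

Because the resonators here are not gain-normalized, only the shape of the printed curve is meaningful, not its absolute level.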

IPA | Description | Example
 | long A | cake
 | long I | kite
 | long O | rope
ju | long U | cue
æʊ | OU (AW) | house
ɔʊ | OY | boy

Table 2: Familiar English diphthongs.

Diphthongs

Diphthongs happen when two vowels occur consecutively in a word. The transition from the source vowel to the target vowel is gradual, taking around 300 msec. One site, pronunciationcoach.wordpress.com, takes the position that all sounds designated in English as “long vowels” are actually diphthongs. On that basis, the same site contends that two further diphthongs, AW and OY, should also be recognized as vowels. Table 2 transcribes these six sounds (four long vowels plus AW and OY) into the International Phonetic Alphabet, courtesy of learn-foreign-language-phonetics.com. Spectrograms of diphthongs in progress are provided by James Kirby's PowerPoint, “Spectral features of vowels; spectrograms”.
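The gradual transition can be pictured as a straight-line glide between two formant triples. The sketch below assumes linear interpolation, a 300 msec glide, and a 200 msec initial hold on the source vowel; the hold length and the function name `diphthong_formants` are illustrative choices, not something prescribed by the sources cited above.

```python
def diphthong_formants(source, target, t, hold=0.2, glide=0.3):
    """Formant triple at time t (seconds) for a hold-glide-hold diphthong."""
    if t <= hold:
        return tuple(float(f) for f in source)
    if t >= hold + glide:
        return tuple(float(f) for f in target)
    fraction = (t - hold) / glide  # 0.0 at start of glide, 1.0 at its end
    return tuple(s + (d - s) * fraction for s, d in zip(source, target))


if __name__ == "__main__":
    e_vowel = (414, 2065, 2570)   # e, from Table 1
    i_vowel = (344, 2170, 2660)   # ɪ, from Table 1
    for t in (0.0, 0.2, 0.35, 0.5, 1.0):
        print(t, diphthong_formants(e_vowel, i_vowel, t))
```

Halfway through the glide, each formant sits midway between its source and target values.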

orch /Users/charlesames/Scratch/SpeechOrch.xml
orch /Users/charlesames/Documents/Sound/SpeechOrch.xml
set rate 44100
set bits 16
set norm 1
name Daisy2

// Dai-
ramp 1 1 0.00 0.20 6000 6000 // Amplitude
ramp 1 3 0.00 0.20 414 414 // Formant 1
ramp 1 4 0.00 0.20 2065 2065 // Formant 2
ramp 1 5 0.00 0.20 2570 2570 // Formant 3
ramp 1 1 0.20 0.30 6000 6000 // Amplitude
ramp 1 3 0.20 0.30 414 344 // Formant 1
ramp 1 4 0.20 0.30 2065 2170 // Formant 2
ramp 1 5 0.20 0.30 2570 2660 // Formant 3
ramp 1 1 0.50 2.50 6000 6000 // Amplitude
ramp 1 3 0.50 2.50 344 344 // Formant 1
ramp 1 4 0.50 2.50 2170 2170 // Formant 2
ramp 1 5 0.50 2.50 2660 2660 // Formant 3
ramp 1 2 0.00 2.98 293.7 293.7 // D4
ramp 1 2 2.98 0.02 293.7 246.9 // D4-B3
note 1 1 101 0 0.00 3.00 1 0.03 // Buzz1
// -sy
ramp 1 2 3.00 3.00 246.9 246.9 // B3
ramp 1 1 3.00 2.00 6000 6000 // Amplitude
ramp 1 3 3.00 2.00 277 277 // Formant 1
ramp 1 4 3.00 2.00 2208 2208 // Formant 2
ramp 1 5 3.00 2.00 3079 3079 // Formant 3
note 2 1 101 1 3.00 2.00 1 0 // Buzz1
note 3 1 119 0 0.00 5.00 0.1 // RMS1
note 4 1 121 0 0.00 5.00 // Mouth1
note 5 1 199 0 0.00 5.00 // Rebalance1

// (rest)
ramp 1 1 5.00 1.00 6000 6000 // Amplitude
ramp 1 3 5.00 1.00 277 414 // Formant 1
ramp 1 4 5.00 1.00 2208 2065 // Formant 2
ramp 1 5 5.00 1.00 3079 2570 // Formant 3

// Dai-
ramp 1 2 6.00 2.98 196 196 // G3
ramp 1 2 8.98 0.02 196 146.8 // G3-D3
ramp 1 1 6.00 0.20 6000 6000 // Amplitude
ramp 1 3 6.00 0.20 414 414 // Formant 1
ramp 1 4 6.00 0.20 2065 2065 // Formant 2
ramp 1 5 6.00 0.20 2570 2570 // Formant 3
ramp 1 1 6.20 0.30 6000 6000 // Amplitude
ramp 1 3 6.20 0.30 414 344 // Formant 1
ramp 1 4 6.20 0.30 2065 2170 // Formant 2
ramp 1 5 6.20 0.30 2570 2660 // Formant 3
ramp 1 1 6.50 2.50 6000 6000 // Amplitude
ramp 1 3 6.50 2.50 344 344 // Formant 1
ramp 1 4 6.50 2.50 2170 2170 // Formant 2
ramp 1 5 6.50 2.50 2660 2660 // Formant 3
note 6 1 101 0 6.00 3.00 1 0.03 // Buzz1
// -sy
ramp 1 2 9.00 3.00 146.8 146.8 // D3
ramp 1 1 9.00 2.00 6000 6000 // Amplitude
ramp 1 3 9.00 2.00 277 277 // Formant 1
ramp 1 4 9.00 2.00 2208 2208 // Formant 2
ramp 1 5 9.00 2.00 3079 3079 // Formant 3
note 7 1 101 6 9.00 2.00 1 0 // Buzz1
note 8 1 119 0 6.00 5.00 0.1 // RMS1
note 9 1 121 0 6.00 5.00 // Mouth1
note 10 1 199 0 6.00 5.00 // Rebalance1

// (rest)
ramp 1 1 11.00 1.00 6000 6000 // Amplitude
ramp 1 3 11.00 1.00 277 344 // Formant 1
ramp 1 4 11.00 1.00 2208 2170 // Formant 2
ramp 1 5 11.00 1.00 3079 2660 // Formant 3

// Give
ramp 1 2 12.00 1.00 164.8 164.8 // E3
ramp 1 1 12.00 1.00 6000 6000 // Amplitude
ramp 1 3 12.00 1.00 344 344 // Formant 1
ramp 1 4 12.00 1.00 2170 2170 // Formant 2
ramp 1 5 12.00 1.00 2660 2660 // Formant 3
note 11 1 101 0 12.00 0.90 1 0.03 // Buzz1
note 12 1 119 0 12.00 0.90 0.1 // RMS1
note 13 1 121 0 12.00 0.90 // Mouth1
note 14 1 199 0 12.00 0.90 // Rebalance1

// me
ramp 1 2 13.00 1.00 185 185 // F#3
ramp 1 1 13.00 1.00 6000 6000 // Amplitude
ramp 1 3 13.00 1.00 277 277 // Formant 1
ramp 1 4 13.00 1.00 2208 2208 // Formant 2
ramp 1 5 13.00 1.00 3079 3079 // Formant 3
note 15 1 101 0 13.00 0.90 1 0.03 // Buzz1
note 16 1 119 0 13.00 0.90 // RMS1
note 17 1 121 0 13.00 0.90 // Mouth1
note 18 1 199 0 13.00 0.90 // Rebalance1

// your
ramp 1 2 14.00 1.00 196 196 // G3
ramp 1 1 14.00 1.00 6000 6000 // Amplitude
ramp 1 3 14.00 1.00 414 414 // Formant 1
ramp 1 4 14.00 1.00 1516 1516 // Formant 2
ramp 1 5 14.00 1.00 2500 2500 // Formant 3
note 19 1 101 0 14.00 0.90 1 0.03 // Buzz1
note 20 1 119 0 14.00 0.90 0.1 // RMS1
note 21 1 121 0 14.00 0.90 // Mouth1
note 22 1 199 0 14.00 0.90 // Rebalance1

// an-
ramp 1 2 15.00 1.98 164.8 164.8 // E3
ramp 1 2 16.98 0.02 164.8 196 // E3-G3
ramp 1 1 15.00 2.00 6000 6000 // Amplitude
ramp 1 3 15.00 2.00 648 648 // Formant 1
ramp 1 4 15.00 2.00 1712 1712 // Formant 2
ramp 1 5 15.00 2.00 2490 2490 // Formant 3
note 23 1 101 0 15.00 2.00 1 0.03 // Buzz1
// -swer
ramp 1 2 17.00 1.00 196 196 // G3
ramp 1 1 17.00 1.00 6000 6000 // Amplitude
ramp 1 3 17.00 1.00 414 414 // Formant 1
ramp 1 4 17.00 1.00 1516 1516 // Formant 2
ramp 1 5 17.00 1.00 2500 2500 // Formant 3
note 24 1 101 23 17.00 0.90 1 0 // Buzz1
note 25 1 119 0 15.00 2.90 // RMS1
note 26 1 121 0 15.00 2.90 // Mouth1
note 27 1 199 0 15.00 2.90 // Rebalance1

// do
ramp 1 2 18.00 6.00 146.8 146.8 // D3
ramp 1 1 18.00 3.00 6000 6000 // Amplitude
ramp 1 3 18.00 3.00 277 277 // Formant 1
ramp 1 4 18.00 3.00 553 553 // Formant 2
ramp 1 5 18.00 3.00 2420 2420 // Formant 3
note 28 1 101 0 18.00 3.00 1 0.03 // Buzz1
note 29 1 119 0 18.00 3.00 0.1 // RMS1
note 30 1 121 0 18.00 3.00 // Mouth1
note 31 1 199 0 18.00 3.00 // Rebalance1

// (rest)
ramp 1 1 21.00 3.00 6000 6000 // Amplitude
ramp 1 3 21.00 3.00 277 800 // Formant 1
ramp 1 4 21.00 3.00 553 1228 // Formant 2
ramp 1 5 21.00 3.00 2420 2500 // Formant 3

// I'm
ramp 1 2 24.00 3.00 220 220 // A3
ramp 1 1 24.00 0.30 6000 6000 // Amplitude
ramp 1 3 24.00 0.30 800 800 // Formant 1
ramp 1 4 24.00 0.30 1228 1228 // Formant 2
ramp 1 5 24.00 0.30 2500 2500 // Formant 3
ramp 1 1 24.30 0.30 6000 6000 // Amplitude
ramp 1 3 24.30 0.30 800 344 // Formant 1
ramp 1 4 24.30 0.30 1228 2170 // Formant 2
ramp 1 5 24.30 0.30 2500 2660 // Formant 3
ramp 1 1 24.60 2.40 6000 6000 // Amplitude
ramp 1 3 24.60 2.40 344 344 // Formant 1
ramp 1 4 24.60 2.40 2170 2170 // Formant 2
ramp 1 5 24.60 2.40 2660 2660 // Formant 3
note 32 1 101 0 24.00 2.90 1 0.03 // Buzz1
note 33 1 119 0 24.00 2.90 // RMS1
note 34 1 121 0 24.00 2.90 // Mouth1
note 35 1 199 0 24.00 2.90 // Rebalance1

// half
ramp 1 2 27.00 3.00 293.7 293.7 // D4
ramp 1 1 27.00 3.00 6000 6000 // Amplitude
ramp 1 3 27.00 3.00 800 800 // Formant 1
ramp 1 4 27.00 3.00 1228 1228 // Formant 2
ramp 1 5 27.00 3.00 2500 2500 // Formant 3
note 36 1 101 0 27.00 3.00 1 0.03 // Buzz1
note 37 1 119 0 27.00 3.00 0.1 // RMS1
note 38 1 121 0 27.00 3.00 // Mouth1
note 39 1 199 0 27.00 3.00 // Rebalance1

// cra-
ramp 1 2 30.00 1.90 246.9 246.9 // B3
ramp 1 2 31.90 0.10 246.9 220 // B3-A3
ramp 1 2 32.00 0.90 220 220 // A3
ramp 1 2 32.90 0.10 220 196 // A3-G3
ramp 1 1 30.00 1.85 6000 6000 // Amplitude
ramp 1 3 30.00 1.85 414 414 // Formant 1
ramp 1 4 30.00 1.85 2065 2065 // Formant 2
ramp 1 5 30.00 1.85 2570 2570 // Formant 3
ramp 1 1 31.85 0.15 6000 6000 // Amplitude
ramp 1 3 31.85 0.15 414 344 // Formant 1
ramp 1 4 31.85 0.15 2065 2170 // Formant 2
ramp 1 5 31.85 0.15 2570 2660 // Formant 3
ramp 1 1 32.00 0.85 6000 6000 // Amplitude
ramp 1 3 32.00 0.85 344 344 // Formant 1
ramp 1 4 32.00 0.85 2170 2170 // Formant 2
ramp 1 5 32.00 0.85 2660 2660 // Formant 3
ramp 1 1 32.85 0.15 6000 6000 // Amplitude
ramp 1 3 32.85 0.15 344 277 // Formant 1
ramp 1 4 32.85 0.15 2170 2208 // Formant 2
ramp 1 5 32.85 0.15 2660 3079 // Formant 3
note 40 1 101 0 30.00 3.00 1 0.03 // Buzz1
// -zy
ramp 1 2 33.00 2.00 196 196 // G3
ramp 1 1 33.00 2.00 6000 6000 // Amplitude
ramp 1 3 33.00 2.00 277 277 // Formant 1
ramp 1 4 33.00 2.00 2208 2208 // Formant 2
ramp 1 5 33.00 2.00 3079 3079 // Formant 3
note 41 1 101 40 33.00 1.90 1 0 // Buzz1
note 42 1 119 0 30.00 4.90 0.1 // RMS1
note 43 1 121 0 30.00 4.90 // Mouth1
note 44 1 199 0 30.00 4.90 // Rebalance1

// and
ramp 1 2 35.00 1.00 185 185 // F#3
ramp 1 1 35.00 1.00 6000 6000 // Amplitude
ramp 1 3 35.00 1.00 414 414 // Formant 1
ramp 1 4 35.00 1.00 1516 1516 // Formant 2
ramp 1 5 35.00 1.00 2500 2500 // Formant 3
note 45 1 101 0 35.00 0.90 1 0.03 // Buzz1
note 46 1 119 0 35.00 0.90 // RMS1
note 47 1 121 0 35.00 0.90 // Mouth1
note 48 1 199 0 35.00 0.90 // Rebalance1

// all
ramp 1 2 36.00 1.00 164.8 164.8 // E3
ramp 1 1 36.00 1.00 6000 6000 // Amplitude
ramp 1 3 36.00 1.00 800 800 // Formant 1
ramp 1 4 36.00 1.00 1228 1228 // Formant 2
ramp 1 5 36.00 1.00 2500 2500 // Formant 3
note 49 1 101 0 36.00 0.90 1 0.03 // Buzz1
note 50 1 119 0 36.00 0.90 0.1 // RMS1
note 51 1 121 0 36.00 0.90 // Mouth1
note 52 1 199 0 36.00 0.90 // Rebalance1

// for
ramp 1 2 37.00 1.00 185 185 // F#3
ramp 1 1 37.00 1.00 6000 6000 // Amplitude
ramp 1 3 37.00 1.00 414 414 // Formant 1
ramp 1 4 37.00 1.00 721 721 // Formant 2
ramp 1 5 37.00 1.00 2406 2406 // Formant 3
note 53 1 101 0 37.00 0.90 1 0.1 // Buzz1
note 54 1 119 0 37.00 0.90 0.1 // RMS1
note 55 1 121 0 37.00 0.90 // Mouth1
note 56 1 199 0 37.00 0.90 // Rebalance1

// the
ramp 1 2 38.00 1.00 196 196 // G3
ramp 1 1 38.00 1.00 6000 6000 // Amplitude
ramp 1 3 38.00 1.00 414 414 // Formant 1
ramp 1 4 38.00 1.00 1516 1516 // Formant 2
ramp 1 5 38.00 1.00 2500 2500 // Formant 3
note 57 1 101 0 38.00 0.90 1 0.03 // Buzz1
note 58 1 119 0 38.00 0.90 0.1 // RMS1
note 59 1 121 0 38.00 0.90 // Mouth1
note 60 1 199 0 38.00 0.90 // Rebalance1

// love
ramp 1 2 39.00 2.00 220 220 // A3
ramp 1 1 39.00 2.00 6000 6000 // Amplitude
ramp 1 3 39.00 2.00 414 414 // Formant 1
ramp 1 4 39.00 2.00 1516 1516 // Formant 2
ramp 1 5 39.00 2.00 2500 2500 // Formant 3
note 61 1 101 0 39.00 1.90 1 0.03 // Buzz1
note 62 1 119 0 39.00 1.90 0.1 // RMS1
note 63 1 121 0 39.00 1.90 // Mouth1
note 64 1 199 0 39.00 1.90 // Rebalance1

// of
ramp 1 2 41.00 1.00 246.9 246.9 // B3
ramp 1 1 41.00 0.45 6000 6000 // Amplitude
ramp 1 3 41.00 0.45 414 414 // Formant 1
ramp 1 4 41.00 0.45 1516 1516 // Formant 2
ramp 1 5 41.00 0.45 2500 2500 // Formant 3
ramp 1 1 41.45 0.05 6000 6000 // Amplitude
ramp 1 3 41.45 0.05 414 414 // Formant 1
ramp 1 4 41.45 0.05 1516 721 // Formant 2
ramp 1 5 41.45 0.05 2500 2406 // Formant 3
ramp 1 1 41.50 0.50 6000 6000 // Amplitude
ramp 1 3 41.50 0.50 414 414 // Formant 1
ramp 1 4 41.50 0.50 721 721 // Formant 2
ramp 1 5 41.50 0.50 2406 2406 // Formant 3
note 65 1 101 0 41.00 0.90 1 0.03 // Buzz1
note 66 1 119 0 41.00 0.90 0.1 // RMS1
note 67 1 121 0 41.00 0.90 // Mouth1
note 68 1 199 0 41.00 0.90 // Rebalance1

// you
ramp 1 2 42.00 5.00 220 220 // A3
ramp 1 1 42.00 3.00 6000 6000 // Amplitude
ramp 1 3 42.00 3.00 277 277 // Formant 1
ramp 1 4 42.00 3.00 553 553 // Formant 2
ramp 1 5 42.00 3.00 2420 2420 // Formant 3
note 69 1 101 0 42.00 2.90 1 0.03 // Buzz1
note 70 1 119 0 42.00 2.90 0.1 // RMS1
note 71 1 121 0 42.00 2.90 // Mouth1
note 72 1 199 0 42.00 2.90 // Rebalance1

// (rest)
ramp 1 1 45.00 2.00 6000 6000 // Amplitude
ramp 1 3 45.00 2.00 277 344 // Formant 1
ramp 1 4 45.00 2.00 553 2170 // Formant 2
ramp 1 5 45.00 2.00 2420 2660 // Formant 3

// It
ramp 1 2 47.00 1.00 246.9 246.9 // B3
ramp 1 1 47.00 1.00 6000 6000 // Amplitude
ramp 1 3 47.00 1.00 344 344 // Formant 1
ramp 1 4 47.00 1.00 2170 2170 // Formant 2
ramp 1 5 47.00 1.00 2660 2660 // Formant 3
note 73 1 101 0 47.00 0.50 1 0.03 // Buzz1
note 74 1 119 0 47.00 0.90 0.1 // RMS1
note 75 1 121 0 47.00 0.90 // Mouth1
note 76 1 199 0 47.00 0.90 // Rebalance1

// won't
ramp 1 2 48.00 1.00 261.6 261.6 // C4
ramp 1 1 48.00 0.20 6000 6000 // Amplitude
ramp 1 3 48.00 0.20 414 414 // Formant 1
ramp 1 4 48.00 0.20 721 721 // Formant 2
ramp 1 5 48.00 0.20 2406 2406 // Formant 3
ramp 1 1 48.20 0.30 6000 6000 // Amplitude
ramp 1 3 48.20 0.30 414 344 // Formant 1
ramp 1 4 48.20 0.30 721 635 // Formant 2
ramp 1 5 48.20 0.30 2406 2413 // Formant 3
ramp 1 1 48.50 0.50 6000 6000 // Amplitude
ramp 1 3 48.50 0.50 344 344 // Formant 1
ramp 1 4 48.50 0.50 635 635 // Formant 2
ramp 1 5 48.50 0.50 2413 2413 // Formant 3
note 77 1 101 0 48.00 0.90 1 0.03 // Buzz1
note 78 1 119 0 48.00 0.90 // RMS1
note 79 1 121 0 48.00 0.90 // Mouth1
note 80 1 199 0 48.00 0.90 // Rebalance1

// be
ramp 1 2 49.00 1.00 246.9 246.9 // B3
ramp 1 1 49.00 1.00 6000 6000 // Amplitude
ramp 1 3 49.00 1.00 344 344 // Formant 1
ramp 1 4 49.00 1.00 2170 2170 // Formant 2
ramp 1 5 49.00 1.00 2660 2660 // Formant 3
note 81 1 101 0 49.00 0.90 1 0.03 // Buzz1
note 82 1 119 0 49.00 0.90 0.1 // RMS1
note 83 1 121 0 49.00 0.90 // Mouth1
note 84 1 199 0 49.00 0.90 // Rebalance1

// a
ramp 1 2 50.00 1.00 220 220 // A3
ramp 1 1 50.00 1.00 6000 6000 // Amplitude
ramp 1 3 50.00 1.00 414 414 // Formant 1
ramp 1 4 50.00 1.00 1516 1516 // Formant 2
ramp 1 5 50.00 1.00 2500 2500 // Formant 3
note 85 1 101 0 50.00 0.90 1 0.03 // Buzz1
note 86 1 119 0 50.00 0.90 0.1 // RMS1
note 87 1 121 0 50.00 0.90 // Mouth1
note 88 1 199 0 50.00 0.90 // Rebalance1

// sty-
ramp 1 2 51.00 1.98 293.7 293.7 // D4
ramp 1 2 52.98 0.02 293.7 246.9 // D4-B3
ramp 1 1 51.00 0.30 6000 6000 // Amplitude
ramp 1 3 51.00 0.30 800 800 // Formant 1
ramp 1 4 51.00 0.30 1228 1228 // Formant 2
ramp 1 5 51.00 0.30 2500 2500 // Formant 3
ramp 1 1 51.30 0.30 6000 6000 // Amplitude
ramp 1 3 51.30 0.30 800 277 // Formant 1
ramp 1 4 51.30 0.30 1228 2208 // Formant 2
ramp 1 5 51.30 0.30 2500 3079 // Formant 3
ramp 1 1 51.60 1.40 6000 6000 // Amplitude
ramp 1 3 51.60 1.40 277 277 // Formant 1
ramp 1 4 51.60 1.40 2208 2208 // Formant 2
ramp 1 5 51.60 1.40 3079 3079 // Formant 3
note 89 1 101 0 51.00 2.00 1 0.03 // Buzz1
// -lish
ramp 1 2 53.00 1.00 246.9 246.9 // B3
ramp 1 1 53.00 1.00 6000 6000 // Amplitude
ramp 1 3 53.00 1.00 344 344 // Formant 1
ramp 1 4 53.00 1.00 2170 2170 // Formant 2
ramp 1 5 53.00 1.00 2660 2660 // Formant 3
note 90 1 101 89 53.00 0.90 1 0 // Buzz1
note 91 1 119 0 51.00 2.90 0.1 // RMS1
note 92 1 121 0 51.00 2.90 // Mouth1
note 93 1 199 0 51.00 2.90 // Rebalance1

// mar-
ramp 1 2 54.00 0.98 220 220 // A3
ramp 1 2 54.98 0.02 220 196 // A3-G3
ramp 1 1 54.00 1.00 6000 6000 // Amplitude
ramp 1 3 54.00 1.00 414 414 // Formant 1
ramp 1 4 54.00 1.00 2065 2065 // Formant 2
ramp 1 5 54.00 1.00 2570 2570 // Formant 3
note 94 1 101 0 54.00 1.00 1 0.03 // Buzz1
// -riage
ramp 1 2 55.00 4.00 196 196 // G3
ramp 1 1 55.00 2.00 6000 6000 // Amplitude
ramp 1 3 55.00 2.00 344 344 // Formant 1
ramp 1 4 55.00 2.00 2170 2170 // Formant 2
ramp 1 5 55.00 2.00 2660 2660 // Formant 3
note 95 1 101 94 55.00 1.90 1 0 // Buzz1
note 96 1 119 0 54.00 2.90 // RMS1
note 97 1 121 0 54.00 2.90 // Mouth1
note 98 1 199 0 54.00 2.90 // Rebalance1

// (rest)
ramp 1 1 57.00 2.00 6000 6000 // Amplitude
ramp 1 3 57.00 2.00 344 800 // Formant 1
ramp 1 4 57.00 2.00 2170 1228 // Formant 2
ramp 1 5 57.00 2.00 2660 2500 // Formant 3

// I
ramp 1 2 59.00 1.00 220 220 // A3
ramp 1 1 59.00 0.30 6000 6000 // Amplitude
ramp 1 3 59.00 0.30 800 800 // Formant 1
ramp 1 4 59.00 0.30 1228 1228 // Formant 2
ramp 1 5 59.00 0.30 2500 2500 // Formant 3
ramp 1 1 59.30 0.30 6000 6000 // Amplitude
ramp 1 3 59.30 0.30 800 344 // Formant 1
ramp 1 4 59.30 0.30 1228 2170 // Formant 2
ramp 1 5 59.30 0.30 2500 2660 // Formant 3
ramp 1 1 59.60 0.40 6000 6000 // Amplitude
ramp 1 3 59.60 0.40 344 344 // Formant 1
ramp 1 4 59.60 0.40 2170 2170 // Formant 2
ramp 1 5 59.60 0.40 2660 2660 // Formant 3
note 99 1 101 0 59.00 0.90 1 0.03 // Buzz1
note 100 1 119 0 59.00 0.90 0.1 // RMS1
note 101 1 121 0 59.00 0.90 // Mouth1
note 102 1 199 0 59.00 0.90 // Rebalance1

// can't
ramp 1 2 60.00 2.00 246.9 246.9 // B3
ramp 1 1 60.00 2.00 6000 6000 // Amplitude
ramp 1 3 60.00 2.00 648 648 // Formant 1
ramp 1 4 60.00 2.00 1712 1712 // Formant 2
ramp 1 5 60.00 2.00 2490 2490 // Formant 3
note 103 1 101 0 60.00 1.90 1 0.03 // Buzz1
note 104 1 119 0 60.00 1.90 // RMS1
note 105 1 121 0 60.00 1.90 // Mouth1
note 106 1 199 0 60.00 1.90 // Rebalance1

// af-
ramp 1 2 62.00 0.98 196 196 // G3
ramp 1 2 62.98 0.02 196 164.8 // G3-E3
ramp 1 1 62.00 0.95 6000 6000 // Amplitude
ramp 1 3 62.00 0.95 414 414 // Formant 1
ramp 1 4 62.00 0.95 1516 1516 // Formant 2
ramp 1 5 62.00 0.95 2500 2500 // Formant 3
ramp 1 1 62.95 0.05 6000 6000 // Amplitude
ramp 1 3 62.95 0.05 414 565 // Formant 1
ramp 1 4 62.95 0.05 1516 915 // Formant 2
ramp 1 5 62.95 0.05 2500 2373 // Formant 3
note 107 1 101 0 62.00 1.00 1 0.03 // Buzz1
// -ford
ramp 1 2 63.00 2.00 164.8 164.8 // E3
ramp 1 1 63.00 2.00 6000 6000 // Amplitude
ramp 1 3 63.00 2.00 565 565 // Formant 1
ramp 1 4 63.00 2.00 915 915 // Formant 2
ramp 1 5 63.00 2.00 2373 2373 // Formant 3
note 108 1 101 107 63.00 1.90 1 0 // Buzz1
note 109 1 119 0 62.00 2.90 0.1 // RMS1
note 110 1 121 0 62.00 2.90 // Mouth1
note 111 1 199 0 62.00 2.90 // Rebalance1

// a
ramp 1 2 65.00 1.00 196 196 // G3
ramp 1 1 65.00 1.00 6000 6000 // Amplitude
ramp 1 3 65.00 1.00 414 414 // Formant 1
ramp 1 4 65.00 1.00 1516 1516 // Formant 2
ramp 1 5 65.00 1.00 2500 2500 // Formant 3
note 112 1 101 0 65.00 0.90 1 0.03 // Buzz1
note 113 1 119 0 65.00 0.90 0.1 // RMS1
note 114 1 121 0 65.00 0.90 // Mouth1
note 115 1 199 0 65.00 0.90 // Rebalance1

// car-
ramp 1 2 66.00 0.98 164.8 164.8 // E3
ramp 1 2 66.98 0.02 164.8 146.8 // E3-D3
ramp 1 1 66.00 1.00 6000 6000 // Amplitude
ramp 1 3 66.00 1.00 414 414 // Formant 1
ramp 1 4 66.00 1.00 2065 2065 // Formant 2
ramp 1 5 66.00 1.00 2570 2570 // Formant 3
note 116 1 101 0 66.00 1.00 1 0.03 // Buzz1
// -riage
ramp 1 2 67.00 4.00 146.8 146.8 // D3
ramp 1 1 67.00 2.00 6000 6000 // Amplitude
ramp 1 3 67.00 2.00 344 344 // Formant 1
ramp 1 4 67.00 2.00 2170 2170 // Formant 2
ramp 1 5 67.00 2.00 2660 2660 // Formant 3
note 117 1 101 116 67.00 2.00 1 0 // Buzz1
note 118 1 119 0 66.00 3.00 0.1 // RMS1
note 119 1 121 0 66.00 3.00 // Mouth1
note 120 1 199 0 66.00 3.00 // Rebalance1

// (rest)
ramp 1 1 69.00 2.00 6000 6000 // Amplitude
ramp 1 3 69.00 2.00 344 414 // Formant 1
ramp 1 4 69.00 2.00 2170 1516 // Formant 2
ramp 1 5 69.00 2.00 2660 2500 // Formant 3

// But
ramp 1 2 71.00 1.00 146.8 146.8 // D3
ramp 1 1 71.00 1.00 6000 6000 // Amplitude
ramp 1 3 71.00 1.00 414 414 // Formant 1
ramp 1 4 71.00 1.00 1516 1516 // Formant 2
ramp 1 5 71.00 1.00 2500 2500 // Formant 3
note 121 1 101 0 71.00 0.90 1 0.03 // Buzz1
note 122 1 119 0 71.00 0.90 0.1 // RMS1
note 123 1 121 0 71.00 0.90 // Mouth1
note 124 1 199 0 71.00 0.90 // Rebalance1

// you'll
ramp 1 2 72.00 2.00 196 196 // G3
ramp 1 1 72.00 2.00 6000 6000 // Amplitude
ramp 1 3 72.00 2.00 277 277 // Formant 1
ramp 1 4 72.00 2.00 553 553 // Formant 2
ramp 1 5 72.00 2.00 2420 2420 // Formant 3
note 125 1 101 0 72.00 1.90 1 0.03 // Buzz1
note 126 1 119 0 72.00 1.90 0.1 // RMS1
note 127 1 121 0 72.00 1.90 // Mouth1
note 128 1 199 0 72.00 1.90 // Rebalance1

// look
ramp 1 2 74.00 1.00 246.9 246.9 // B3
ramp 1 1 74.00 1.00 6000 6000 // Amplitude
ramp 1 3 74.00 1.00 344 344 // Formant 1
ramp 1 4 74.00 1.00 635 635 // Formant 2
ramp 1 5 74.00 1.00 2413 2413 // Formant 3
note 129 1 101 0 74.00 0.90 1 0.03 // Buzz1
note 130 1 119 0 74.00 0.90 0.1 // RMS1
note 131 1 121 0 74.00 0.90 // Mouth1
note 132 1 199 0 74.00 0.90 // Rebalance1

// sweet
ramp 1 2 75.00 2.00 220 220 // A3
ramp 1 1 75.00 2.00 6000 6000 // Amplitude
ramp 1 3 75.00 2.00 344 344 // Formant 1
ramp 1 4 75.00 2.00 2170 2170 // Formant 2
ramp 1 5 75.00 2.00 2660 2660 // Formant 3
note 133 1 101 0 75.00 1.90 1 0.03 // Buzz1
note 134 1 119 0 75.00 1.90 0.1 // RMS1
note 135 1 121 0 75.00 1.90 // Mouth1
note 136 1 199 0 75.00 1.90 // Rebalance1

// u-
ramp 1 2 77.00 0.98 146.8 146.8 // D3
ramp 1 2 77.98 0.02 146.8 196 // D3-G3
ramp 1 1 77.00 0.90 6000 6000 // Amplitude
ramp 1 3 77.00 0.90 414 414 // Formant 1
ramp 1 4 77.00 0.90 1516 1516 // Formant 2
ramp 1 5 77.00 0.90 2500 2500 // Formant 3
ramp 1 1 77.90 0.10 6000 6000 // Amplitude
ramp 1 3 77.90 0.10 414 565 // Formant 1
ramp 1 4 77.90 0.10 1516 915 // Formant 2
ramp 1 5 77.90 0.10 2500 2373 // Formant 3
note 137 1 101 0 77.00 1.00 1 0.03 // Buzz1
// -pon
ramp 1 2 78.00 2.00 196 196 // G3
ramp 1 1 78.00 2.00 6000 6000 // Amplitude
ramp 1 3 78.00 2.00 565 565 // Formant 1
ramp 1 4 78.00 2.00 915 915 // Formant 2
ramp 1 5 78.00 2.00 2373 2373 // Formant 3
note 138 1 101 137 78.00 1.90 1 0 // Buzz1
note 139 1 119 0 77.00 2.90 // RMS1
note 140 1 121 0 77.00 2.90 // Mouth1
note 141 1 199 0 77.00 2.90 // Rebalance1

// the
ramp 1 2 80.00 1.00 246.9 246.9 // B3
ramp 1 1 80.00 1.00 6000 6000 // Amplitude
ramp 1 3 80.00 1.00 414 414 // Formant 1
ramp 1 4 80.00 1.00 1516 1516 // Formant 2
ramp 1 5 80.00 1.00 2500 2500 // Formant 3
note 142 1 101 0 80.00 0.90 1 0.03 // Buzz1
note 143 1 119 0 80.00 0.90 0.1 // RMS1
note 144 1 121 0 80.00 0.90 // Mouth1
note 145 1 199 0 80.00 0.90 // Rebalance1

// seat
ramp 1 2 81.00 1.00 220 220 // A3
ramp 1 1 81.00 1.00 6000 6000 // Amplitude
ramp 1 3 81.00 1.00 344 344 // Formant 1
ramp 1 4 81.00 1.00 2170 2170 // Formant 2
ramp 1 5 81.00 1.00 2660 2660 // Formant 3
note 146 1 101 0 81.00 0.90 1 0.03 // Buzz1
note 147 1 119 0 81.00 0.90 0.1 // RMS1
note 148 1 121 0 81.00 0.90 // Mouth1
note 149 1 199 0 81.00 0.90 // Rebalance1

// of
ramp 1 2 82.00 1.00 246.9 246.9 // B3
ramp 1 1 82.00 1.00 6000 6000 // Amplitude
ramp 1 3 82.00 1.00 414 414 // Formant 1
ramp 1 4 82.00 1.00 1516 1516 // Formant 2
ramp 1 5 82.00 1.00 2500 2500 // Formant 3
note 150 1 101 0 82.00 0.90 1 0.03 // Buzz1
note 151 1 119 0 82.00 0.90 0.1 // RMS1
note 152 1 121 0 82.00 0.90 // Mouth1
note 153 1 199 0 82.00 0.90 // Rebalance1

// a
ramp 1 2 83.00 1.00 261.6 261.6 // C4
ramp 1 1 83.00 1.00 6000 6000 // Amplitude
ramp 1 3 83.00 1.00 414 414 // Formant 1
ramp 1 4 83.00 1.00 1516 1516 // Formant 2
ramp 1 5 83.00 1.00 2500 2500 // Formant 3
note 154 1 101 0 83.00 0.90 1 0.03 // Buzz1
note 155 1 119 0 83.00 0.90 0.1 // RMS1
note 156 1 121 0 83.00 0.90 // Mouth1
note 157 1 199 0 83.00 0.90 // Rebalance1

// bi-
ramp 1 2 84.00 0.98 293.7 293.7 // D4
ramp 1 2 84.98 0.02 293.7 246.9 // D4-B3
ramp 1 1 84.00 0.30 6000 6000 // Amplitude
ramp 1 3 84.00 0.30 800 800 // Formant 1
ramp 1 4 84.00 0.30 1228 1228 // Formant 2
ramp 1 5 84.00 0.30 2500 2500 // Formant 3
ramp 1 1 84.30 0.30 6000 6000 // Amplitude
ramp 1 3 84.30 0.30 800 344 // Formant 1
ramp 1 4 84.30 0.30 1228 2170 // Formant 2
ramp 1 5 84.30 0.30 2500 2660 // Formant 3
ramp 1 1 84.60 0.40 6000 6000 // Amplitude
ramp 1 3 84.60 0.40 344 344 // Formant 1
ramp 1 4 84.60 0.40 2170 2170 // Formant 2
ramp 1 5 84.60 0.40 2660 2660 // Formant 3
note 158 1 101 0 84.00 1.00 1 0.03 // Buzz1
// -cy-
ramp 1 2 85.00 0.98 246.9 246.9 // B3
ramp 1 2 85.98 0.02 246.9 196 // B3-G3
ramp 1 1 85.00 0.95 6000 6000 // Amplitude
ramp 1 3 85.00 0.95 344 344 // Formant 1
ramp 1 4 85.00 0.95 2170 2170 // Formant 2
ramp 1 5 85.00 0.95 2660 2660 // Formant 3
ramp 1 1 85.95 0.05 6000 6000 // Amplitude
ramp 1 3 85.95 0.05 344 414 // Formant 1
ramp 1 4 85.95 0.05 2170 1516 // Formant 2
ramp 1 5 85.95 0.05 2660 2500 // Formant 3
note 159 1 101 158 85.00 1.00 1 0 // Buzz1
// -cle
ramp 1 2 86.00 1.00 196 196 // G3
ramp 1 1 86.00 1.00 6000 6000 // Amplitude
ramp 1 3 86.00 1.00 414 414 // Formant 1
ramp 1 4 86.00 1.00 1516 1516 // Formant 2
ramp 1 5 86.00 1.00 2500 2500 // Formant 3
note 160 1 101 159 86.00 0.90 1 0 // Buzz1
note 161 1 119 0 84.00 2.90 0.1 // RMS1
note 162 1 121 0 84.00 2.90 // Mouth1
note 163 1 199 0 84.00 2.90 // Rebalance1

// built
ramp 1 2 87.00 2.00 220 220 // A3
ramp 1 1 87.00 2.00 6000 6000 // Amplitude
ramp 1 3 87.00 2.00 344 344 // Formant 1
ramp 1 4 87.00 2.00 2170 2170 // Formant 2
ramp 1 5 87.00 2.00 2660 2660 // Formant 3
note 164 1 101 0 87.00 1.90 1 0.03 // Buzz1
note 165 1 119 0 87.00 1.90 0.1 // RMS1
note 166 1 121 0 87.00 1.90 // Mouth1
note 167 1 199 0 87.00 1.90 // Rebalance1

// for
ramp 1 2 89.00 1.00 146.8 146.8 // D3
ramp 1 1 89.00 1.00 6000 6000 // Amplitude
ramp 1 3 89.00 1.00 414 414 // Formant 1
ramp 1 4 89.00 1.00 1516 1516 // Formant 2
ramp 1 5 89.00 1.00 2500 2500 // Formant 3
note 168 1 101 0 89.00 0.90 1 0.03 // Buzz1
note 169 1 119 0 89.00 0.90 0.1 // RMS1
note 170 1 121 0 89.00 0.90 // Mouth1
note 171 1 199 0 89.00 0.90 // Rebalance1

// two
ramp 1 2 90.00 6.00 196 196 // G3
ramp 1 1 90.00 6.00 6000 6000 // Amplitude
ramp 1 3 90.00 6.00 277 277 // Formant 1
ramp 1 4 90.00 6.00 553 553 // Formant 2
ramp 1 5 90.00 6.00 2420 2420 // Formant 3
note 172 1 101 0 90.00 3.00 1 0.03 // Buzz1
note 173 1 119 0 90.00 3.00 0.1 // RMS1
note 174 1 121 0 90.00 3.00 // Mouth1
note 175 1 199 0 90.00 3.00 // Rebalance1

// (rest)

end 96.0
Listing 4: Notelist for “Daisy Bell”, Iteration #2. Vowels and diphthongs are highlighted in green. To hear a realization, click here.

“Daisy Bell” Iteration #2

Listing 4 presents the second-iteration synthesis of “Daisy Bell”. Notice that the notes are exactly the same as in Listing 2, as are the ramp statements for the “Frequency” contour (contour #2 of voice #1). Likewise the “Amplitude” contour (contour #1 of voice #1) behaves as before, holding constant at 6000 throughout the duration of the song.

What's new here is the active role of the three vocal-resonance contours, “Formant 1” (contour #3 of voice #1), “Formant 2” (contour #4 of voice #1), and “Formant 3” (contour #5 of voice #1). These indications are color-coded in green. Notice that all formant data passes to the instruments via ramp statements rather than via note parameters.

Encoding the formant data for a simple vowel requires a block of three ramp statements, all occupying the same time segment (that is, all sharing the same start time and duration). See, for example, the three ramps starting at time 3.00 and holding for 2 seconds. These produce the vowel i, which Listing 3 specifies as the appropriate vowel for the second syllable of “daisy”.
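A block like this is regular enough to generate mechanically. The sketch below assumes, from the pattern visible in Listing 4, that a ramp statement's fields are voice, contour, start time, duration, origin value, and goal value; the helper name `vowel_ramps` is hypothetical.

```python
def vowel_ramps(start, duration, formants, voice=1):
    """Return three ramp statements holding the given formants steady."""
    lines = []
    for index, hz in enumerate(formants):
        contour = 3 + index  # contours #3, #4, #5 carry Formants 1, 2, 3
        lines.append(
            f"ramp {voice} {contour} {start:.2f} {duration:.2f} "
            f"{hz} {hz} // Formant {index + 1}"
        )
    return lines


if __name__ == "__main__":
    # The vowel i for the second syllable of "daisy".
    for line in vowel_ramps(3.0, 2.0, (277, 2208, 3079)):
        print(line)
```

The accompanying “Amplitude” and “Frequency” ramps are left out of this sketch and would still need to be written separately.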

Encoding the formant data for a diphthong requires three blocks, each encompassing three ramp statements. See, for example, the three blocks starting at time 0.00. The first block starts at time 0.00 and holds for 200 msec; the second block starts at time 0.20 and transitions for 300 msec; the third block starts at time 0.50 and holds for 2.5 seconds. These three blocks produce the long-A diphthong, gliding from e (414, 2065, and 2570 Hz) to ɪ (344, 2170, and 2660 Hz), which Listing 3 specifies for the first syllable of “daisy”. The outer blocks are steady-state, while the inner block effects the diphthong proper.
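The three-block pattern can likewise be generated from a source vowel, a target vowel, and three durations. As before, the ramp field order is inferred from Listing 4, and the helper names `formant_block` and `diphthong_ramps` are hypothetical; amplitude and frequency ramps are omitted.

```python
def formant_block(start, duration, origin, goal, voice=1):
    """Three ramps moving Formants 1-3 from origin to goal values."""
    lines = []
    for index, (hz_from, hz_to) in enumerate(zip(origin, goal)):
        lines.append(
            f"ramp {voice} {3 + index} {start:.2f} {duration:.2f} "
            f"{hz_from} {hz_to} // Formant {index + 1}"
        )
    return lines


def diphthong_ramps(start, source, target, hold, glide, tail, voice=1):
    """Hold the source vowel, glide to the target, then hold the target."""
    blocks = [
        formant_block(start, hold, source, source, voice),
        formant_block(start + hold, glide, source, target, voice),
        formant_block(start + hold + glide, tail, target, target, voice),
    ]
    return [line for block in blocks for line in block]


if __name__ == "__main__":
    # First syllable of "daisy": e held 0.2 s, 0.3 s glide, ɪ held 2.5 s.
    e_vowel = (414, 2065, 2570)
    i_vowel = (344, 2170, 2660)
    for line in diphthong_ramps(0.0, e_vowel, i_vowel, 0.2, 0.3, 2.5):
        print(line)
```

Shortening the glide argument gives quicker transitions such as the 0.15-second ones used for “cra-” in the listing.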

Next topic: Glides and Liquids

© Charles Ames Page created: 2014-02-20 Last updated: 2017-06-12