What are Formants?
Formants are the characteristic amplitude peaks in the spectrum of a resonant sound source. They arise when fixed resonant chambers are excited, and they are the most significant contributor to the timbre of tonal instruments. In speech, they are present below 5000 Hz, and they are usually “inharmonic,” meaning their frequencies are not integer multiples of each other.
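To make “inharmonic” concrete, here is a minimal sketch that checks whether a set of frequencies forms a harmonic series. The helper name `is_harmonic_series` and the tolerance are assumptions chosen for illustration; the “oo” formant values are the approximate figures quoted later in this article.

```python
def is_harmonic_series(freqs_hz, tolerance=0.03):
    """Return True if every frequency is close to an integer multiple of the lowest.

    tolerance is the allowed distance of each ratio from a whole number
    (a hypothetical threshold, picked only for this illustration).
    """
    base = min(freqs_hz)
    for f in freqs_hz:
        ratio = f / base
        if abs(ratio - round(ratio)) > tolerance:
            return False
    return True


# Approximate formants for "oo" as quoted in this article (F1, F2, F3).
oo_formants = [300, 870, 2410]

# The first five harmonics of a 110 Hz tone, for comparison.
tone_harmonics = [110 * n for n in range(1, 6)]

print(is_harmonic_series(oo_formants))     # False: 870 / 300 = 2.9, not a whole number
print(is_harmonic_series(tone_harmonics))  # True: exact multiples of 110 Hz
```

The harmonics of a played note move with its pitch, while the formants stay where the resonant chamber puts them, which is why the two sets of frequencies rarely line up.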
Guitars with even the slightest differences in body dimensions sound different because their characteristic resonances (formants) are fixed at different frequencies. For this reason, if you were to look at the exact same performance on two different guitars through separate spectrum analyzers, you would notice that some resonant peaks sit in different positions.
Formants are the reason the timbre of woodwind and brass instruments changes slightly depending on the player’s valve position. Different positions produce different sets of formants because the valves modify the dimensions of the resonant chamber, in the same way that reshaping our vocal tract (with the tongue, jaw, and lips) produces different vowel sounds. Without the ability to change the dimensions of their resonant vocal and nasal cavities, humans could make only one resonant sound, one vowel.
To illustrate, below are graphical analyses from the MIT Press of the location of amplitude peaks over time for some English words. The red areas indicate the areas with the most energy.
Producing vowel sounds relies on resonance far more than producing consonants does; think of vowels as a flute and consonants as drums. Each vowel sound is characterized by a different set of formants, and it can be imposed synthetically on a complex sound using several resonant filters. The graph from the “Subtractive Synthesis Concepts” chapter in Ed Doering’s Musical Signal Processing with LabVIEW lays out the approximate formant frequencies for vowels. To hear this for yourself, try the following:
- Open a synthesizer that is capable of producing a sawtooth wave.
- Send the output into an EQ that allows you to use up to three resonant peak filters at once.
- Turn the resonance or “Q” to the highest value possible.
- Boost the frequencies F1, F2, and F3 for one of the vowel sounds in the chart above.
- Adjust the amplitude of each peak and listen for where the vowel sound jumps out.
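The steps above can also be sketched in code. The following is a rough, standard-library-only illustration, not a description of how any particular synthesizer or EQ works: it generates a naive sawtooth and runs it through three peaking filters in series. The filter coefficients use the widely circulated peaking-EQ formulas from Robert Bristow-Johnson's Audio EQ Cookbook, which is a choice made here rather than something this article specifies. The 100 Hz pitch, the Q of 10, the 18 dB boost, and all function names are assumptions for the example.

```python
import math

SAMPLE_RATE = 44100  # samples per second (assumed)


def sawtooth(freq_hz, seconds, sample_rate=SAMPLE_RATE):
    """Naive (not band-limited) sawtooth with values in [-1, 1)."""
    count = int(seconds * sample_rate)
    return [2.0 * ((i * freq_hz / sample_rate) % 1.0) - 1.0 for i in range(count)]


def peaking_coeffs(center_hz, q, gain_db, sample_rate=SAMPLE_RATE):
    """Biquad peaking-EQ coefficients (Audio EQ Cookbook form), normalized so a[0] == 1."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * amp, -2 * math.cos(w0), 1 - alpha * amp]
    a = [1 + alpha / amp, -2 * math.cos(w0), 1 - alpha / amp]
    return [x / a[0] for x in b], [x / a[0] for x in a]


def biquad(samples, b, a):
    """Apply one second-order filter section (direct form I)."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def impose_vowel(samples, formants_hz, q=10.0, gain_db=18.0):
    """Boost each formant frequency with its own peaking filter, in series."""
    for center in formants_hz:
        b, a = peaking_coeffs(center, q, gain_db)
        samples = biquad(samples, b, a)
    return samples


def level_at(samples, freq_hz, sample_rate=SAMPLE_RATE):
    """Strength of a single frequency component (one-bin DFT)."""
    re = im = 0.0
    for i, s in enumerate(samples):
        phase = 2 * math.pi * freq_hz * i / sample_rate
        re += s * math.cos(phase)
        im += s * math.sin(phase)
    return math.hypot(re, im) / len(samples)


# Approximate "oo" formants as quoted in this article.
OO_FORMANTS = [300, 870, 2410]

dry = sawtooth(100, 0.25)             # 100 Hz, so 300 Hz is exactly harmonic 3
wet = impose_vowel(dry, OO_FORMANTS)  # same signal with the three peaks applied

boost = level_at(wet, 300) / level_at(dry, 300)
print(f"Level at 300 Hz changed by a factor of {boost:.1f}")
```

The filtered signal is louder than the source near each formant, so in practice you would scale it down before writing it to an audio file. Raising `q` narrows each peak, which mirrors the “turn Q to the highest value” step.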
To conclude, here are two examples of this technique. The unfiltered sawtooth is played first for a moment. In the next bar, the EQ is enabled.
Example “ah” as in “hot”
Example “oo” as in “boot”
Note: To make vowel sounds using multiple resonant filters, the sound source needs to contain frequency content in the range of the formants you choose to impose. For example, to hear F1, F2, and F3 for the vowel “oo”, there must be energy near 300 Hz, 870 Hz, and 2410 Hz. A sawtooth wave is a safe source because it contains every harmonic.
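As a small illustration of why a sawtooth is a safe source, the sketch below finds the sawtooth harmonic that lands closest to each “oo” formant. The 110 Hz fundamental and the function names are assumptions for the example; the 1/n amplitude rule is the standard result for an ideal sawtooth.

```python
def nearest_harmonic(fundamental_hz, target_hz):
    """Return (harmonic number, frequency) of the harmonic closest to target_hz."""
    n = max(1, round(target_hz / fundamental_hz))
    return n, n * fundamental_hz


def sawtooth_amplitude(n):
    """Relative amplitude of the n-th harmonic of an ideal sawtooth (falls off as 1/n)."""
    return 1.0 / n


FUNDAMENTAL_HZ = 110  # assumed pitch of the source note

for formant_hz in [300, 870, 2410]:
    n, freq = nearest_harmonic(FUNDAMENTAL_HZ, formant_hz)
    print(f"{formant_hz} Hz formant: harmonic {n} at {freq} Hz, "
          f"relative amplitude {sawtooth_amplitude(n):.3f}")
```

Because every harmonic number is present, there is always a partial within half the fundamental of any formant. A pure sine wave at the same pitch has only its fundamental, so the filters centered on the formants would have almost nothing to boost.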
Doering. Formant Vowel Synthesis. November 2007. http://firstname.lastname@example.org:4/Formant-Vowel-Synthesis
Huckvale. UCL Psychology and Language Sciences. Resources in speech, hearing, and phonetics. July 2015. http://www.phon.ucl.ac.uk/home/johnm/sid/formant.htm
Joe Wolfe. Music Acoustics. University of New South Wales. http://newt.phys.unsw.edu.au/jw/formant.html
Nave. Vocal tract resonance. http://hyperphysics.phy-astr.gsu.edu/hbase/music/vocres.html
Schnupp. King. Nelken. Auditory Neuroscience. Formants and Harmonics in Spoken Vowels. MIT Press. https://auditoryneuroscience.com/topics/formants-and-harmonics-spoken-vowels