sotonDH Small Grants: Investigation into Synthesizer Parameter Mapping and Interaction for Sound Design Purposes – Post 2 by Darrell Gibson
Research Group: Composition and Music Technology
In the previous blog post three research questions were presented concerning how synthesizers are used for sound design: First, is there a way that sound design can be performed without an in-depth knowledge of the underlying synthesis technique? Second, can a large number of synthesizer parameters be controlled intuitively with a set of interface controls that relate to the sounds themselves? Finally, can multiple sets of complex synthesizer parameters be controlled and explored simultaneously?
Over the years there has been significant research in the area of synthesizer programming, which can be separated into two foci. First, improving the programming interface so that the details of the underlying techniques are not visible, but can be controlled. Second, the automatic programming of a synthesizer to try to replicate target sounds.
Synthesizer Programming Interfaces
As previously mentioned, the programming interface that a synthesizer presents to the user is often a direct mapping of the synthesis parameters rather than related to the output sound, and follows directly from original hardware synthesizers such as the Moog Modular. Various proposed solutions examine the mapping of the synthesizer parameters between the synthesis engine and the programming interface to see if the relationship can be more intuitive and less technical.
Interpolated Parameter Mapping
Several researchers have developed systems that interpolate between parameters, or sets of parameters, via a user interface. Work in this area was first completed at GRM in the 1970s and 1980s, where the SYTER system was developed. This system used an X-Y graphical interface to control the relationship between different parameters of the synthesizer engine. The X-Y positions of points on the graphical interface were mapped to the parameters using a gravitational model, and the user could explore different interpolations between the parameters. The number of parameters controlled on the X-Y plane can be expanded by defining different origins for the position calculations of each parameter. These systems use the gravity model to define a circle of influence for the interpolation function. A later system called Interpolator used a light model, where an angle could be specified to define an interpolation zone. This gives extra flexibility, and when an angle of 360° is used the traditional circular model is recovered. Interpolation techniques were expanded in the implementation of Metasurface, which allows multiple origins to be defined in an X-Y plane and then applies a spatial interpolation technique called Natural Neighbour Interpolation. This creates a polygon for each parameter and gives each a weighting value that corresponds to the area taken from adjacent polygons, resulting in smooth control of the assigned parameters. Other geometric manipulations of the parameter space have been suggested, resulting in the implementation of a multi-layer mapping system that has been used to map both geometric position and gestures to the control of the sound via a drawing tablet. The principle of using an X-Y plane has also been advanced with multi-point touch screen interfaces, which allow the relationship between multiple points to be used to map advanced multi-touch gestures.
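To make the gravitational idea concrete, here is a minimal Python sketch of interpolating between parameter sets placed on an X-Y plane. It is not the SYTER algorithm itself: the inverse-distance weighting, the `power` exponent, the function name `gravity_interpolate` and the parameter names `cutoff_hz` and `resonance` are all assumptions chosen for illustration.

```python
import math

def gravity_interpolate(cursor, presets, power=2.0):
    """Blend preset parameter sets using inverse-distance ("gravity") weights.

    cursor:  (x, y) position of the controller on the plane
    presets: list of ((x, y), {param_name: value}) pairs
    power:   assumed falloff exponent; 2.0 gives an inverse-square pull
    """
    weights = []
    for (px, py), params in presets:
        dist = math.hypot(cursor[0] - px, cursor[1] - py)
        if dist == 0.0:
            # The cursor sits exactly on a preset: return that preset unchanged.
            return dict(params)
        weights.append(1.0 / dist ** power)

    total = sum(weights)
    names = presets[0][1].keys()
    # Each output parameter is the weighted average of that parameter
    # across all presets.
    return {
        name: sum(w * params[name] for w, (_, params) in zip(weights, presets)) / total
        for name in names
    }

# Three hypothetical presets placed at the corners of a triangle.
presets = [
    ((0.0, 0.0), {"cutoff_hz": 200.0, "resonance": 0.1}),
    ((1.0, 0.0), {"cutoff_hz": 4000.0, "resonance": 0.7}),
    ((0.0, 1.0), {"cutoff_hz": 1000.0, "resonance": 0.4}),
]
blend = gravity_interpolate((0.5, 0.0), presets)
```

Moving the cursor closer to one preset increases that preset's weight, so its parameter values dominate the blend. A circle of influence could be added by giving a preset zero weight beyond a chosen radius.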
Timbre Space and Perceptual Mappings
Although the interpolation systems examined in the previous section do give a more intuitive way of managing complex synthesizer programming control structures, they do not necessarily relate to the perception of the sound produced. In 1975 Grey defined “Timbre Space” based on a 3D space, using a three-way multidimensional scaling algorithm called INDSCAL to position 16 timbres in the space. The first axis is interpreted as the spectral energy of the sound, the second as the temporal behavior of the upper harmonics in the attack stage, and the third as the spectral fluctuation, which relates to the articulatory nature of the instrument. These principles were expanded on in 1979 by Wessel, from IRCAM, who showed that a 2D timbre space could be used to control the mapping of synthesizer parameters. Later, in the mid 1990s, a system called the Intuitive Sound Editing Environment (ISEE), developed by Vertegaal at the University of Bradford, used a hierarchical structure for timbre space based on a taxonomy of musical instruments. This allowed changes in timbre that require numerous parameter changes to be generated by relocating the sound within the timbre space hierarchy.
Although not directly related to timbre space, in 1996 Rolland, at the University of Paris, developed a system for capturing the expertise of sound designers programming a synthesizer by using a model of knowledge representation. This was not based on the attributes of the sound structures themselves, but on the manipulations or variations that can be applied to them. These transformation procedures were then defined using adjective terms such as “brighter” or “warmer”. Sounds are therefore classified according to the transformations that can be applied to them, rather than the properties of the sounds themselves. This resulted in a hierarchical network of sounds and connections between them, where the connections define the transformations required to move from one sound to another. Seawave, developed at Michigan State University in 1994 by Ethington, was a similar system that allowed an initial synthesizer patch to be modified using controls that are specified using timbral adjectives. More recently, in 2006 Gounaropoulos at the University of Kent produced a system that used a list of adjectives to provide an input, which was mapped via a trained neural network. The user could then adjust the sound using controls allocated to the timbral adjectives. Aramaki in 2007 then showed that a similar mapping process can be applied to percussive sounds, based on different materials and the type of impact.
Nicol in 2005 was the first to propose the use of multiple timbre spaces, with one generated from listening tests and another drawn from acoustic parameters. In a comprehensive body of work Seago expanded this idea and has recently presented a synthesizer interface that allows the design and exploration of a timbre space, using a system of weighted centroid localization.
Work is continuing on generating more accurate representations of perceptual adjectives, and hence definitions of timbre space, recent examples being the studies by Zacharakis et al., Burred et al. and Loureiro et al. listed in the references. Potentially this will result in a more controllable mapping between a synthesis engine and timbre space.
One of the unique features of synthesizer technology compared with traditional instruments is that it presents two interfaces to the user, one for programming the sound generator and the other for the actual musical input. During a performance, however, the user can potentially interact with either or both interfaces. Therefore, the mapping between these two interfaces will affect the expressiveness of the synthesizer as an instrument. With both interpolated parameter mapping and timbre space mapping systems, it is the quantities mapped to the performance interface that determine this expressiveness. As a result, the expressive control of both systems has been considered extensively.
Winkler in the mid 1990s considered the mapping of different body movements as expressive gesture control of Interactive Computer Music. Although the mapping to a synthesis engine was not considered, it demonstrated the notion of capturing movements for the control of performer expression. Along similar lines, in 2001 Camurri presented a framework for capturing and interpreting movement gestures. This framework is built around the notion that a “multi-layer” system is required to take physical input signals captured from movement sensors and map them to interpreted gestures. The framework allows different formats for the input signals, such as time-variant sampled audio signals, sampled signals from tactile or infra-red sensors, signals from haptic devices, or events such as MIDI messages or low-level data frames in video. Around the same time, Arfib highlighted not only the need for gestural control, but also the need for a visual feedback mechanism so that the performer can learn to use the expressiveness available. This work was then expanded with a multi-layer mapping strategy based on the definition of a “perception space” that allowed multi-modal feedback, and specific examples are given in a subsequent paper.
In 2003 Hunt defined a “many-to-one” mapping that uses fewer layers, but claims to offer more expressiveness. Then in 2004, Wanderley reviewed gesture control of sound synthesis and presented simulated results of the various constituent parts of a Digital Musical Instrument (DMI) that are mapped to digital audio effects and computer synthesized sounds. Next, adaptive control was added and trajectories were used as the input stimulus.
Work currently being undertaken by Caramiaux is looking at synthesizing sounds that have a direct similarity to the gestures used to generate them. In this way, specific sounds can be accessed with specific gestures in an intuitive way.
Being able to morph a synthesizer between multiple sounds in real-time is not a new concept, but in most cases it is created as a simple cross-fade between two or more different patches. Recently some more complex ways of morphing a synthesizer between different sounds have been proposed, where points in the parameter space representing desirable sounds can be controlled. In this way a path or trajectory can be defined in the parameter space, so it is possible to morph between the multiple sets of parameters in a specific order.
Ssynth was developed by Verfaille in 2006 at McGill University and is a real-time additive synthesizer that allows “additive frames” to be arranged as a 3-D mesh. Trajectories can then be used to morph between different sounds. Also in 2006 Pendharkar suggested another form of parameterized morphing, where desired parameters can be selected from the parameter space and, using a control signal, interpolation can then be performed between multiple sets of parameters. This allows points in the parameter space representing desirable sounds to be parameterized with high-level controls. The choice of end points of the morph and the extent of the morph can be mapped to synthesis parameters. Aramaki also used a similar process in 2007 to morph between different sounds (materials) in a percussive synthesizer.
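The idea of morphing along a trajectory through several parameter sets can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the trajectory is treated as piecewise-linear, a single control value in [0, 1] selects the position along it, and the function name `morph_along_path` and the parameter names are hypothetical rather than taken from any of the systems above.

```python
def morph_along_path(waypoints, position):
    """Piecewise-linear morph through an ordered list of parameter sets.

    waypoints: ordered list of dicts that share the same parameter names
    position:  control value in [0, 1]; 0 is the first waypoint, 1 the last
    """
    if len(waypoints) < 2:
        raise ValueError("need at least two waypoints to morph")

    # Clamp the control value, then find which segment of the path it is on.
    position = min(max(position, 0.0), 1.0)
    scaled = position * (len(waypoints) - 1)
    index = min(int(scaled), len(waypoints) - 2)
    frac = scaled - index

    start, end = waypoints[index], waypoints[index + 1]
    return {name: (1.0 - frac) * start[name] + frac * end[name] for name in start}

# Three hypothetical patches visited in a fixed order.
path = [
    {"cutoff_hz": 100.0, "detune": 0.0},
    {"cutoff_hz": 500.0, "detune": 0.2},
    {"cutoff_hz": 300.0, "detune": 1.0},
]
halfway_to_second = morph_along_path(path, 0.25)
```

Unlike a cross-fade of audio outputs, the interpolation here happens in the parameter space, so the synthesizer always plays a single patch whose settings lie somewhere on the path.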
In 2010 Wyse proposed a system called Instrumentalizer that allows synthesis algorithms to be controlled with traditional instrument controls for things such as pitch and expression. The system then maps these controls to the synthesis parameters and allows morphing to permit typical instrumental expressions. Another example, presented by Brandtsegg in 2011, is a modulation matrix that enables interpolation between different tables of matrix coefficients. This permits the morphing of the modulator mappings, allowing the sound produced to be morphed.
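As a rough illustration of interpolating between tables of modulation matrix coefficients, the following Python sketch blends two coefficient tables and applies the result to a set of modulator values. The layout (one row per synthesis parameter, one column per modulator) and the names `blend_matrices` and `apply_matrix` are assumptions made for this example, not a description of Brandtsegg's implementation.

```python
def blend_matrices(matrix_a, matrix_b, amount):
    """Linearly interpolate two coefficient tables of the same shape.

    amount = 0.0 returns matrix_a, amount = 1.0 returns matrix_b.
    """
    return [
        [(1.0 - amount) * a + amount * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(matrix_a, matrix_b)
    ]

def apply_matrix(matrix, modulators):
    """Compute each synthesis parameter as a weighted sum of the modulators.

    Assumed layout: one row per parameter, one column per modulator.
    """
    return [sum(c * m for c, m in zip(row, modulators)) for row in matrix]

# Two hypothetical mappings: in the first, modulator 0 drives parameter 0
# and modulator 1 drives parameter 1; the second swaps those routings.
straight = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
modulators = [0.2, 0.8]

halfway = apply_matrix(blend_matrices(straight, swapped, 0.5), modulators)
```

Here it is the mapping itself that is morphed, so the same modulator values produce gradually different parameter values as `amount` changes.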
An alternative mechanism that can be used to program synthesizers exploits resynthesis techniques. The basic premise is that a “target” sound is supplied and the system attempts to replicate the target sound with a synthesis engine. These techniques are used either for recreating sounds without having to understand the synthesis engine or to populate a search space for sound design. Resynthesis approaches can be separated into two categories. One analyses the target sound and programs the synthesis engine based directly on the results of that analysis. The other uses Artificial Intelligence (AI) methods to program the synthesizer based on analysis of the supplied target.
The idea of analysis and resynthesis is not new and has been implemented many times using frequency analysis of the target sound and additive synthesis to build a representation of the target spectrum. A popular technique for the implementation of the analysis stage has been the use of a Phase Vocoder, although other techniques do exist. Over the years this basic premise has been refined many times. A recent example of this was presented by Kreutzer in 2008, who proposed an efficient additive-based resynthesis engine that claims to provide greater flexibility for the user and to reduce the number of synthesis parameters compared to traditional methods. In addition to the work being done to refine the synthesis process, others have also examined how the process is driven. An example from 2008 is PerceptSynth, which is controlled with perceptually relevant high-level features, such as pitch and loudness. In 2009 Sethares also presented tools for manipulation of the spectral representations of sounds between analysis and resynthesis. This then gives a mechanism to dynamically change the tonality of the sound and create morphing effects.
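The basic analysis and additive resynthesis loop can be illustrated with a small Python sketch. It is deliberately simplified and makes several assumptions: it analyses a single frame with a naive discrete Fourier transform rather than a phase vocoder, it assumes the partials are stationary and fall exactly on analysis bins, and the function names `analyse_partials` and `resynthesise` are hypothetical.

```python
import cmath
import math

def analyse_partials(signal, num_partials):
    """Return the strongest (bin, amplitude, phase) triples of one frame.

    Uses a naive DFT, which is slow but needs only the standard library.
    """
    n = len(signal)
    partials = []
    for k in range(1, n // 2):  # skip DC and Nyquist for simplicity
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        # For a real cosine on bin k, the amplitude is 2|X[k]|/n.
        partials.append((k, 2.0 * abs(coeff) / n, cmath.phase(coeff)))
    partials.sort(key=lambda p: p[1], reverse=True)
    return partials[:num_partials]

def resynthesise(partials, length):
    """Additive synthesis: sum one cosine oscillator per analysed partial."""
    return [
        sum(amp * math.cos(2.0 * math.pi * k * i / length + phase)
            for k, amp, phase in partials)
        for i in range(length)
    ]

# Hypothetical target: two partials on bins 4 and 9 of a 128-sample frame.
N = 128
target = [
    1.0 * math.cos(2.0 * math.pi * 4 * i / N)
    + 0.5 * math.cos(2.0 * math.pi * 9 * i / N + 0.3)
    for i in range(N)
]
partials = analyse_partials(target, num_partials=2)
copy = resynthesise(partials, N)
max_error = max(abs(a - b) for a, b in zip(target, copy))
```

Between the two functions, the list of partials is an editable representation of the sound: scaling amplitudes or shifting bins before resynthesis is the kind of manipulation that spectral tools build on.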
Using additive resynthesis principles, TAPESTREA, created by Misra in 2006, is a complete sound design framework that facilitates the synthesis of new sounds from supplied audio recordings, through interactive analysis, transformation and resynthesis. It then allows complex audio scenes to be constructed from the resynthesized results, using a graphical interface. Klingbeil also showed in 2009, with a resynthesis system called SPEAR, that the principles can be used for compositional purposes.
Over the last few years much work has been published on the analysis of acoustic audio features, of the sort used in music information retrieval and other sound analysis applications. These techniques are now being applied to a resynthesis paradigm. In this manner Hoffman, in 2006, presented a framework for synthesizing audio with sets of quantifiable acoustic features that have been extracted from supplied audio content. Although not technically resynthesis, similar analysis has been applied to a corpus-based concatenative synthesis technique by Schwarz in 2008 called CataRT. This allows user-driven parameter settings to be generated based on these forms of audio analysis.
Artificial Intelligence Techniques
Artificial Intelligence has become increasingly popular in the area of synthesizer programming. An early knowledge-based system by Miranda, 1995, called ISSD (Intelligent System for Sound Design), represented sounds in terms of their attributes (brightness, openness, compactness, acuteness, etc.) and how these attributes map to subtractive synthesis parameters for formants. In 1998, Miranda further expanded this idea, implemented a system called ARTIST and applied it to different synthesis algorithms. More recently there has been much work on the use of Evolutionary Computation (EC) techniques for the programming of synthesizers. In 2001 Garcia developed a system where Genetic Programming (GP) was used to design a population of synthesis topologies, consisting of oscillators, filters, etc. The sounds generated by individuals in the population were then evaluated to establish how closely they matched the target. Another AI technique that has been employed for synthesizer programming is the use of Genetic Algorithms (GA). These have been used to search large parameter spaces for target sounds, based on user interactions. Then in 2003 Johnson refined this so that the new population was generated based on fitness proportionate selection, where the higher the fitness rating given to an individual, the more likely it is to be selected as a parent. GAs have also been used with fuzzy logic to allow the user to make explicit associations between twelve visual metaphors presented by a particular sound. Between 2005 and 2008, McDermott proposed a new interface design for interactive EC, which allows faster evaluation of large numbers of individuals from the population. As well as these interactive systems, in 2008 Yee-King presented an unsupervised synthesizer programmer, called SynthBot. This was able to automatically find the subtractive synthesis parameter settings necessary to produce a sound similar to a given target, using a GA.
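The fitness proportionate selection mentioned above can be sketched as a small genetic algorithm in Python. This is a toy illustration, not a reconstruction of any of the cited systems. It assumes that a patch is a list of at least two parameters normalized to [0, 1], and that fitness is computed by comparing parameters directly with a known target. The cited interactive systems instead use ratings supplied by the user, and an unsupervised programmer such as SynthBot compares the resulting sound with the target. The names `roulette_select`, `fitness` and `evolve` are hypothetical.

```python
import random

def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0.0:
        return rng.choice(population)
    pick = rng.uniform(0.0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if fit > 0.0 and pick <= running:
            return individual
    return population[-1]

def fitness(patch, target):
    """Assumed fitness: 1 for a perfect parameter match, falling towards 0."""
    error = sum((p - t) ** 2 for p, t in zip(patch, target))
    return 1.0 / (1.0 + error)

def evolve(target, pop_size=30, generations=40, mutation=0.05, seed=0):
    """Evolve parameter lists towards the target; needs len(target) >= 2."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in target] for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(p, target) for p in population]
        best = population[scores.index(max(scores))]
        children = [list(best)]  # elitism: carry the best patch over unchanged
        while len(children) < pop_size:
            mother = roulette_select(population, scores, rng)
            father = roulette_select(population, scores, rng)
            cut = rng.randrange(1, len(target))  # single-point crossover
            child = mother[:cut] + father[cut:]
            # Gaussian mutation, clamped to the normalized parameter range.
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, mutation)))
                     for g in child]
            children.append(child)
        population = children
    scores = [fitness(p, target) for p in population]
    return population[scores.index(max(scores))]

target_patch = [0.2, 0.9, 0.5, 0.1]  # hypothetical normalized settings
best_patch = evolve(target_patch)
```

Because the best individual is always copied into the next generation, the best fitness in the population can never decrease from one generation to the next.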
In addition, in a recent study by Dykiert, 2011, GAs have been suggested as a mechanism to reduce the size of the parameter search space. Finally, it should be noted that, as well as synthesizer programming, Miranda showed in 2004 how EC can be applied to the compositional process.
References
- Jenkins, M. Analog Synthesizers: Understanding, Performing, Buying. Focal Press, 2007.
- Allouis, J., and Bernier, J. Y. The SYTER project: Sound processor design and software overview. In Proceedings of the 1982 International Computer Music Conference (ICMC), 232–240, 1982.
- Geslin, Y., Digital Sound and Music Transformation Environments: A Twenty-year Experiment at the “Groupe de Recherches Musicales”. Journal of New Music Research, Volume 31, Issue 2, 2002.
- Goudeseune, C., Interpolated Mappings for Musical Instruments. Organised Sound, 7(2):85–96, 2002.
- Spain, M., and R. Polfreman, Interpolator: a two-dimensional graphical interpolation system for the simultaneous control of digital signal processing parameters. Organised Sound, Volume 6, Issue 2, 2001.
- Bencina, R., The Metasurface: Applying Natural Neighbor Interpolation to Two-to-Many Mappings. In Proceedings of 2005 Conference on New Interfaces for Musical Expression (NIME), pages 101–104, 2005.
- Van Nort, D., M. Wanderley and Philippe Depalle. On the Choice of Mappings based on Geometric Properties. Proceedings of the 2004 International Conference on New Interfaces for Musical Expression (NIME 04), Hamamatsu, Japan, June 3-5, 2004.
- Van Nort, D, and M. Wanderley, Control Strategies for Navigation of Complex Sonic Spaces. Proceedings of the International Conference on New Interfaces for Musical Expression 2007 (NIME-07), New York, NY, June 2007.
- Schlei, K., Relationship-Based Instrument Mapping of Multi-Point Data Streams Using a Trackpad Interface. Proceedings of the 2010 Conference on New Interfaces for Musical Expression (NIME 2010), Sydney, Australia, 15-18, June 2010.
- Grey, J., An Exploration of Musical Timbre. PhD thesis, Department of Psychology, Stanford University.
- Wessel, D., Timbre Space as a Musical Control Structure. Computer Music Journal, Vol. 3, no. 2, pages 45–52, 1979.
- Vertegaal, R., and E. Bonis, ISEE: An Intuitive Sound Editing Environment. Computer Music Journal, Volume 18, Issue 2, pages 21-29, 1994.
- Vertegaal, R., and B. Eaglestone, Comparison of input devices in an ISEE direct timbre manipulation task. Interacting with Computers, Butterworth-Heinemann, 1996.
- Rolland P-Y., and Pachet F., A Framework for Representing Knowledge about Synthesizer Programming, Computer Music Journal, Volume 20, Issue 3, pages 47-58, MIT Press, 1996.
- Ethington, R. and B. Punch, SeaWave: A System for Musical Timbre Description. Computer Music Journal, Volume 18, Issue 1, pages 30-39, 1994.
- Gounaropoulos, A., and C. Johnson, Synthesising Timbres and Timbre-Changes from Adjectives/Adverbs. Applications of Evolutionary Computing, Volume 3907 of Lecture Notes in Computer Science, pages 664-675, 2006.
- Aramaki M., Kronland-Martinet R., Voinier Th. and S. Ystad, Timbre Control Of Real-Time Percussive Synthesizer. Proceedings of 19th International Congress On Acoustics Madrid, 2-7 September 2007, Madrid, Spain.
- Nicol, C. A., Development and Exploration of a Timbre Space Representation of Audio. Ph.D from Department of Computing Science. Glasgow, University of Glasgow, 2005.
- Seago, A., A new user interface for musical timbre design. Ph.D in Music Computing, Open University, 2009.
- Seago, A., S. Holland, and P. Mulholland, A Novel User Interface for Musical Timbre Design. 128th Convention of the Audio Engineering Society, London, May 2010.
- Zacharakis, A., K. Pastiadis, G. Papadelis and J. Reiss, An Investigation of Musical Timbre: Uncovering Salient Semantic Descriptors and Perceptual Dimensions. 12th International Society for Music Information Retrieval Conference (ISMIR 2011), pages 807-812.
- Burred, J.J., A. Röbel and X. Rodet, An Accurate Timbre Model for Musical Instruments and its Application to Classification. Proceedings of International Workshop on Learning the Semantics of Audio Signals (LSAS), Athens, Greece, December 2006.
- Loureiro, M. A., H. B. D. Paula and H. C. Yehia, Timbre Classification Of A Single Musical Instrument. In C. L. Buyoli & R. Loureiro (Eds.), Electronic Engineering, pages 546-549.
- Hunt, A., M. Wanderley, and M. Paradis, The Importance Of Parameter Mapping In Electronic Instrument Design. Journal of New Music Research, Volume 32, Issue 4, page 429–440, 2003.
- Winkler, T., Making Motion Musical: Gestural Mapping Strategies For Interactive Computer Music. In Proceedings of 1995 International Computer Music Conference, pages 261–264.
- Camurri C., De Poli G., Leman M., Volpe G., A Multi-layered Conceptual Framework for Expressive Gesture Applications. Workshop on Current Research Directions in Computer Music, Barcelona, Spain, pp. 29-34, 2001.
- Arfib, D. and Kessous, L., Gestural Control of Sound Synthesis and Processing Algorithms. Lecture Notes in Computer Science 2002, Volume 2298, pages 55-85.
- Arfib, D., Couturier, J.M., Kessous, L. & Verfaille, V., Strategies of Mapping Between Gesture Data and Synthesis Model Parameters Using Perceptual Spaces. Organised Sound Volume 7, Issue 2, pages 127-144, 2002.
- Arfib, D., Couturier, J.M. and L. Kessous, Expressiveness and Digital Musical Instrument Design. Journal of New Music Research, Volume 34, Issue 1, pages 125-136, 2005.
- Hunt, A. & Wanderley, M.M., Mapping Performer Parameters to Synthesis Engines. Organised Sound, Volume 7, Issue 2, pages 97–108, 2003.
- Wanderley, M. M., and P. Depalle, Gestural Control of Sound Synthesis. Proceedings of the IEEE, Volume 92, Issue 4, Special Issue on Engineering and Music – Supervisory Control and Auditory Communication, G. Johannsen, Editor, pages 632-644, 2004.
- Verfaille, V., Wanderley, M. and Depalle, P., Mapping Strategies for Gestural and Adaptive Control of Digital Audio Effects. Journal of New Music Research, Volume 35, Issue 1, pages 71-93, 2006.
- Van Nort, D. & Wanderley, M.M., Exploring the Effect of Mapping Trajectories on Musical Performance. In: Proceedings of the 2006 Conference on New Interfaces for Musical Expression (NIME 06), June 4 – 8, 2006, Paris, France.
- Caramiaux, B., F. Bevilacqua and N. Schnell, Study on Gesture-Sound Similarity. 3rd Music and Gesture Conference, McGill University, Montreal, 2010.
- Caramiaux, B., F. Bevilacqua, N. Schnell. “Sound Selection by Gestures” New Interfaces for Musical Expression (NIME 2011), Oslo, Norway, 2011.
- Verfaille, V., J. Boissinot, P. Depalle, and M. M. Wanderley, Ssynth: A Real Time Additive Synthesizer With Flexible Control. Proceedings of the International Computer Music Conference (ICMC’06), New Orleans, 2006.
- Pendharkar, C., Gurevich, M., and Wyse, L. Parameterized Morphing As A Mapping Technique For Sound Synthesis. Proceedings of the 9th International Conference on Digital Audio Effect. Montreal, Canada. 2006, 1-6.
- Wyse, L., and N. Dinh Duy Instrumentalizing Synthesis Models. Proceedings of the International Conference on New Interfaces for Musical Expression, June 15-18, 2010, Sydney, Australia.
- Brandtsegg, Ø, S. Saue, & T. Johansen, A Modulation Matrix for Complex Parameter Sets. Proceedings of the 11th International Conference on New Interfaces for Musical Expression, 30 May – 1 June 2011, Oslo, Norway.
- Grey, J., and J. Moorer, Perceptual Evaluations Of Synthesized Musical Instrument Tones, Journal of Acoustical Society of America, Volume 62, Issue 2, pages 454-462, 1977.
- Moorer, J., The Use of the Phase Vocoder in Computer Music Applications, Journal of the Audio Engineering Society, Volume 26, Issue 1, 1978.
- Kreutzer, C., J. Walker and M. O’Neill, A Parametric Model for Spectral Sound Synthesis of Musical Sounds, Proceedings of the International Conference on Audio, Language and Image Processing (ICALIP) 2008, pages 633-637.
- Le Groux, S., and P. Verschure, Perceptsynth: Mapping Perceptual Musical Features to Sound Synthesis Parameters, Proceedings of the 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 30 – April 4, 2008, Las Vegas, USA.
- Sethares, W.A, A.J. Milne, S. Tiedje, A. Prechtl, and J. Plamondon, Spectral Tools for Dynamic Tonality and Audio Morphing. Computer Music Journal, Summer 2009, Volume 33, Issue 2, pages 71-84.
- Misra, A., P. R. Cook and G. Wang, A New Paradigm For Sound Design, Proceedings of 9th International Conference on Digital Audio Effects (DAFx-06), Montreal, Canada, September 18-20, 2006.
- Klingbeil, M., Spectral Analysis, Editing, and Resynthesis: Methods and Applications, Doctor of Musical Arts in the Graduate School of Arts and Sciences, Columbia University, 2009.
- Quackenbush, S., and A. Lindsay, Overview of MPEG-7 Audio, IEEE Transactions On Circuits And Systems For Video Technology, Volume 11, Number 6, June 2001.
- Casey, M., General Sound Classification and Similarity in MPEG-7, Organised Sound, Volume 6, Issue 2, Cambridge University Press, 2002.
- Hoffman, M., and P. Cook, Feature-Based Synthesis: Mapping from Acoustic and Perceptual Features to Synthesis Parameters, Proceedings of the International Computer Music Conference, New Orleans, 2006.
- Schwarz, D., R. Cahen and S. Britton, Principles And Applications Of Interactive Corpus-Based Concatenative Synthesis, Journées d’Informatique Musicale (JIM). Albi : Mars 2008.
- Miranda, E. R., An Artificial Intelligence Approach to Sound Design, Computer Music Journal, Volume 19, Issue 2, pages 59-75, 1995.
- Miranda, E. R., Machine Learning and Sound Design, Leonardo Music Journal, Volume 7, pages 49-55, 1998.
- Garcia, R. A., Growing Sound Synthesizers Using Evolutionary Methods, Proceedings of ALMMA 2001: Artificial Life Models for Musical Applications Workshop (ECAL).
- Garcia, R. A., Automating The Design Of Sound Synthesis Techniques Using Evolutionary Methods, Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-01), Limerick, Ireland, December 6-8, 2001.
- Dahlstedt, P., Creating and Exploring Huge Parameter Spaces: Interactive Evolution as a Tool for Sound Generation, Proceedings of 2001 International Computer Music Conference, Havana, Cuba, ICMA.
- Johnson, C. G., Exploring Sound-Space With Interactive Genetic Algorithms. Leonardo Music Journal, Volume 36, Issue 1, pages 51-54, 2003.
- Schatter, G., E. Zuger and C. Nitschke, A Synaesthetic Approach For A Synthesizer Interface Based On Genetic Algorithms And Fuzzy Sets. Proceedings of the International Computer Music Conference 2005.
- McDermott, J., N. J. L. Griffith and M. O’Neill, Toward User-Directed Evolution of Sound Synthesis Parameters, Applications of Evolutionary Computing, Springer, pages 517–526, 2005.
- McDermott, J., N. J. L. Griffith and M. O’Neill, Evolutionary GUIs for Sound Synthesis. Proceedings of Fifth European Workshop on Evolutionary Music and Art (EvoMUSART), 2007.
- McDermott, J., N. J. L. Griffith and M. O’Neill, Interactive EC Control of Synthesized Timbre, Evolutionary Computation, Volume 18, Issue 2, pages 277–303, 2010.
- Yee-King, M., and M. Roth, SynthBot: An Unsupervised Software Synthesizer Programmer, International Computer Music Conference 2008.
- Dykiert, M., and N.E. Gold, Support for Learning Synthesiser Programming, Proceedings of 8th Sound and Music Computing Conference 2011, 6 – 9 July 2011, Padova, Italy.
- Miranda, E. R., At the Crossroads of Evolutionary Computation and Music: Self-Programming Synthesizers, Swarm Orchestras and the Origins of Melody. Evolutionary Computation, Volume 12, Issue 2, pages 137-158, 2004.