Research

This section aims to identify the major research trends within which sound and music computing is to be situated. The focus is on trends in ICT, the cognitive sciences and the humanities. Given the broad scope of sound and music computing, this section devotes special attention to the rise and importance of a multidisciplinary research space that is the motor for innovation in society. References to this general research space can be found in several reports from the European Commission and the National Science Foundation of the US (European Commission – New Instruments, 2004; European Commission – Research Area, 2007; National Science Foundation, 2004). Below, we identify the specific research trends that are deemed relevant to sound and music computing. Each trend is summarized by a short statement that captures the core issue relevant to the future challenges of the sound and music computing research field.

Research Trend 1: Rapid progress in ICT

In recent decades, progress in sound and music computing has been driven by revolutionary developments in information technology. The transitions from analogue to digital data processing and from wired to wireless mobile data communication have been key components in this development. The relentless annual to biennial doubling of storage capacity, bandwidth and data-crunching computing power has been unprecedented in history, leading to a fundamental transformation of all aspects of the production-processing-distribution-consumption chain of sound and music content. In no other field has an entire processing chain been digitalised and made available on broadband networks and mobile devices on such a massive scale. In this development, technological progress has had a direct empowering influence both on scientific knowledge and on end applications, which in turn have influenced the development of new technologies. In the context of this development, a number of consequences for research can be identified (Hengeveld, 2000; ITRS Consortium, 2007; NEM Consortium, 2007; Microsoft Research Cambridge, 2006).
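As a rough illustration of what annual to biennial doubling implies, the following sketch compounds a doubling every one or two years over a decade. The ten-year horizon and the function name growth_factor are assumptions chosen for illustration; they are not figures taken from the cited reports.

```python
def growth_factor(years, doubling_period_years):
    """Return the multiplicative growth after `years` when capacity
    doubles once every `doubling_period_years` years."""
    return 2 ** (years / doubling_period_years)


# Illustrative horizon of ten years (an assumption, not a sourced figure).
HORIZON_YEARS = 10

for period in (1, 2):  # annual and biennial doubling
    factor = growth_factor(HORIZON_YEARS, period)
    print(f"doubling every {period} year(s): x{factor:.0f} after {HORIZON_YEARS} years")
```

Under these assumptions, annual doubling yields roughly a thousandfold increase per decade and biennial doubling a thirty-twofold increase, which gives a sense of why a task that was impractical a few years earlier can quickly become routine.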

Statement 1: The rate of increase in storage capacity, bandwidth and data-crunching computing power is leading to fundamental transformations in all aspects of the economic chain of music.

First, the increasing capacity of data storage and transfer supports the accumulation of, and easy access to, ever-larger volumes of data. A resulting benefit is better access to knowledge, such as online access to vintage publications, supplementary data, new publication formats and so on. This accessibility is empowering to the scientist. At the same time, like the invention of printing, it has an effect on the embodiment of knowledge itself, shifting the centre of gravity of knowledge from brain, to book and onward to database.

A second effect is the shift towards data-intensive methodologies that involve gathering or compilation of large volumes of data. This allows a focus on data-intensive phenomena, phenomena that are either intrinsically complex, or else accessible only as patterns within multiple or complex observations. In the field of music studies, an enquiry might call for the processing of a large library of musical scores or a large database of audio data. A few years ago such a quest might have remained untouched for lack of access to the data, lack of room to store it in the computer, or lack of time to wait for the computer to give an answer. By the same token, a topic once respectable for its technical difficulty might suddenly become trivial. In ways such as these, information technology affects the focus of science.

A third consequence is the shift away from analytical and theoretical approaches towards a reliance on computer models and simulations. This approach, which can be observed in fields as diverse as pure mathematics (computational proofs), statistics (Monte Carlo methods, bootstrap), biology (DNA sequence alignment), linguistics and speech engineering (data-driven methods), has engendered a degree of unease and debate (Seiden, 2001). Does a proof that only a computer can follow really contribute to our understanding? Similar unease met the invention of algorithms, infinity, or proof by induction. In a similar vein, one can ask whether a drum machine can qualify as a musician, or whether 'jazz improvisation' by a computer is really genuine improvisation.
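To make the contrast between analytical and simulation-based approaches concrete, the following minimal sketch shows the bootstrap mentioned above: instead of deriving a confidence interval for a mean analytically, the computer resamples the data many times and reads the interval off the resulting distribution. The data values (hypothetical tempo estimates in beats per minute), the function name bootstrap_mean_interval and the parameter defaults are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_mean_interval(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Draw n values from the sample with replacement and record the mean.
        resample = [rng.choice(sample) for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the means.
    lo_index = int((alpha / 2) * n_resamples)
    hi_index = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_index], means[hi_index]


# Hypothetical data: tempo estimates (beats per minute) for one recording.
tempi = [118.0, 121.5, 119.2, 124.8, 120.1, 117.6, 122.3, 119.9]

low, high = bootstrap_mean_interval(tempi)
print(f"sample mean: {statistics.fmean(tempi):.2f} bpm")
print(f"95% bootstrap interval: [{low:.2f}, {high:.2f}] bpm")
```

The interval here is obtained purely by computation, with no distributional formula involved, which is the kind of reliance on simulation that the debate described above is concerned with.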

A fourth consequence is the development of machine-embedded knowledge such as that gathered by machine-learning techniques. Arguably these techniques come closer to delivering the promises of intelligence than has the so-called Artificial Intelligence (AI) research itself. With them, intelligence is attained more by the clever use of tricks and devices in machines than by the artifice of man. At the confluence of statistical estimation techniques and neural network theory, machine learning harnesses the computer to compile and extract regularities from massive quantities of data. The knowledge thus obtained, usually impossible to describe to a human brain and useless without a computer, is nonetheless empowering for web search, spam filtering, or musical content indexing and retrieval. As models of brain processing, machine-learning techniques may eventually provide a bridge between information technology and neurosciences. Particularly relevant to music technology are new techniques of signal processing related to machine learning.

In summary, progress in information science and technology is fuelling a drive towards data- and computation-intensive approaches to knowledge acquisition and problem solving, particularly in domains relevant to sound and music computing. These have deep implications for the nature of scientific and technological knowledge and how it is brought to bear on our needs.

Statement 2: Information technology is profoundly reshaping the methodologies and scope of scientific inquiry and technological development.

Research Trend 2: Cognitive science: from musical mind to brain

Cognitive sciences (Wilson and Keil, 1999) focus on how humans interact with their environment, mostly from the viewpoint of perception and action. Developments in this research domain have had a huge impact on sound and music computing. In fact, studies on musical memory, learning and all activities related to music perception and action, such as extraction of high-level information from musical stimuli or gestural sound control, can be considered the basic constituents of sound and music computing applications.

The cognitive science of music (as practised in, for example, cognitive musicology, experimental music psychology or the neurosciences of music) has its focus on the semantic gap that exists between our daily meaningful experiences with sound and music on the one hand, and the encoded physical energy of sound and music on the other. When dealing with music, we call upon content and meaning, whereas the encoded physical energy is just a way of storing information in a technological device. How are the two connected? How can we access the encoded information by means of meaningful actions? Research in cognitive science aims at providing new insights into this semantic gap problem. Several different approaches to solving this problem can be distinguished.

A first research direction starts out from the premise that the human mind is embodied (Knoblich et al., 2006). Rather than trying to solve the semantic gap problem by looking at formal structures and higher-level or low-dimensional representational spaces, the relation between human meaning and encoded physical energy is here seen as being mediated by the human body. For example, if an ambiguous musical rhythm is presented, then it is assumed that the motor system of the human body engenders the anticipatory mechanisms (called emulation) that allow a disambiguated auditory perception of it. Action is here seen as a crucial component for auditory perception, with action and feedback mechanisms being considered at different processing levels, from feedback mechanisms in the auditory periphery (e.g. the role of outer hair cells in attenuation) to the role of intended actions in perception. The embodied viewpoint may revolutionise how we think about ICT development in that it calls for new technologies that mediate between the human mind and its musical environment, based on a multi-sensory approach to sound and music computing (Leman, 2007).

Statement 3: The embodied viewpoint calls for new technologies to mediate between the human mind and the environment.

A second research direction is concerned with the methodologies for acquiring knowledge about the semantic gap problem. In the last decade, these methodologies have been extended from behavioural to brain research. Knowledge about the brain is progressing rapidly and at multiple scales which include molecular, synaptic, cellular, cell assembly, and regional and functional anatomy as revealed by brain imaging. Today our tools include molecular biology techniques for probing the membrane and synaptic properties of neurons, physiological recording techniques to observe entire neuronal assemblies, non-invasive imaging techniques to probe activity within the human brain, computational tools to gather and process the resulting data, and theoretical tools to make sense of the complexity of what is observed. Some recent studies in neurophysiology include the use of awake preparations (often coupled with behavioural studies), multiple unit recordings, simultaneous invasive and non-invasive brain imaging techniques (to calibrate one with respect to the other), selective brain cooling, optical imaging and the coupling of one of these with genetic engineering or biochemical manipulations to probe specific stages in processing. Research in brain imaging includes the use of higher magnetic fields for structural and functional MRI (magnetic resonance imaging), increased numbers of channels in EEG (electroencephalography) or MEG (magnetoencephalography), simultaneous recording of fMRI and EEG, or EEG and MEG, and use of pre-surgical supradural or intracortical recordings from patients to obtain 'close up' snapshots of brain activity.

An important facilitating factor in these developments is progress in hardware and software techniques for handling and interpreting the massive data sets produced by brain imaging. In short, there is presently a rapid development of different technology-driven methodologies that provide new insights into how the brain is involved in the semantic gap problem.

Statement 4: New technology-driven methodologies are providing new insights into how the human brain processes sound and music.

A third major research effort, situated in theoretical neurosciences, is about the tight interaction between signal processing and machine learning techniques on the one hand, and models of neural processing on the other. A common goal is to find techniques that can harness the extreme complexity of relevant patterns in data (for example databases of environmental, speech or musical sounds) or the structures and mechanisms observed within the brain. The computer here is used as an aid to handle a degree of complexity that our brains cannot otherwise easily comprehend. One promising angle of enquiry is the use of data-driven methods to simulate the processing mechanisms (natural or artificial), driven by the data patterns that they are to process. This method can be used as an alternative or complement to more traditional engineering techniques.

The above developments often lead to rather wild speculations on the possible future benefits of neurosciences to computing. An example of such a hypothetical breakthrough might be the possibility of 'downloading' entire cognitive or perceptual processing mechanisms to software. This could result from a combination of progress in recording techniques, theoretical neurosciences and machine learning. Another hypothetical breakthrough (heralded by well-established cochlear implant technologies and recent experiments with animal models and impaired humans) could be the widespread development of brain-machine interfaces (BMI). This could result from a combination of progress in interface hardware (e.g. miniaturised electrode arrays), signal processing (to factor out 'noise') and machine learning (to translate between the different codes used by brain and machine). All this is likely to have a huge impact on the sound and music computing field. Examples are hearing aids (e.g. cochlear implants) that allow their users to listen to music at a high quality level, or an intracortical implant that would allow a quadriplegic to play the piano.

Statement 5: Cognitive sciences and neurosciences offer a rapidly expanding window on the human mind and brain, thereby providing new possibilities for solving the semantic gap problem.

Research Trend 3: From subjective experience to cultural content

Research in the humanities is focused on signification practices; that is, on how human beings make sense of their environment and give meaning to their lives. The humanities view this signification practice from a subjective and experiential point of view. Therefore, research of this kind includes anthropology, area studies, communications, cultural studies and media studies. The humanities not only provide insights into these aspects but also train people in the skills necessary for practitioners (e.g. in music playing, painting, film making). Traditionally, research methodologies in the humanities are based on analytic, descriptive, critical or even speculative and imitation approaches, although recent approaches also involve quantitative and empirical studies (e.g. Diamond, 1999; Foster, 1985; Tomasello, 1999). In the cultural and creative industries (KEA, 2006), the humanities can provide the content needed to develop a significant partnership between culture and technology.

Several research efforts in the humanities address this issue. A first approach has adopted the belief that subjective factors (related to gender, education and social and cultural background) play a central role in how people deal with technology. Humanities research may provide the necessary analysis of the role of subjective factors and the social and cultural contexts in which technological applications will function. Knowledge of these factors needs to be incorporated into music retrieval systems and interactive music systems.

Statement 6: Subjective factors play a central role in how people deal with technology in relation to sound and music.

A second research approach is concerned with what is sometimes called 'medialogy'; that is, an approach which combines technology and creativity to design new processes and tools for art, design and entertainment. It involves insight into the creative processes, thoughts and tools needed for media-productions and other arts to exist. Clearly, medialogy is at the crossroads of the human sciences, the creative arts and technology. As such, it is a central pillar of the creative industries.

A third research approach is concerned with the transformation of the cultural sector into the digital domain. This involves the digitalisation of a large part of our cultural heritage. From the humanities point of view, the preservation and archiving of cultural heritage poses huge challenges with respect to issues such as the authenticity of documents, flexible multi-language access and the provision of proper content descriptions of objects from multifarious cultures.

Statement 7: Technology, creative approaches to art, design and entertainment and the digitalisation of a large part of our cultural heritage stimulate each other.

A fourth key topic in the humanities concerns the role of the human body, embodiment, and corporeal skills in signification practices. Human skills, which often require intensive learning, have been studied and described for centuries from a humanistic point of view, often from entirely different cultural perspectives. Accordingly, the humanities provide a rich source of theories, concepts and traditions that are highly revealing and inspiring for new empirical studies and technological applications. An example is the Laban theory of effort (Laban and Lawrence, 1947), which is speculative but provides very valuable insight into choreography and expressive movement. This theory can be straightforwardly related to music perception, leading to the interesting approach of gesture-based music retrieval. Another example concerns the philosophical views on intentional behaviour of the human body and how this is currently being integrated into a neuroscientific approach to empathy and social cognition (Metzinger, 2003). The focus on the human body in artistic research is clearly connected with the empirical study of embodiment in cognitive science. In fact, it is thanks to the humanities (e.g. phenomenology, post-structuralism, post-modernism) that this topic has become a genuine research topic on the agenda of empirical sciences that deal with perception, action and the use of tools and technologies. Indeed, some aspects of embodiment, involving emotions and the gestures related to them, can be straightforwardly explored and used in technology-based artistic and cultural applications, even if our knowledge about these processes is limited.

In short, the humanities offer a very rich background from which the problem of the semantic gap can be addressed. Their focus on specific topics such as the human subject, embodiment and social and cultural interaction, along with their often descriptive and analytic approach, is highly valuable from the perspective of content creation.

Statement 8: The humanities offer the cultural background and content for sound and music computing research.

Research Trend 4: The rise of multidisciplinary research

Scientific research is currently witnessing two opposing, though intimately related, approaches. On the one hand, it continues to differentiate into more and more specific and narrowly circumscribed sub-fields owing to the accelerating accumulation of ever more specific knowledge. At the same time, new multidisciplinary research fields are emerging within academia, for example in the life sciences, neurosciences and earth sciences. Understanding the complex phenomena facing mankind - from climate change to new epidemics to global economic and social developments - requires the integration of expertise from many fields. The growing importance of multidisciplinarity is being increasingly recognised in research funding agencies and educational organisations.

According to a report recently presented at the OECD Global Science Forum Workshop (National Institutes of Health [NIH], 2006) “[t]he increasing multidisciplinary nature of research [...] is an important overall trend in science policy. For example, during the past four years, the fraction of interdisciplinary research at the United States National Science Foundation has increased significantly”. The NIH Roadmap for Medical Research further states that “the traditional divisions [...] may in some instances impede the pace of scientific discovery”. In response to this, the NIH is establishing “a series of awards that make it easier for scientists to conduct interdisciplinary research”.

As early as the year 2000, the Natural Sciences and Engineering Research Council [NSERC] of Canada set up a special Advisory Group on Interdisciplinary Research (AGIR) with a mandate to study how interdisciplinary research could be better supported (NSERC, 2002). In 2003, the National Science Foundation [NSF] promoted a study on the convergence of technologies (NSF, 2003) which concluded that: “In the early decades of the 21st century, concentrated efforts can unify science based on the unity of nature, thereby advancing the combination of nanotechnology, biotechnology, information technology, and new technologies based in cognitive science”. Similarly, research funding institutions all over the world are beginning to recognise the need to give special attention to multidisciplinary research funding.

Of course, the fundamental importance of multidisciplinary research is also acknowledged by the European Commission. In the field of ICT, which is of direct relevance to sound and music computing, the “Future and Emerging Technologies” (FET) programme is explicitly targeted towards innovative, multidisciplinary work. In the chapter dedicated to FET, the ICT work programme of FP7 calls for “interdisciplinary explorations of new and alternative approaches towards future and emerging ICT-related technologies, aimed at a fundamental reconsideration of theoretical, methodological, technological and/or applicative paradigms in ICT”, one of the goals of FET being to “[help] new interdisciplinary research communities to establish themselves as bridgeheads for further competitive RTD” (ICT-FET Work Programme, 2007).

Sound and music computing is by definition a multidisciplinary field, ranging from the natural sciences like physics and acoustics through mathematics, statistics and computing, all the way to physiology, psychology and sociology. The global trend towards the recognition of multidisciplinarity should help sound and music computing establish itself more confidently as an encompassing discipline that studies a phenomenon of central relevance to humans in all its necessary breadth. In addition, the emergence of new multidisciplinary fields of research and application is producing new points of contact for sound and music computing.

A prime example of such contact is the current rise of the so-called creative industries (KEA, 2006). While the notion of creative industries refers to a sector of the economy, its current upsurge (including in terms of public awareness) also leads to new opportunities for creative multidisciplinary research at the intersection of art, design and technology. Sound and music computing can and will play an important role here. The case of the creative industries also highlights once more - if that were needed - the close ties between scientific research and the arts (see also the Industrial Context section). Artistic visions coupled with creative application ideas are likely to drive sound and music computing research in more ways than can currently be envisioned, resulting in entirely new environments, devices and cultural services.

Statement 9: Multidisciplinary research is increasingly seen as a necessity and an asset, and special programmes for fostering and funding it are being developed. Sound and music computing can take advantage of this and should actively seek alliances with other disciplines, including the arts.

To sum up, in this section we have identified some major trends related to the rapid progress in ICT, the development of cognitive science and the advent of brain science, the role of human sciences in addressing the human subject and its action-related contexts, and the multidisciplinary nature of scientific research. Sound and music research is at the cutting edge of these trends. It is driven by these general trends in research and it plays an active role in pushing the most advanced stages of each of these developments.

References

Diamond, J. (1999). Guns, germs, and steel. The fates of human societies. New York: Norton & Comp.

European Commission – Research Area (2007). European research area. http://cordis.europa.eu/era/.

European Commission – New Instruments (2004). Evaluation of the effectiveness of the New Instruments of Framework Programme VI. http://ica.cordis.lu/documents/documentlibrary/ADS0006763EN.pdf

Foster, H. (Ed.) (1985). Postmodern culture. London and Sydney: Pluto Press.

Hengeveld, P., Best, J.-P., van Beumer, J., Hooff, B., van den Poot, H., and R. de Westerveld (Eds.) (2000). Research trends in information and communication technology: Uncovering the agendas for the information age. Telematica Instituut, Enschede, The Netherlands.

ICT-FET Work Programme (2007). Future and Emerging Technologies. European commission. Information society and media. http://cordis.europa.eu/ist/fet/

ITRS Consortium (2007). The international technology roadmap for semiconductors. http://www.itrs.net/home.html.

KEA European Affairs [KEA] (2006). The economy of culture in Europe. Technical report. http://www.keanet.eu/Ecoculture/ecoculturepage.htm.

Laban, R., & Lawrence, F. C. (1947). Effort. London: Macdonald & Evans.

Leman, M. (2007). Embodied music cognition and mediation technology. Cambridge, MA: The MIT Press.

Metzinger, T. (2003). Being no one: The self-model theory of subjectivity. Cambridge, MA: The MIT Press.

Microsoft Research Cambridge (2006). Towards 2020 Science. http://research.microsoft.com/towards2020science/.

National Institutes of Health [NIH] (2006). Interdisciplinary Research. http://nihroadmap.nih.gov/interdisciplinary.

National Science Foundation (2004). National Science Foundation Strategic Plan FY 2003-2008. http://www.nsf.gov/pubs/2004/nsf04201/FY2003-2008.pdf

Natural Sciences and Engineering Research Council [NSERC] (2002). First report of the advisory group on interdisciplinary research. http://www.nserc.ca/pubs/agir/AGIR_e_report.pdf.

NEM Consortium (2007). Networked and electronic media - European technology platform: Strategic research agenda. http://www.nem-initiative.org/.

Seiden, S. (2001). Can a computer proof be elegant? Communications of the ACM, 32: 111-114.

Tomasello, M. (1999). The cultural origins of human cognition. Cambridge: Harvard University Press.

Wilson, R. A., & Keil, F. (Eds.) (1999). The MIT Encyclopedia of the cognitive sciences (MITECS). Cambridge, MA: The MIT Press.