“Spatial Audio - Practical Master Guide” is a free online course on spatial audio content creation. The target group is people who have basic knowledge of audio production but are not necessarily dedicated experts in the underlying technologies and aesthetics. “Spatial Audio - Practical Master Guide” will be released on the Acoucou platform chapter by chapter throughout Spring 2025. Some course content is already available as a preview.
The course comprises a variety of audio examples and interactive content that allow learners to develop their skills in a playful manner. It covers the entire spectrum from psychoacoustics through the underlying technologies to delivery formats. The course’s highlights are the 14 case studies and step-by-step guides that provide behind-the-scenes information. Many of the course components are self-contained, so they can be used in isolation or integrated into other educational contexts.
The workshop on “Spatial Audio - Practical Master Guide” will provide an overview of the course contents, and we will explain the educational concepts on which the course is based. We will demonstrate the look and feel of the course on the Acoucou platform by walking through a set of representative examples from the courseware, and we will give the audience the opportunity to experience it themselves. The workshop will wrap up with a discussion of the contexts in which the course contents may be useful beyond self-study.
Course contents:
Chapter 1: Overview (introduction, history of spatial audio, evolution of aesthetics in spatial audio)
Chapter 2: Psychoacoustics (spatial hearing, perception of reverberation)
Chapter 3: Reproduction (loudspeaker arrays, headphones)
Chapter 4: Capture (microphone arrays)
Chapter 5: Ambisonics (capture, reproduction, editing of ambisonic content)
Chapter 6: Storing spatial audio content
Chapter 7: Delivery formats
Case studies: Dolby Atmos truck streaming, fulldome, icosahedral loudspeaker, spatial audio sound installation, spatial audio at Friedrichstadt Palast, spatial audio in the health industry, live music performance with spatial audio, spatial audio in automotive
Step-by-step guides: setting up your spatial audio workstation, channel-based production for music, Dolby Atmos mix for cinema, ambisonic sound production for 360 film, building your own ambisonic microphone array, interactive spatial audio
Artificial intelligence (AI) tools are transforming the way music is produced. The pace of development is rapid, and the associated transformation of audio education is abrupt. Higher education is largely built around the objectives of knowledge transmission and skills development, evidenced by the emphasis on learning in the cognitive domain in university programmes. But the set of skills that music producers will require in five years’ time is unclear, making skills-based curriculum planning challenging. Audio educators require a systematic approach to integrating AI tools in ways that enhance teaching and learning.
This study uses speculative design as its underpinning research methodology. Speculative design uses design practice to explore and evaluate possible futures, alternative realities, and sociotechnical trends. In this study, the practical tasks in an existing university module are modified by integrating available generative AI (GAI) tools to replace or augment the task design. This tangible artefact is used to critique prevailing assumptions concerning the use of GAI in music production and audio education. The findings suggest that GAI tools will disrupt the existing audio education paradigm. Employing a process-centred approach to teaching and learning may represent a key progression for educators as they navigate these changes.
Cyclical formal reviews are essential to keep Music and Audio Technology degree programmes current. Whilst clear institutional guidance exists on the requisite documentation to be submitted, there is little guidance concerning the process used to gather the information. To address this issue, a 12-step collaborative and reflective framework was developed to review a degree programme in Music Technology.
This framework employs Walker’s ‘Naturalistic’ process model and design thinking principles to create a dynamic, stakeholder-driven review process. The framework begins with reflective analysis by faculty, helping to define programme identity, teaching philosophy, and graduate attributes. Existing curricula are evaluated using Boehm et al.’s (2018) tetrad framework of Music Technology, encompassing the sub-disciplines of production, technology, art, and science. Insights from industry professionals, learners, and graduates are gathered through semi-structured interviews, surveys, and focus groups to address skill gaps, learner preferences, and emerging trends. A SWOT analysis further refines the scope and limitations of the redesign process, which culminates in iterative stakeholder consultations to finalise the programme’s structure, content, and delivery.
This process-centred approach emphasises adaptability, inclusivity, and relevance, thus ensuring the redesigned programme is learner-centred and aligned with future professional and educational demands. By combining reflective practice and collaborative engagement, the framework offers a comprehensive, replicable model for educators redesigning degree programmes in the discipline. This case study contributes to the broader discourse on curriculum design in music and audio degree programmes, demonstrating how interdisciplinary and stakeholder-driven approaches can balance administrative requirements with pedagogical innovation.
Kevin Garland is a Postgraduate PhD Researcher at the Technological University of the Shannon: Midlands Midwest (TUS), Ireland. His primary research interests include human-computer interaction, user-centered design, and audio technology. Current research lies in user modelling and...
Friday May 23, 2025 9:35am - 9:55am CEST, C1, ATM Studio Warsaw, Poland
Acoustic Sovereignties (2024) is a First Nations, anti-colonial spatial audio exhibition held in Naarm (Melbourne), Australia. Through curatorial and compositional practices, Acoustic Sovereignties confronts traditional soundscape and Western experimental sound disciplines by foregrounding marginalised voices. As this research will show, the foundations of sound-based practices such as Deep Listening and Soundscape Studies were built on romanticised notions of Indigenous spirituality and an intentional disregard for First Nations stewardship of and kinship with the land and its acoustic composition. Acoustic Sovereignties aims to reclaim Indigenous representation across sound-based disciplines and arts practices by providing a platform for voices, soundscapes and knowledge to be heard.
My name is Hayden Ryan. I am a First Nations Australian sound scholar and artist, and a 2024 New York University Music Technology Master's graduate. I am currently a Vice Chancellor's Indigenous Pre-Doctoral Fellow at RMIT University, where my PhD focuses on the integration of immersive...
Friday May 23, 2025 9:55am - 10:15am CEST, C1, ATM Studio Warsaw, Poland
The rapid advancement of generative artificial intelligence has created highly realistic DeepFake multimedia content, posing significant challenges for digital security and authenticity verification. This paper presents the development of a comprehensive testbed designed to detect counterfeit audio content generated by DeepFake techniques. The proposed framework integrates forensic spectral analysis, numerical and statistical modeling, and machine learning-based detection to assess the authenticity of multimedia samples. Our study evaluates various detection methodologies, including spectrogram comparison, Euclidean distance-based analysis, pitch modulation assessment, and spectral flatness deviations. The results demonstrate that cloned and synthetic voices exhibit distinctive acoustic anomalies, with forensic markers such as pitch mean absolute error and power spectral density variations serving as effective indicators of manipulation. By systematically analyzing human, cloned, and synthesized voices, this research provides a foundation for advancing DeepFake detection strategies. The proposed testbed offers a scalable and adaptable solution for forensic audio verification, contributing to the broader effort of safeguarding multimedia integrity in digital environments.
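As an illustration of the kind of forensic markers described above, the sketch below compares a reference and a suspect recording using spectral flatness deviation and pitch mean absolute error. It is not the paper's testbed; the file names, sampling rate, thresholds, and use of librosa are assumptions for the example.

```python
# Hypothetical sketch of two forensic markers mentioned above:
# spectral-flatness deviation and pitch mean absolute error (MAE).
# File names and parameter choices are illustrative assumptions.
import numpy as np
import librosa

def pitch_track(y, sr):
    # Fundamental-frequency estimate via pYIN; unvoiced frames come back as NaN.
    f0, _, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                            fmax=librosa.note_to_hz("C6"), sr=sr)
    return f0

def compare(reference_path, suspect_path, sr=16000):
    ref, _ = librosa.load(reference_path, sr=sr, mono=True)
    sus, _ = librosa.load(suspect_path, sr=sr, mono=True)

    # Mean spectral flatness: synthetic voices often show atypical values.
    flat_ref = float(np.mean(librosa.feature.spectral_flatness(y=ref)))
    flat_sus = float(np.mean(librosa.feature.spectral_flatness(y=sus)))

    # Pitch MAE over frames where both tracks are voiced.
    f0_ref, f0_sus = pitch_track(ref, sr), pitch_track(sus, sr)
    n = min(len(f0_ref), len(f0_sus))
    voiced = ~np.isnan(f0_ref[:n]) & ~np.isnan(f0_sus[:n])
    pitch_mae = float(np.mean(np.abs(f0_ref[:n][voiced] - f0_sus[:n][voiced])))

    return {"flatness_deviation": abs(flat_ref - flat_sus),
            "pitch_mae_hz": pitch_mae}

if __name__ == "__main__":
    print(compare("human_voice.wav", "suspect_voice.wav"))
```

In a testbed of this kind, larger flatness deviations and pitch MAE values relative to genuine recordings would flag a sample for closer forensic inspection.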
Content exchange and collaboration serve as catalysts for repository creation that supports creative industries and fuels model development in machine learning and AI. Despite numerous repositories, challenges persist in discoverability, rights preservation, and efficient reuse of audiovisual assets. To address these issues, the SCENE (Searchable multi-dimensional Data Lakes supporting Cognitive Film Production & Distribution for the Promotion of the European Cultural Heritage) project introduces an automated audio quality assessment toolkit integrated within its Media Assets Management (MAM) platform. This toolkit comprises a suite of advanced metrics, such as artifact detection, bandwidth estimation, compression history analysis, noise profiling, speech intelligibility, environmental sound recognition, and reverberation characterization. The metrics are extracted using dedicated Flask-based web services that interface with a data lake architecture. By streamlining the inspection of large-scale audio repositories, the proposed solution benefits both high-end film productions and smaller-scale collaborations. The pilot phase of the toolkit will involve professional filmmakers who will provide feedback to refine post-production workflows. This paper presents the motivation, design, and implementation details of the toolkit, highlighting its potential to support content quality management and contribute to more efficient content exchange in the creative industries.
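As a rough sketch of how one such Flask-based metric service might be exposed over a data lake (the SCENE implementation itself is not shown here), the example below wraps a single placeholder measurement behind an HTTP endpoint; the route name, payload fields, and the simple RMS-based stand-in metric are assumptions.

```python
# Minimal sketch of a Flask web service exposing one audio-quality metric.
# The endpoint name, request fields, and the RMS-based measurement are
# illustrative assumptions, not the SCENE project's actual interface.
import numpy as np
import soundfile as sf
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/metrics/noise_profile", methods=["POST"])
def noise_profile():
    # Expect a JSON body such as {"path": "/data-lake/assets/clip.wav"}.
    asset_path = request.get_json(force=True)["path"]
    audio, sr = sf.read(asset_path)
    if audio.ndim > 1:                      # fold multichannel files to mono
        audio = audio.mean(axis=1)
    rms = float(np.sqrt(np.mean(audio ** 2)))
    rms_db = 20 * np.log10(rms + 1e-12)     # avoid log of zero on silence
    return jsonify({"path": asset_path, "sample_rate": sr, "rms_dbfs": rms_db})

if __name__ == "__main__":
    app.run(port=5000)
```

Each metric in such a toolkit could live behind its own endpoint in this style, letting the MAM platform query them independently during large-scale inspection runs.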
Dr. Nikolaos Vryzas was born in Thessaloniki in 1990. He studied Electrical & Computer Engineering at the Aristotle University of Thessaloniki (AUTh). After graduating, he received his master's degrees in Information and Communication Audio Video Technologies for Education & Production...
When navigating the environment, we primarily rely on sight. However, in its absence, individuals must develop precise spatial awareness using other senses. A blind person can recognize their immediate surroundings through touch, but assessing larger spaces requires auditory perception. This project presents a method for auditory training in children with visual disabilities through structured audio plays designed to teach spatial pronouns and enhance spatial orientation via auditory stimuli. The format and structure of these audio plays allow for both guided learning with a mentor and independent exploration. Binaural recordings serve as the core component of the training exercises. The developed audio plays and their analyses are available on the YouTube platform in the form of videos and interactive exercises. The next step of this project involves developing an application that enables students to create individual accounts and track their progress. Responses collected during exercises will help assess the impact of the audio plays on students, facilitating improvements and modifications to the training materials. Additionally, linking vision-related questions with responses to auditory exercises will, over time, provide insights into the correlation between these senses. The application can serve multiple purposes: collecting research data, offering spatial recognition and auditory perception training, and creating a comprehensive, structured environment for auditory skill development.
This paper investigates the innovative synthesis of procedurally generated visual and auditory content through the use of Artificial Intelligence (AI) tools, specifically focusing on Generative Pre-Trained Transformer (GPT) networks. The research explores the process of procedurally generating audiovisual representations of semantic context by generating images, artificially providing motion, and generating corresponding multilayered sound. The process enables the generation of stop-motion audiovisual representations of concepts. This approach not only highlights the capacity of generative AI to produce cohesive and semantically rich audiovisual media but also delves into the interconnections between visual art, music, sonification, and computational creativity. By examining the synergy between generated imagery and corresponding soundscapes, this paper aims to uncover new insights into the aesthetic and technical implications of the use of AI in art. This research embodies a direct application of AI technology across multiple disciplines, creating intermodal media. The findings propose a novel framework for understanding and advancing the use of AI in creative processes, suggesting potential pathways for future interdisciplinary research and artistic expression. Through this work, the study contributes to the broader discourse on the role of AI in enhancing creative practices, offering perspectives on how various modes of semantic representation can be interleaved using state-of-the-art technology.
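The pipeline described above ends by stitching generated stills and a generated sound layer into a stop-motion piece. As a rough illustration of that assembly step only (not the paper's actual tooling), the sketch below combines placeholder frame and audio files using moviepy; all file names, the frame rate, and the choice of library are assumptions.

```python
# Hypothetical assembly step only: combining already-generated image frames
# and an already-generated sound layer into a stop-motion clip. Paths, frame
# rate, and the use of the moviepy v1.x API are assumptions for illustration.
from moviepy.editor import AudioFileClip, ImageSequenceClip

frames = [f"generated/frame_{i:03d}.png" for i in range(12)]   # generated stills
clip = ImageSequenceClip(frames, fps=4)            # low fps gives a stop-motion feel
audio = AudioFileClip("generated/soundscape.wav")  # multilayered generated sound
clip = clip.set_audio(audio.set_duration(clip.duration))
clip.write_videofile("concept_stop_motion.mp4", fps=4)
```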
We present G.A.D.A. (Guitar Audio Dataset for AI), a novel open-source dataset designed for advancing research in guitar audio analysis, signal processing, and machine learning applications. This comprehensive corpus comprises recordings from three main guitar categories: electric, acoustic, and bass guitars, featuring multiple instruments within each category to ensure dataset diversity and robustness.
The recording methodology employs two distinct approaches based on instrument type. Electric and bass guitars were recorded using direct recording techniques via DI boxes, providing clean, unprocessed signals ideal for further digital processing and manipulation. For acoustic guitars, where direct recording was not feasible, we utilized multiple microphone configurations at various positions to capture the complete acoustic properties of the instruments. Both recording approaches prioritize signal quality while maintaining maximum flexibility for subsequent processing and analysis.
The dataset includes standardized recordings of major and minor chords played in multiple positions and voicings across all instruments. Each recording is accompanied by detailed metadata, including instrument specifications, recording equipment details, microphone configurations (for acoustic guitars), and chord information. The clean signals from electric instruments enable various post-processing applications, including virtual amplifier modeling, effects processing, impulse response convolution, and room acoustics simulation.
To evaluate G.A.D.A.'s effectiveness in machine learning applications, we propose a comprehensive testing framework using established algorithms including k-Nearest Neighbors, Support Vector Machines, Convolutional Neural Networks, and Feed-Forward Neural Networks. These experiments will focus on instrument classification tasks using both traditional audio features and deep learning approaches.
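To make the proposed evaluation concrete, the sketch below shows the kind of baseline such a framework might include: instrument-category classification with k-Nearest Neighbors and a Support Vector Machine over MFCC summaries. The folder layout, feature choice, and use of librosa/scikit-learn are assumptions for illustration, not the dataset's official tooling.

```python
# Hypothetical baseline for instrument classification on a G.A.D.A.-style
# layout (folders named "electric", "acoustic", "bass"); paths and features
# are illustrative assumptions, not the dataset's released scripts.
from pathlib import Path
import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

def mfcc_features(path, sr=22050, n_mfcc=20):
    y, _ = librosa.load(path, sr=sr, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Summarise each recording as per-coefficient means and standard deviations.
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

X, y = [], []
for label in ("electric", "acoustic", "bass"):
    for wav in Path("gada").glob(f"{label}/*.wav"):
        X.append(mfcc_features(wav))
        y.append(label)
X, y = np.array(X), np.array(y)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y,
                                          random_state=0)
for model in (KNeighborsClassifier(n_neighbors=5), SVC(kernel="rbf")):
    model.fit(X_tr, y_tr)
    print(type(model).__name__, accuracy_score(y_te, model.predict(X_te)))
```

A deep-learning counterpart would replace the MFCC summaries with spectrogram inputs to a convolutional network, as the proposed framework describes.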
G.A.D.A. will be freely available for academic and research purposes, complete with documentation, preprocessing scripts, example code, and usage guidelines. This resource aims to facilitate research in musical instrument classification, audio signal processing, deep learning applications in music technology, computer-aided music education, and automated music transcription systems.
The combination of standardized recording methodologies, comprehensive metadata, and the inclusion of both direct-recorded and multi-microphone captured audio makes G.A.D.A. a valuable resource for comparative studies and reproducible research in music information retrieval and audio processing.
Join a panel of professionals from a variety of fields in the industry as we discuss how to enter the audio industry, how each panelist got started and the paths their careers have taken, and share advice geared towards students and recent graduates. Bring your questions for the panelists – most of this workshop will be focused on the information YOU want to hear!
Coordinator & Professor, Audio Engineering & Music Technology, Kansas City Kansas Community College
Dr. Ian Corbett is the Coordinator and Professor of Audio Engineering and Music Technology at Kansas City Kansas Community College. He also owns and operates off-beat-open-hats LLC, providing live sound, recording, and audio production services to clients in the Kansas City area...
Friday May 23, 2025 4:30pm - 6:00pm CEST, Hall F, ATM Studio Warsaw, Poland