Saturday, May 24
 

9:00am CEST

Strategies for Obtaining True Quasi-Anechoic Loudspeaker Response Measurements
Saturday May 24, 2025 9:00am - 9:20am CEST
Simple truncation of the reflections in the impulse response of loudspeakers measured in normal rooms will increasingly falsify the response below about 500 Hz for typical situations. Well-known experience and guidance from loudspeaker models allow the determination of the lowest frequency for which truncation suffices. This paper proposes two additional strategies for achieving much improved low-frequency responses that are complementary to the easily-obtained high-frequency response: (a) a previously published nearfield measurement which can be diffractively transformed to a farfield response with appropriate calculations, here presented with greatly simplified computations, and (b) a measurement setup that admits only a single floor reflection which can be iteratively corrected at low frequencies. Theory and examples of each method are presented.
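As a back-of-envelope companion to the abstract, the reflection-free window set by the first floor reflection gives a crude 1/T bound on the lowest usable frequency after truncation. The geometry and the 1/T rule below are illustrative assumptions, not the paper's guidance:

```python
import math

def quasi_anechoic_fmin(x, hs, hm, c=343.0):
    """Lowest usable frequency, as the crude 1/T resolution bound, after
    truncating a measurement just before the first floor reflection.
    x: source-mic horizontal distance (m); hs, hm: source and mic
    heights (m). Illustrative only."""
    direct = math.hypot(x, hs - hm)
    reflected = math.hypot(x, hs + hm)   # image-source path via the floor
    gap = (reflected - direct) / c       # reflection-free window T in seconds
    return 1.0 / gap

# e.g. 1 m spacing with source and mic both 1.5 m above the floor:
print(round(quasi_anechoic_fmin(1.0, 1.5, 1.5)))  # 159
```

In practice the usable limit is higher than this single-cycle bound, which is consistent with the abstract's figure of roughly 500 Hz for typical rooms.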
C1 ATM Studio Warsaw, Poland

9:00am CEST

A new one-third-octave-band noise criteria
Saturday May 24, 2025 9:00am - 9:20am CEST
A new one-third-octave-band noise criteria (NC) rating method is presented. One-third-octave-band NC curves from NC 70 to NC 0 are derived from the existing octave-band curves, adjusted for bandwidth, fit to continuous functions, and redistributed progressively over this space. This synthesis is described in detail. The diffuse field hearing threshold at low frequencies is also derived. Several NC curves at high frequencies are shown to be below threshold (inaudible). NC ratings are calculated using both the new one-third-octave-band and the legacy octave-band methods for a number of different room noise spectra. The resulting values were found to be similar for both methods. NC ratings using the new method are particularly applicable to very low noise level critical listening environments such as recording studios, scoring stages, and cinema screening rooms, but are shown to also be applicable to higher noise level environments. The proposed method better tracks the audibility of noise at low levels, as well as the audibility of tonal noise components, while the legacy method as originally conceived generally emphasizes speech interference.
C2 ATM Studio Warsaw, Poland

9:00am CEST

Creating and distributing immersive audio: from IRCAM Spat to Acoustic Objects
Saturday May 24, 2025 9:00am - 10:00am CEST
In this session, we propose a path for the evolution of immersive audio technology towards accelerating commercial deployment and enabling rich user-end personalization, in any linear or interactive entertainment or business application. We review an example of a perceptually based immersive audio creation platform, IRCAM Spat, which enables plausible, aesthetically motivated immersive music creation and performance, with optional dependency on physical modeling of an acoustic environment. We advocate alleviating ecosystem fragmentation by showing: (a) how a universal device-agnostic immersive audio rendering model can support the creation and distribution of both physics-driven interactive audio experiences and artistically motivated immersive audio content; (b) how object-based immersive linear audio content formats can be extended, via the notion of Acoustic Objects, to support end-user interaction, reverberant object substitution, or 6-DoF navigation.
Speakers

Jean-Marc Jot

Founder and Principal, Virtuel Works LLC
Spatial audio and music technology expert and innovator. Virtuel Works provides audio technology strategy, IP creation and licensing services to help accelerate the development of audio and music spatial computing technology and interoperability solutions.

Thibaut Carpentier

STMS Lab - IRCAM, SU, CNRS, Ministère de la Culture
Thibaut Carpentier studied acoustics at the École centrale and signal processing at Télécom ParisTech, before joining the CNRS as a research engineer. Since 2009, he has been a member of the Acoustic and Cognitive Spaces team in the STMS Lab (Sciences and Technologies of Music…)
C4 ATM Studio Warsaw, Poland

9:00am CEST

Key Technology Briefing 4
Saturday May 24, 2025 9:00am - 10:30am CEST
C3 ATM Studio Warsaw, Poland

9:00am CEST

Tutorial Workshop: The Gentle Art of Dithering
Saturday May 24, 2025 9:00am - 10:45am CEST
This tutorial is for everyone working on the design or production of digital audio and should benefit beginners and experts. We aim to bring this topic to life with several interesting audio demonstrations, and up to date with new insights and some surprising results that may reshape pre-conceptions of high resolution.
In a recent paper, we stressed that transparency (high-resolution audio fidelity) depends on the preservation of micro-sounds – those small details that are easily lost to quantization errors, but which can be perfectly preserved by using the right dither.
It is often asked: ‘Why should I add noise to my recording?’ or, ‘How can adding noise make things clearer?’ This tutorial gives a tour through these questions and presents a call to action: dither should not be looked on as an added noise, but as an essential lubricant that preserves naturalness.

Tutorial topics include: fundamentals of dithering; analysis using histograms and synchronous averaging; what happens if undithered quantizers are cascaded?; ‘washboard distortion’; noise-shaping; additive and subtractive dither; time-domain effects; inside A/D and D/A converters; the perilous world of modern signal chains (including studio workflow and DSP in fixed and floating-point processors) and, finally, audibility analysis.
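A minimal sketch of the core idea, TPDF dither decorrelating the quantization error from the signal, assuming a simple mid-tread quantizer. This is illustrative and not one of the tutorial's own demonstrations:

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(x, step, dither=True):
    """Mid-tread quantizer with optional TPDF dither of +/-1 LSB.
    Proper dither turns signal-correlated quantization error into
    benign noise, a standard result."""
    if dither:
        tpdf = rng.uniform(-0.5, 0.5, x.shape) + rng.uniform(-0.5, 0.5, x.shape)
        x = x + tpdf * step
    return step * np.round(x / step)

step = 2**-15                      # 16-bit LSB for a +/-1 full-scale signal
t = np.arange(48000) / 48000.0
x = 0.25 * np.sin(2 * np.pi * 1000.0 * t)
err_plain = quantize(x, step, dither=False) - x   # correlated "washboard" error
err_dith = quantize(x, step, dither=True) - x     # noise-like error
```

Spectra or synchronous averages of `err_plain` versus `err_dith` show the harmonic distortion of the undithered case and the benign noise floor of the dithered one.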
Hall F ATM Studio Warsaw, Poland

9:00am CEST

Extension of Reflection-Free Region for Loudspeaker Measurements
Saturday May 24, 2025 9:00am - 10:45am CEST
If loudspeaker measurements are carried out elevated over a flat, very reflective surface with no nearby obstacles, the recovered impulse response will contain the direct response and one clean delayed reflection. Many loudspeakers are omnidirectional at low frequencies, having a clear acoustic centre, and this reflection will have a low-frequency behaviour that is essentially the same as its direct response, except the amplitude will be down by a 1/r factor. We derive a simple algorithm that iteratively allows this reflection to be cancelled, so that the response of the loudspeaker will be valid to lower frequencies than before, complementing the usual high-frequency response obtained from simple time-truncation of the impulse response. The method is explained, discussed, and illustrated with a two-way system measured over a flat, sealed driveway surface.
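A toy sketch of the iterative cancellation idea on synthetic data, assuming the simple model h[n] = d[n] + a·d[n−delay]; the paper's actual algorithm restricts the correction to low frequencies and may differ in detail:

```python
import numpy as np

def cancel_floor_reflection(h, delay, atten, n_iter=20):
    """Illustrative fixed-point iteration (not necessarily the paper's
    exact algorithm): if the measurement is h[n] = d[n] + atten*d[n-delay],
    recover the direct response d via d <- h - atten*shift(d, delay)."""
    d = h.copy()
    for _ in range(n_iter):
        shifted = np.zeros_like(d)
        shifted[delay:] = d[:-delay]
        d = h - atten * shifted
    return d

# Synthetic check: build h from a known direct response, then recover it.
rng = np.random.default_rng(1)
d_true = rng.standard_normal(256) * np.exp(-np.arange(256) / 20.0)
h = d_true.copy()
h[64:] += 0.5 * d_true[:-64]      # single floor reflection, 1/r amplitude
d_est = cancel_floor_reflection(h, delay=64, atten=0.5)
```

Because each iteration pushes the residual reflection further out in time, the recursion converges exactly once the cumulative delay exceeds the response length.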
Hall F ATM Studio Warsaw, Poland

9:00am CEST

Impact of Voice-Coil Temperature on Electroacoustic Parameters for Optimized Loudspeaker Enclosure Design in Small-Signal Response
Saturday May 24, 2025 9:00am - 10:45am CEST
The study of electroacoustic parameters in relation to loudspeaker temperature has predominantly focused on large-signal conditions (i.e., high-power audio signals), with limited attention to their behavior under small-signal conditions at equivalent thermal states. This research addresses this gap by investigating the influence of voice-coil temperature on electroacoustic parameters during small-signal operation. The frequency response of the electrical input impedance and the radiated acoustic pressure were measured across different voice-coil temperatures. The results revealed temperature-dependent shifts across all parameters, including the natural frequency in free air (fₛ), mechanical quality factor (Qₘₛ), electrical resistance (Rₑ), electrical inductance (Lₑ), and equivalent compliance volume (Vₐₛ), among others. Specifically, Rₑ and Lₑ increased linearly with temperature, while fₛ decreased and Vₐₛ increased following power-law functions. These changes suggest that thermal effects influence both electrical and mechanical subsystems, potentially amplified by the viscoelastic “creep” effect inherent to loudspeaker suspensions. Finally, simulations of sealed and bandpass enclosures demonstrated noticeable shifts in acoustic performance under thermal variations, emphasizing the importance of considering temperature effects in enclosure design.
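For context, the linear temperature dependence of Rₑ reported above follows the familiar conductor model; the coefficient and values below are illustrative assumptions, not the paper's data:

```python
def voice_coil_resistance(re_20, temp_c, alpha=0.00393):
    """Linear model Re(T) = Re20 * (1 + alpha*(T - 20)) for a copper
    voice coil; alpha is copper's temperature coefficient per degree C.
    Values are illustrative, not the paper's measurements."""
    return re_20 * (1.0 + alpha * (temp_c - 20.0))

# A nominally 6.4-ohm coil heated to 100 C:
print(round(voice_coil_resistance(6.4, 100.0), 2))  # 8.41
```

Even this small-signal shift in Rₑ changes the electrical damping and hence the alignment of a sealed or bandpass enclosure, which is the design concern the abstract raises.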
Hall F ATM Studio Warsaw, Poland

9:00am CEST

Material Characterization and Variability in Loudspeaker Membranes for Acoustic Modeling
Saturday May 24, 2025 9:00am - 10:45am CEST
Finite Element Method (FEM) simulations are vital in the design of loudspeakers, offering a more efficient alternative to traditional trial-and-error approaches. Precise material characterization, however, is essential to ensure that theoretical models align closely with measurements. Variations in material properties, particularly those of a loudspeaker’s membrane, can significantly influence loudspeaker performance. This work aims to establish a methodology for evaluating the variability of loudspeaker membrane materials, specifically cones and surrounds, to better understand each material’s repeatability among samples and, overall, to improve the precision and reliability of loudspeaker simulations.


The study first conducts an in-depth analysis of membrane materials, focusing on their Young’s modulus and density, by utilizing both empirical and simulated data. Subsequently, complete loudspeakers were built and investigated using the membranes studied. A FEM simulation framework is presented, and observations are made on discrepancies between measured and simulated loudspeaker responses at specific frequencies and their relation to material modeling.

The results demonstrated significant alignment between simulations and real-life performance, offering interesting insights into the impact of small changes in material properties on the acoustic response of a loudspeaker. One significant finding was the frequency dependence of the Young’s modulus of the fiberglass used for a cone. Further validation can be achieved by expanding the dataset of measured materials, exploring more materials, and testing under varying conditions such as temperature and humidity. Such insights enable more accurate modeling of loudspeakers and lay the groundwork for exploring novel materials with enhanced acoustic properties, guiding the development of high-performance loudspeakers.
Speakers

Chiara Corsini

R&D engineer, FAITAL [ALPS ALPINE]
Chiara joined Faital S.p.A. in 2018, working as a FEM analyst in the R&D Department. Her research activities are focused on thermal phenomena associated with loudspeaker functioning, and the mechanical behavior of the speaker moving parts. To this goal, she uses FEM and lumped parameter…

Luca Villa

FAITAL [ALPS ALPINE]

Romolo Toppi

FAITAL [ALPS ALPINE]
Hall F ATM Studio Warsaw, Poland

9:00am CEST

Shape Optimization of Waveguides for Improving the Directivity of Soft Dome Tweeters
Saturday May 24, 2025 9:00am - 10:45am CEST
Hall F ATM Studio Warsaw, Poland

9:00am CEST

Supervised Machine Learning for Quality Assurance in Loudspeakers: Time Distortion Analysis
Saturday May 24, 2025 9:00am - 10:45am CEST
Measuring a speaker’s ability to respond to an instantaneous pulse of energy will result in distortion at its output. Factors such as speaker geometry, material properties, equipment error, and the conditions of the environment will create artifacts within the captured data. This paper explores the extraction of time-domain features from these responses, and the training of a predictive model to allow for classification and rapid quality assurance.
Hall F ATM Studio Warsaw, Poland

9:20am CEST

IMPro -- Method for Integrated Microphone Pressure Frequency Response Measurement Using a Probe Microphone
Saturday May 24, 2025 9:20am - 9:40am CEST
We propose a practical method for the measurement of the pressure sensitivity frequency response of a microphone that has been integrated into product mechanics. The method uses a probe microphone to determine the sound pressure entering the inlet of the integrated microphone. We show that the measurements can be performed in a normal office environment as well as in anechoic conditions. The method is validated with measurements of a rigid spherical microphone prototype having analytically defined scattering characteristics. Our results indicate that the proposed method, called IMPro, can effectively measure the pressure sensitivity frequency response of microphones in commercial products, quite independently of the measurement environment.
C1 ATM Studio Warsaw, Poland

9:20am CEST

Mixed-Phase Equalization of Slot-loaded Impulse Responses
Saturday May 24, 2025 9:20am - 9:40am CEST
This paper introduces a new algorithm for multiposition mixed-phase equalization of slot-loaded loudspeaker responses obtained in the horizontal and vertical plane, using finite impulse response (FIR) filters. The algorithm selects a prototype response that yields a filter that best optimizes a time-domain-based objective metric for equalization in a given direction. The objective metric includes a weighted linear combination of pre-ring energy, early and late reflection energy, and decay rate (characterizing impulse response shortening) during filter synthesis. The results show that the presented mixed-phase multiposition filtering algorithm achieves good equalization along all horizontal directions and for most positions in the vertical direction. Beyond the multiposition filtering capabilities, the algorithm and the metric are suitable for designing mixed-phase filters with low delays, an essential constraint for real-time processing.
Speakers

Sunil Bharitkar

Samsung Research America
C2 ATM Studio Warsaw, Poland

9:40am CEST

Non-invasive sound field sensing in enclosures using acousto-optics
Saturday May 24, 2025 9:40am - 10:00am CEST
It is challenging to characterize sound across space, especially in small enclosed volumes, using conventional microphone arrays. This study explores acousto-optic sensing methods to record the sound field throughout an enclosure, including regions close to a source and boundaries. The method uses a laser vibrometer to sense modulations of the refractive index of air caused by the propagating sound pressure waves. Compared to microphone arrays, the sound field can be measured non-invasively and at high resolution, which is particularly attractive at high frequencies, in enclosures of limited size, or under unfavorable mounting conditions for fixtures. We compensate for vibrations that contaminate and conceal the acousto-optic measurements and employ an image source model to also reconstruct early parts of the impulse response. The results demonstrate that acousto-optic measurements can enable the analysis of sound fields in enclosed spaces non-invasively and with high resolution.
C1 ATM Studio Warsaw, Poland

9:40am CEST

Analog Pseudo Leslie Effect with High Grade of Repeatability
Saturday May 24, 2025 9:40am - 10:00am CEST
This paper describes the design of an analog stomp box capable of reproducing the effect observed when a loudspeaker is rotated during operation, the so-called Leslie effect. When the loudspeaker is rotating, two physical effects can be observed. The first is a variation in amplitude: the speaker is sometimes aimed at the observer and then, after 180 degrees of rotation, aimed away from the observer. A tremolo circuit was designed to recreate this amplitude variation. The second is the Doppler effect, which was obtained with a circuit designed to vary the phase of the signal (vibrato); the phase variation is perceived by the ear as a frequency variation. Cascading these two circuits yields the Pseudo Leslie effect. The vibrato and tremolo circuits receive their control signal from a low-frequency oscillator (LFO), which sets the effect rate. To achieve a high degree of repeatability, which is not simple in analog circuits employing photocouplers, the photocoupler devices were replaced with VCAs. Photocouplers show large variations in their optical characteristics, making consistent results hard to obtain in large-scale production; with VCAs this becomes easily achievable. The THAT2180 IC is a voltage-controlled current source (VCCS) with exponential gain control and low signal distortion. The term Pseudo is used because, in the true Leslie effect, the rotation of the loudspeaker produces a 90° lag between the frequency and amplitude variations. This lag has not been implemented, but the sonic result leaves nothing to be desired.
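A digital sketch of the tremolo-plus-vibrato cascade described above. The paper's implementation is analog; this function and its parameters are hypothetical, conceptual stand-ins:

```python
import numpy as np

def pseudo_leslie(x, fs, rate_hz=6.0, trem_depth=0.5, vib_depth_ms=1.0):
    """Conceptual digital model of the cascade: tremolo as LFO amplitude
    modulation, vibrato as an LFO-modulated fractional delay read with
    linear interpolation. Not the paper's analog circuit."""
    n = np.arange(len(x))
    lfo = np.sin(2 * np.pi * rate_hz * n / fs)
    trem = x * (1.0 - trem_depth * 0.5 * (1.0 + lfo))   # amplitude modulation
    max_d = vib_depth_ms * 1e-3 * fs
    delay = 0.5 * max_d * (1.0 + lfo)                   # 0..max_d samples
    idx = n - delay
    i0 = np.clip(np.floor(idx).astype(int), 0, len(x) - 1)
    i1 = np.clip(i0 + 1, 0, len(x) - 1)
    frac = idx - np.floor(idx)
    return (1.0 - frac) * trem[i0] + frac * trem[i1]    # vibrato output

fs = 48000
x = np.sin(2 * np.pi * 440.0 * np.arange(4800) / fs)
y = pseudo_leslie(x, fs)
```

The time-varying delay produces the pitch modulation that the analog vibrato stage approximates with phase variation.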
C2 ATM Studio Warsaw, Poland

10:00am CEST

The Search for a Universal Microphone
Saturday May 24, 2025 10:00am - 10:20am CEST
Recording engineers and producers choose different microphones for different sound sources. It is intriguing that, in the 1950s and 1960s, the variety of available microphones was relatively limited compared to what we have available today. Yet, recordings from that era remain exemplary even now. The microphones used at the time were primarily vacuum tube models.
Through discussions at AES Conventions on improving phantom power supplies and my own experimentation with tube microphones, I began to realize that the defining attribute of their sound might not stem solely from the tubes themselves. Instead, the type of power supply appeared to play a crucial role in shaping the final sound quality.
This hypothesis was confirmed by the introduction of the high-voltage DPA 4003 and 4004 microphones, compared with their phantom-powered counterparts, the 4006 and 4007. In direct comparisons, the microphones with external, more current-efficient power supplies consistently delivered superior sound.
Having worked extensively with numerous AKG C12 and C24 microphones, I identified two pairs, one of C12s and one of C24s, with identical frequency characteristics. For one C12, we designed an entirely new, pure Class A transistor-based circuit with an external power supply.
Reflecting on my 50-plus years as a sound engineer and producer, I sought to determine which microphones were not only the best, but also the most versatile. My analysis led to four key solutions extending beyond the microphones themselves. Since I had already developed an ideal Class A equalizer, I applied the same technology to create four analog equalizers designed to fine-tune the prototype microphone’s frequency characteristics at the power supply level.
C1 ATM Studio Warsaw, Poland

10:00am CEST

Computational Complexity Analysis of the K-Method for Nonlinear Circuit Modeling
Saturday May 24, 2025 10:00am - 10:20am CEST
In today's music industry and among musicians, analog hardware effects are increasingly being replaced by digital counterparts, often in the form of software plugins. The circuits of musical devices often contain nonlinear components (diodes, vacuum tubes, etc.), which complicates their digital modeling. One approach is the use of state-space methods, such as the Euler or Runge-Kutta methods. To guarantee stability, implicit state-space methods should be used; however, they require the numerical solution of an implicit equation, leading to large computational complexity. Alternatively, the K-method can be used, which avoids the need for numerical solvers if the system meets certain conditions, thus significantly decreasing the computational complexity. Although the K-method was invented almost three decades ago, the authors are not aware of a thorough computational complexity analysis of the method in comparison to the more common implicit state-space approaches, such as the backward Euler method. This paper introduces these two methods, explores their advantages, and compares their computational load as a function of model size using a scalable circuit example.
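To illustrate the per-sample implicit solve that motivates the comparison, here is a backward-Euler step with Newton iteration for a generic diode clipper; the component values are illustrative and this is not the paper's benchmark circuit:

```python
import math

def backward_euler_step(v, u, h, R=2200.0, C=10e-9, Is=1e-12, Vt=0.026):
    """One backward-Euler step for a diode clipper,
    C*dv/dt = (u - v)/R - 2*Is*sinh(v/Vt), solved per sample by Newton
    iteration. This is the implicit per-sample cost that the K-method
    avoids; component values are illustrative."""
    x = v  # warm-start Newton from the previous state
    for _ in range(50):
        g = x - v - (h / C) * ((u - x) / R - 2.0 * Is * math.sinh(x / Vt))
        dg = 1.0 + (h / C) * (1.0 / R + (2.0 * Is / Vt) * math.cosh(x / Vt))
        step = g / dg
        x -= step
        if abs(step) < 1e-12:
            break
    return x

fs = 48000.0
v, vmax = 0.0, 0.0
for n in range(200):
    u = 2.0 * math.sin(2.0 * math.pi * 1000.0 * n / fs)
    v = backward_euler_step(v, u, 1.0 / fs)
    vmax = max(vmax, abs(v))
# The 2 V input is softly clipped to roughly +/-0.6 V by the diode pair.
```

For a state vector instead of a scalar, each Newton step requires solving a linear system, which is the per-sample cost the K-method's precomputed nonlinear mapping sidesteps.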
C2 ATM Studio Warsaw, Poland

10:30am CEST

Student Recording Competition 4
Saturday May 24, 2025 10:30am - 11:30am CEST
C4 ATM Studio Warsaw, Poland

10:40am CEST

Immersive recordings in virtual acoustics: differences and similarities between a concert hall and its virtual counterpart
Saturday May 24, 2025 10:40am - 11:00am CEST
Virtual acoustic systems can artificially alter a recording studio's reverberation in real time using spatial room impulse responses captured in different spaces. By recreating another space's acoustic perception, these systems influence various aspects of a musician's performance. Traditional methods involve recording a dry performance and adding reverb in post-production, which may not align with the musician's artistic intent. In contrast, virtual acoustic systems allow simultaneous recording of both the artificial reverb and the musician's interaction using standard recording techniques, just as it would occur in the actual space. This study analyzes immersive recordings of nearly identical musical performances captured in both a real concert hall and McGill University's Immersive Media Lab (Imlab), which features new dedicated virtual acoustics software, and highlights the similarities and differences between the performances recorded in the real space and its virtual counterpart.
Speakers

Gianluca Grazioli

Montreal, Canada, McGill University

Richard King

Professor, McGill University
Richard King is an Educator, Researcher, and a Grammy Award-winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School…
C1 ATM Studio Warsaw, Poland
  Acoustics

10:40am CEST

A simplified RLS algorithm for adaptive Kautz filters
Saturday May 24, 2025 10:40am - 11:00am CEST
Modeling or compensating a given transfer function is a common task in the field of audio. To comply with the characteristics of hearing, logarithmic frequency resolution filters have been developed, including the Kautz filter, which has orthogonal tap outputs. When the system to be modeled is time-varying, the modeling filter should be tuned to follow the changes in the transfer function. The Least Mean Squares (LMS) and Recursive Least Squares (RLS) algorithms are well-known methods for adaptive filtering, where the latter has faster convergence rate with lower remaining error, at the expense of high computational demand. In this paper we propose a simplification to the RLS algorithm, which builds on the orthogonality of the tap outputs of Kautz filters, resulting in a significant reduction in computational complexity.
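A rough sketch of the kind of simplification that orthogonal tap outputs permit: when the tap signals are uncorrelated, the RLS correlation matrix is close to diagonal, so a per-tap power-normalized update can replace the full matrix recursion. This is a conceptual illustration under that assumption, not the authors' algorithm:

```python
import numpy as np

def diagonal_rls_update(w, taps, d, p, lam=0.999):
    """With (near-)orthogonal tap outputs, the RLS correlation matrix is
    nearly diagonal, so each weight gets an independent power-normalized
    update instead of a full matrix inversion. Conceptual only."""
    e = d - w @ taps                          # a priori error
    p = lam * p + (1.0 - lam) * taps**2       # per-tap power estimate
    w = w + (1.0 - lam) * e * taps / (p + 1e-12)
    return w, p

# Identify a 4-tap system driven by independent (orthogonal) tap signals.
rng = np.random.default_rng(0)
w_true = np.array([0.5, -0.3, 0.8, 0.1])
w, p = np.zeros(4), np.ones(4)
for _ in range(20000):
    taps = rng.standard_normal(4)
    w, p = diagonal_rls_update(w, taps, w_true @ taps, p)
# w converges to w_true
```

The full RLS recursion costs O(N²) per sample for N taps; the diagonal form costs O(N), which is the order of saving the orthogonality of Kautz tap outputs makes possible.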
C2 ATM Studio Warsaw, Poland

10:45am CEST

Audio Post in the AI Future
Saturday May 24, 2025 10:45am - 12:15pm CEST
This panel discussion gathers professionals with a broad range of experience across audio post production for film, television and visual media. During the session, the panel will consider questions around how AI technology could be leveraged to solve common problems and pain-points across audio post, and offer opportunities to encourage human creativity, not supplant it.
Speakers

Bradford Swanson

Head of Product, Pro Sound Effects
Bradford is the Head of Product at Pro Sound Effects, an industry leader in licensing audio for media and machine learning. Previously, he worked in product development at iZotope, Nomono, and Sense Labs, and toured for more than 12 years as a musician, production manager, and FOH…
C3 ATM Studio Warsaw, Poland

11:00am CEST

Analysis of the acoustic impulse response of an auditorium
Saturday May 24, 2025 11:00am - 11:20am CEST
The acoustic behaviour of an auditorium is analysed from measurements performed according to the ISO 3382-1 standard. The all-pole analysis of the measured impulse responses confirms the hypothesis that all responses have a common component that can be attributed to room characteristics. Results from a subsequent non-parametric analysis allow the conjecture that the overall response of the acoustic channel between two points may be decomposed into three components: one related to the source position, another related to the room, and a third depending on the position of the receiver.
C1 ATM Studio Warsaw, Poland
  Acoustics

11:00am CEST

An Artificial Reverberator Informed by Room Geometry and Visual Appearance
Saturday May 24, 2025 11:00am - 11:20am CEST
Without relying on audio data as a reference, artificial reverberation models often struggle to accurately simulate the acoustics of real rooms. To address this, we propose a hybrid reverberator derived from a room’s physical properties. Room geometry is extracted via Light Detection and Ranging mapping, enabling the calculation of acoustic reflection paths via the Image Source Method. Frequency-dependent absorption is found by classifying room surface materials with a multi-modal Large Language Model and referencing a database of absorption coefficients. The extracted information is used to parametrise a hybrid reverberator, divided into two components: early reflections, using a tapped delay line, and late reverberation, using a Scattering Feedback Delay Network. Our listening test results show that participants often rate the proposed system as the most natural simulation of a small hallway room. Additionally, we compare the reverberation metrics of the hybrid reverberator and similar state-of-the-art models to those of the small hallway.
Speakers

Joshua Reiss

Professor, Queen Mary University of London
Josh Reiss is Professor of Audio Engineering with the Centre for Digital Music at Queen Mary University of London. He has published more than 200 scientific papers (including over 50 in premier journals and 6 best paper awards) and co-authored two books. His research has been featured…
C2 ATM Studio Warsaw, Poland

11:00am CEST

Loudness of movies for Broadcasting
Saturday May 24, 2025 11:00am - 12:00pm CEST
Broadcasting movies on linear TV or via streaming presents a considerable challenge, especially for highly dynamic content like action films. Normalising such content to the paradigm of "Programme Loudness" may result in dialogue levels much lower than the loudness reference level (-23 LUFS in Europe). On the other hand, normalising to the dialogue level may lead to overly loud sound effects. The EBU Loudness group PLOUD has addressed this issue with the publication of R 128 s4, the fourth supplement to the core recommendation R 128. In order to better understand the challenge, an extensive analysis of 44 dubbed movies (mainly Hollywood mainstream films) was conducted. These analysed films had already been dynamically treated for broadcast delivery by experienced sound engineers. The background of the latest document of the PLOUD group will be presented and the main parameter, LDR (Loudness-to-Dialogue-Ratio), will be introduced. A systematic approach to when and how to proceed with dynamic treatment will be included.
Speakers

Florian Camerer

Senior Sound Engineer, ORF
Hall F ATM Studio Warsaw, Poland

11:00am CEST

Students Project Expo
Saturday May 24, 2025 11:00am - 1:00pm CEST
Hall F ATM Studio Warsaw, Poland

11:20am CEST

Sparsity-based analysis of sound field diffuseness in rooms
Saturday May 24, 2025 11:20am - 11:40am CEST
Sound fields in enclosures comprise a combination of directional and diffuse components. The directional components include the direct path from the source and the early specular reflections. The diffuse part starts with the first early reflection and builds up gradually over time. An ideal diffuse field is achieved when incoherent reflections begin to arrive randomly from all directions. More specifically, a diffuse field is characterized by having uniform energy density (i.e., independence from measurement position) and an isotropic distribution (i.e. random directions of incidence), which results in zero net energy flow (i.e. the net time-averaged intensity is zero). Despite this broad definition, real diffuse sound fields typically exhibit directional characteristics owing to the geometry and the non-uniform absorptive properties of rooms.

Several models and data-driven metrics based on the definition of a diffuse field have been proposed to assess diffuseness. A widely used metric is the _mixing time_, which indicates the transition of the sound field from directional to diffuse and is known to depend, among other factors, on the room geometry.

The concept of mixing time is closely linked to the normalized echo density profile (NEDP), a measure first used to estimate the mixing time in actual rooms (Abel and Huang, 2006), and later to assess the quality of artificial reverberators in terms of their capacity to produce a dense reverberant tail (De Sena et al., 2015). The NEDP is calculated over room impulse responses measured with a pressure probe, evaluating how much the RIR deviates from a normal distribution. Another similar temporal/statistical measure, kurtosis, has been used to similar effect (Jeong, 2016). However, neither the NEDP nor kurtosis provides insights into the directional attributes of diffuse fields. While both approaches rely on statistical reasoning rather than identifying individual reflections, another temporal approach uses matching pursuit to identify individual reflections (Defrance et al., 2009).

Another set of approaches focuses on the net energy flow aspect of the diffuse field, providing an energetic analysis framework either in the time domain (Del Galdo et al., 2012) or in the time-frequency domain (Ahonen and Pulkki, 2009). These approaches rely on calculating the time-averaged active intensity, either using intensity probes or first- and higher-order Ambisonics microphones, where a pseudo-intensity-based diffuseness is computed (Götz et al., 2015). The coherence of spherical harmonic decompositions of the sound field has also been used to estimate diffuseness (Epain and Jin, 2016). Beamforming methods have likewise been applied to assess the directional properties of sound fields and to illustrate how real diffuse fields deviate from the ideal (Gover et al., 2004).

We propose a spatio-spectro-temporal (SST) sound field analysis approach based on a sparse plane-wave decomposition of sound fields captured using a higher-order Ambisonics microphone. The proposed approach has the advantage of analyzing the progression of the sound field’s diffuseness in both temporal and spatial dimensions. Several derivative metrics are introduced to assess temporal, spectro-temporal, and spatio-temporal characteristics of the diffuse field, including sparsity, diversity, and isotropy. We define the room sparsity profile (RSP), room sparsity relief (RSR), and room sparsity profile diversity (RSPD) as temporal, spectro-temporal, and spatio-temporal measures of diffuse fields, respectively. The relationship of this new approach to existing diffuseness measures is discussed and supported by experimental comparisons using 4th- and 6th-order acoustic impulse responses, demonstrating the dependence of the new derivative measures on measurement position. We conclude by considering the limitations and applicability of the proposed approach.
Saturday May 24, 2025 11:20am - 11:40am CEST
C1 ATM Studio Warsaw, Poland
  Acoustics

11:20am CEST

Direct convolution of high-speed 1 bit signal and finite impulse response
Saturday May 24, 2025 11:20am - 11:40am CEST
Various AD conversion methods exist, and high-speed 1-bit methods using a high sampling frequency and 1-bit quantization have been proposed. ΔΣ modulation is mainly used; owing to its characteristics, these signals accurately preserve the spectrum of the analog signal while moving quantization noise into higher frequency bands, which allows a high signal-to-noise ratio in the audible range. However, when performing signal processing tasks such as addition and multiplication on high-speed 1-bit signals, it is generally necessary to convert them into multi-bit signals for arithmetic operations. In this paper, we propose a method that processes high-speed 1-bit signals directly, without converting them into multi-bit signals, and realizes convolution. In this method, 1-bit data are reordered so that the convolution is achieved without arithmetic operations. The proposed method was verified through simulations using low-pass FIR filters. Frequency-domain analysis showed that the proposed method achieved performance equivalent to conventional multi-bit convolution, successfully performing the desired filtering. We present this novel approach to directly processing high-speed 1-bit signals and suggest potential applications in audio and signal processing.
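The abstract's reordering scheme is not spelled out here, but the arithmetic saving it targets is easy to illustrate: for a ±1 signal, every multiply in the convolution sum collapses to a sign-conditional add or subtract. A minimal sketch of that idea (illustrative only, not the authors' method):

```python
def conv_1bit(bits, fir):
    """Convolve a 1-bit (+1/-1) signal with an FIR filter using only
    additions and subtractions: since bits[n] is +/-1, each tap
    contribution bits[n] * fir[k] reduces to +fir[k] or -fir[k],
    so no multiplications are needed."""
    y = [0.0] * (len(bits) + len(fir) - 1)
    for n, b in enumerate(bits):
        for k, h in enumerate(fir):
            y[n + k] += h if b > 0 else -h
    return y
```

The result is numerically identical to a conventional multi-bit convolution of the same ±1 sequence.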
Speakers
Saturday May 24, 2025 11:20am - 11:40am CEST
C2 ATM Studio Warsaw, Poland

11:40am CEST

Evaluating room acoustic parameters using ambisonic technology: a case study of a medium-sized recording studio
Saturday May 24, 2025 11:40am - 12:00pm CEST
Ambisonic technology has recently gained popularity in room acoustic measurements due to its ability to capture both general and spatial characteristics of a sound field using a single microphone. Conventional measurement techniques conducted in accordance with the ISO 3382-1 standard, on the other hand, require multiple transducers, which results in a more time-consuming procedure. This study presents a case study on the use of ambisonic technology to evaluate the room acoustic parameters of a medium-sized recording studio.
Two ambisonic microphones, a first-order Sennheiser Ambeo and a third-order Zylia ZM1-3E, were used to record spatial impulse responses in 30 combinations of sound source and receiver positions in the recording studio. Key acoustic parameters, including Reverberation Time (T30), Early Decay Time (EDT) and Clarity (C80), were calculated using spatial decomposition methods. The Interaural Cross-Correlation Coefficient (IACC) was derived from binaural impulse responses obtained using the MagLS binauralization method. The results were compared with conventional omnidirectional and binaural microphone measurements to assess the accuracy and advantages of ambisonic technology. The findings show that T30, EDT, C50 and IACC values measured with the use of ambisonic microphones are consistent with those obtained from conventional measurements.
This study demonstrates the effectiveness of ambisonic technology in room acoustic measurements by capturing a comprehensive set of parameters with a single microphone. Additionally, it enables the estimation of reflection vectors, offering further insights into spatial acoustics.
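For reference, decay parameters such as T30 and EDT as defined in ISO 3382-1 are derived from the backward-integrated (Schroeder) energy decay curve of an impulse response. A minimal sketch of that computation (function names are illustrative, not from the study):

```python
import numpy as np


def schroeder_decay_db(rir):
    """Schroeder backward-integrated energy decay curve (EDC) in dB,
    normalized to 0 dB at t = 0."""
    energy = np.cumsum(rir[::-1] ** 2)[::-1]
    return 10.0 * np.log10(energy / energy[0])


def decay_time(edc_db, fs, lo, hi, target=60.0):
    """Fit a line to the EDC between `lo` and `hi` dB (e.g. -5/-35 dB
    for T30, 0/-10 dB for EDT) and extrapolate to a `target` dB decay."""
    idx = np.where((edc_db <= lo) & (edc_db >= hi))[0]
    t = idx / fs
    slope, _ = np.polyfit(t, edc_db[idx], 1)  # dB per second (negative)
    return -target / slope
```

With an ambisonic capture, the same routine is applied to the omnidirectional (W) channel or to beamformed signals obtained from the spatial decomposition.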
Saturday May 24, 2025 11:40am - 12:00pm CEST
C1 ATM Studio Warsaw, Poland
  Acoustics

11:40am CEST

Knowledge Distillation for Speech Denoising by Latent Representation Alignment with Cosine Distance
Saturday May 24, 2025 11:40am - 12:00pm CEST
Speech denoising is a prominent and widely utilized task, appearing in many common use cases. Although very powerful machine learning methods have been published, most are too complex for deployment in everyday and/or low-resource computational environments, like hand-held devices, smart glasses, hearing aids, and automotive platforms. Knowledge distillation (KD) is a prominent way of alleviating this complexity mismatch by transferring the learned knowledge from a pre-trained complex model, the teacher, to a less complex one, the student. KD is implemented using minimization criteria (e.g., loss functions) between the learned information of the teacher and the corresponding information of the student. Existing KD methods for speech denoising hamper the distillation by bounding the student's learning to the distribution learned by the teacher. Our work focuses on a method that alleviates this issue by exploiting properties of the cosine similarity used as the KD loss function. We use a publicly available dataset and a typical speech denoising architecture (a UNet) tuned for low-resource environments, and we conduct repeated experiments with different architectural variations between teacher and student, reporting the mean and standard deviation of metrics for our method and for a state-of-the-art method used as a baseline. Our results show that our method produces smaller speech denoising models, deployable on small devices and embedded systems, that perform better than models trained typically or with other KD methods.
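The key property the abstract exploits is that cosine similarity is scale-invariant: the student is pushed to match the direction of the teacher's latent vectors rather than their full distribution. A framework-agnostic NumPy sketch of such a loss (a real training setup would use an autograd framework; the function name is illustrative):

```python
import numpy as np


def cosine_kd_loss(student_feats, teacher_feats, eps=1e-8):
    """Cosine-distance distillation loss between latent representations:
    1 - cos(s, t) per feature vector, averaged over all batch/time
    positions. Because both sides are length-normalized, the loss is
    invariant to the scale of either model's features."""
    s = student_feats / (np.linalg.norm(student_feats, axis=-1, keepdims=True) + eps)
    t = teacher_feats / (np.linalg.norm(teacher_feats, axis=-1, keepdims=True) + eps)
    return float(np.mean(1.0 - np.sum(s * t, axis=-1)))
```

The loss is 0 for perfectly aligned features (even if the student's features are scaled differently) and reaches 2 for anti-aligned ones.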
Saturday May 24, 2025 11:40am - 12:00pm CEST
C2 ATM Studio Warsaw, Poland

11:45am CEST

The Next Generation of Immersive Capture and Reproduction: Sessions from McGill University’s Virtual Acoustic Laboratory
Saturday May 24, 2025 11:45am - 12:45pm CEST
In this workshop, we present the next generation of immersive audio capture and reproduction through virtual acoustics. The aural room, whether real or generated, brings together the listener and the sound source in a way that fulfills the listener’s perceptual needs—like increasing the impression of orientation, presence, and envelopment—and creates aesthetic experiences by elaborating on the timbre and phrasing of the music.
Members of the Immersive Audio Lab (IMLAB) at McGill University will discuss recent forays in creating and capturing aural spaces, using technology ranging from virtual acoustics to Higher Order Ambisonics (HOA) microphones. Descriptions of capture methods, including microphone techniques and experiments, will be accompanied by 7.1.4 audio playback demos.
From our studio sessions, we will showcase updates to our Virtual Acoustics Technology (VAT) system, which uses active acoustics in conjunction with 15 omnidirectional and 32 bidirectional speakers to transport musicians into simulated environments. Workshop elements will include a new methodology for creating dynamically changing interactive environments for musicians and listeners, ways to create focus and “mix” sound sources within the virtual room, experimental capture techniques for active acoustic environments, and real-time electronics spatialization in the tracking room via the VAT system.
On location, lab members have been experimenting with hybridized HOA capture systems for large-scale musical scenes. We will showcase multi-point HOA recording techniques to best capture direct sound and room reverberance, and excerpts that compare HOA to traditional channel-based capture systems.
Speakers

Kathleen Zhang

McGill University

Aybar Aydin

PhD Candidate, McGill University

Michail Oikonomidis

Doctoral student, McGill University
Michael Ikonomidis (Michail Oikonomidis) is an accomplished audio engineer and PhD student in Sound Recording at McGill University, specializing in immersive audio, high-channel-count orchestral recordings and scoring sessions. With a diverse background in music production, live sound... Read More →

Richard King

Professor, McGill University
Richard King is an Educator, Researcher, and a Grammy Award winning recording engineer. Richard has garnered Grammy Awards in various fields including Best Engineered Album in both the Classical and Non-Classical categories. Richard is an Associate Professor at the Schulich School... Read More →
Saturday May 24, 2025 11:45am - 12:45pm CEST
C4 ATM Studio Warsaw, Poland

12:15pm CEST

Workshop: How to Build a World-Class Brand in 24 Hours
Saturday May 24, 2025 12:15pm - 1:15pm CEST
In this dynamic, hackathon-style session, participants will rapidly develop a world-class brand strategy for their company using cutting-edge AI tools and collaborative exercises. Attendees will leave with an actionable blueprint they can implement immediately in their businesses or projects.

Format: 90-minute session
Key Takeaways:
Master the essentials of brand strategy and its impact on content creation and sales
Engage in hands-on exercises to develop a brand strategy in real time
Learn how AI tools can accelerate brand positioning
Speakers
Saturday May 24, 2025 12:15pm - 1:15pm CEST
C1 ATM Studio Warsaw, Poland

12:15pm CEST

Simulated Free-field Measurements
Saturday May 24, 2025 12:15pm - 1:45pm CEST
Time-selective techniques are presented that enable measurements of the free-field response of a loudspeaker without the need for an anechoic chamber. The room-size-dependent low-frequency resolution limitations of both time-selective measurements and anechoic chambers are discussed. Techniques combining signal processing and appropriate test methods are presented, enabling measurement of the complex free-field response of a loudspeaker throughout the entire audio frequency range without an anechoic chamber. Measurement techniques for both near-field and time-selective far-field measurements are detailed. Results in both the time and frequency domains are available, and ancillary functions derived from these results are easily calculated automatically. A review of the current state of the art is also presented.
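The core of any time-selective technique is gating the impulse response just before the first room reflection arrives, which trades low-frequency resolution for reflection-free data. A minimal sketch of that gating step (parameter and function names are illustrative):

```python
import numpy as np


def quasi_anechoic_response(ir, fs, reflection_time, nfft=8192):
    """Truncate an in-room impulse response just before the first
    reflection and return the simulated free-field magnitude response.
    Frequency resolution is limited to roughly 1/reflection_time, so
    the result is only valid above that frequency."""
    n = int(reflection_time * fs)
    win = np.ones(n)
    fade = n // 8
    win[-fade:] = 0.5 * (1 + np.cos(np.linspace(0, np.pi, fade)))  # half-Hann fade-out
    h = ir[:n] * win
    H = np.fft.rfft(h, nfft)
    f = np.fft.rfftfreq(nfft, 1 / fs)
    return f, 20 * np.log10(np.abs(H) + 1e-12)
```

For example, with a first reflection at 5 ms, a 4 ms gate removes it entirely but limits valid resolution to about 250 Hz; below that, near-field or other complementary methods are needed.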
Saturday May 24, 2025 12:15pm - 1:45pm CEST
C2 ATM Studio Warsaw, Poland

12:30pm CEST

What was it about the Dolby Noise Reduction System that made it successful?
Saturday May 24, 2025 12:30pm - 1:30pm CEST
Warsaw tutorial

Love it or hate it, the Dolby noise reduction system had a significant impact on sound recording practice. Even nowadays, in our digital audio workstation world, Dolby noise reduction units are used as effects processors.
However, when the system first came out in the 1960s, there were other noise reduction systems, but the Dolby “Model A” noise reduction system, and its successors, still became dominant. What was it about the Dolby system that made it so successful?
This tutorial will look in some detail into the inner workings of the Dolby A Noise reduction system to see how this came about.
Dolby made some key technical decisions in his design that worked with the technology of the day to provide noise reduction that did minimal harm to the audio signal and minimised any audible effects of the noise reduction processing. We will examine these key decisions and show how they fitted with the technology and electronic components of the time.
The tutorial will start with a basic introduction to complementary noise reduction systems and their pros and cons. We will then go on to examine the Dolby system in more detail, including looking at some of the circuitry.
In particular, we will discuss:
1. The principle of least treatment.
2. Side chain processing.
3. Psychoacoustic elements.
4. What Dolby could have done better.
Although the talk will concentrate on the Model 301 processor, if time permits, we will look at the differences between it, and the later Cat 22 version.
The tutorial will be accessible to everyone; you will not need to be an electronic engineer to understand the principles behind this seminal piece of audio engineering history.
Speakers

Jamie Angus-Whiteoak

Emeritus Professor/Consultant, University of Salford/JASA Consultancy
Jamie Angus-Whiteoak is Emeritus Professor of Audio Technology at Salford University. Her interest in audio was crystallized at age 11 when she visited the WOR studios in NYC on a school trip in 1967. After this she was hooked, and spent much of her free time studying audio, radio... Read More →
Saturday May 24, 2025 12:30pm - 1:30pm CEST
C3 ATM Studio Warsaw, Poland

1:00pm CEST

Immersive Listening
Saturday May 24, 2025 1:00pm - 2:45pm CEST
Saturday May 24, 2025 1:00pm - 2:45pm CEST
C4 ATM Studio Warsaw, Poland

1:30pm CEST

Key Technology Briefing 5
Saturday May 24, 2025 1:30pm - 2:45pm CEST
Saturday May 24, 2025 1:30pm - 2:45pm CEST
C1 ATM Studio Warsaw, Poland

1:45pm CEST

Be A Leader!
Saturday May 24, 2025 1:45pm - 2:45pm CEST
Have you ever wondered how AES works? Let's meet up and talk about the benefits of volunteering and the path to leadership in AES! You could be our next Chair, Vice President, or even AES President!
Speakers

Leslie Gaston-Bird

President, Audio Engineering Society
Dr. Leslie Gaston-Bird (AMPS, MPSE) is President of the Audio Engineering Society and author of the books "Women in Audio", part of the AES Presents series and published by Focal Press (Routledge); and Math for Audio Majors (A-R Editions). She is a voting member of the Recording Academy... Read More →
Saturday May 24, 2025 1:45pm - 2:45pm CEST
Hall F ATM Studio Warsaw, Poland

1:45pm CEST

A century of dynamic loudspeakers
Saturday May 24, 2025 1:45pm - 2:45pm CEST
This tutorial is based on a review paper being submitted to the Journal of the Audio Engineering Society.

2025 marks the centennial of the commercial introduction of the modern dynamic direct radiating loudspeaker, the Radiola 104, and the publication of Kellogg and Rice’s paper describing its design. The tutorial outlines the developments leading to the first dynamic loudspeakers and their subsequent evolution. The presentation focuses on direct radiating loudspeakers, although the parallel development of horn technology is acknowledged.

The roots of the dynamic loudspeaker trace back to the moving coil linear actuator patented by Werner Siemens in 1877. The first audio-related application was Sir Oliver Lodge’s 1896 mechanical telephone signal amplifier, or “repeater.” The first moving coil loudspeaker was the Magnavox by Peter Jensen in 1915, but its diaphragm assembly resembled earlier electromagnetic loudspeakers. The Blatthaller loudspeakers by Schottky and Gerlach in the 1920s are another example of an early use of the dynamic concept.

It is worth examining the success factors of the dynamic loudspeaker, which created a market for quality sound reproduction and practically replaced the earlier electromagnetic designs by the end of the 1920s. The first dynamic loudspeakers were heavy, expensive, and inefficient, but their sound quality could not be matched by any other technology available at the time. The direct radiating dynamic loudspeaker is also one of the most scalable technologies in engineering, both in size and in production volume. The dynamic loudspeaker is also quite forgiving in terms of operating voltage and current, and, importantly, its sound can be adjusted through enclosure design.

The breadth of the applications of dynamic loudspeakers would not have been possible without the developments in magnet materials. Early dynamic loudspeakers used electromagnets for air-gap flux, requiring constant high power (e.g., the Radiola 104’s field coil consumed 8 W, while peak audio power was about 1 W). Some manufacturers attempted steel permanent magnets, but these were bulky. A major breakthrough came with AlNiCo (aluminum-nickel-cobalt) magnets, first developed in Japan in the 1930s and commercialized in the U.S. during World War II. AlNiCo enabled smaller, lighter, and more efficient designs. However, a cobalt supply crisis in 1970 led to the widespread adoption of ferrite (ceramic) magnets, which were heavier but cost-effective. The next advancement, especially in small drivers, was rare-earth magnets, introduced in the early 1980s. However, a neodymium supply crisis in the 2000s led to a partial return to ferrite magnets.

One of the focus points of the industry’s attention has been the cone and surround materials for the loudspeaker. The first units already employed a relatively lossy cardboard-type material. Although plastic and foam materials were attempted in loudspeakers from the 1950s onwards, plastic cones for larger loudspeakers were successfully launched only in the late 1970s. Metal cones, honeycomb diaphragms, and coatings to improve stiffness have all brought more variety to the loudspeaker market, enabled by the significant improvement of numerical loudspeaker modelling and measurement methods, which also entered practical use during the 1970s.

A detail that was somewhat different in the first loudspeakers compared to modern designs was the centering mechanism. The Radiola centering mechanism was complex, and soon simpler flat supports (giving the name “spider”) were developed. The modern concentrically corrugated centering system was developed in the early 1930s by Walter Vollman at the German Gravor loudspeaker company, and this design has remained the standard solution with little variation.

The limitations of the high-frequency reproduction of the early drivers led to improvements in driver design. The high-frequency performance of cone drivers was improved by introducing lossy or compliant areas that restricted the radiation of high frequencies to the apex of the cone, and by adding a double cone. The introduction of FM radio and improved records created the need for loudspeakers with more extended treble reproduction. The first separate tweeter units were horn loudspeakers, and the first direct radiating tweeters were scaled-down cone drivers, but the late 1950s saw the introduction of modern tweeters in which the voice coil is outside the radiating diaphragm.

The latest paradigm shift in dynamic loudspeakers is the microspeaker, ubiquitous in portable devices. By manufacturing numbers, microspeakers are the largest class of dynamic loudspeakers, presenting unique structural, engineering, and manufacturing challenges. Their rapid evolution from the 1980s onwards includes the introduction of rare earth magnets, diaphragm forming improvements, and a departure from the cylindrical form factor of traditional loudspeakers. The next phase in loudspeaker miniaturization is emerging, with the first MEMS-based dynamic microspeakers now entering the market.
Speakers

Juha Backman

AAC Technologies
Saturday May 24, 2025 1:45pm - 2:45pm CEST
C3 ATM Studio Warsaw, Poland

3:00pm CEST

Convention Closing Ceremony
Saturday May 24, 2025 3:00pm - 4:30pm CEST
Saturday May 24, 2025 3:00pm - 4:30pm CEST
Hall F ATM Studio Warsaw, Poland
 


  • Acoustic Transducers & Measurements
  • Acoustics
  • Acoustics of large performance or rehearsal spaces
  • Acoustics of smaller rooms
  • Acoustics of smaller rooms Room acoustic solutions and materials
  • Acoustics & Sig. Processing
  • AI
  • AI & Machine Audition
  • Analysis and synthesis of sound
  • Archiving and restoration
  • Audio and music information retrieval
  • Audio Applications
  • Audio coding and compression
  • Audio effects
  • Audio Effects & Signal Processing
  • Audio for mobile and handheld devices
  • Audio for virtual/augmented reality environments
  • Audio formats
  • Audio in Education
  • Audio perception
  • Audio quality
  • Auditory display and sonification
  • Automotive Audio
  • Automotive Audio & Perception
  • Digital broadcasting
  • Electronic dance music
  • Electronic instrument design & applications
  • Evaluation of spatial audio
  • Forensic audio
  • Game Audio
  • Generative AI for speech and audio
  • Hearing Loss Protection and Enhancement
  • High resolution audio
  • Hip-Hop/R&B
  • Impact of room acoustics on immersive audio
  • Instrumentation and measurement
  • Interaction of transducers and the room
  • Interactive sound
  • Listening tests and evaluation
  • Live event and stage audio
  • Loudspeakers and headphones
  • Machine Audition
  • Microphones converters and amplifiers
  • Microphones converters and amplifiers Mixing remixing and mastering
  • Mixing remixing and mastering
  • Multichannel and spatial audio
  • Music and speech signal processing
  • Musical instrument design
  • Networked Internet and remote audio
  • New audio interfaces
  • Perception & Listening Tests
  • Protocols and data formats
  • Psychoacoustics
  • Room acoustics and perception
  • Sound design and reinforcement
  • Sound design/acoustic simulation of immersive audio environments
  • Spatial Audio
  • Spatial audio applications
  • Speech intelligibility
  • Studio recording techniques
  • Transducers & Measurements
  • Wireless and wearable audio