A new one-third-octave-band noise criteria (NC) rating method is presented. One-third-octave-band NC curves from NC 70 to NC 0 are derived from the existing octave-band curves, adjusted for bandwidth, fit to continuous functions, and redistributed progressively over this space. This synthesis is described in detail. The diffuse-field hearing threshold at low frequencies is also derived. Several NC curves at high frequencies are shown to be below threshold (inaudible). NC ratings are calculated using both the new one-third-octave-band and the legacy octave-band methods for a number of different room noise spectra. The resulting values were found to be similar for both methods. NC ratings using the new method are particularly applicable to critical listening environments with very low noise levels, such as recording studios, scoring stages, and cinema screening rooms, but are shown to also be applicable to environments with higher noise levels. The proposed method better tracks the audibility of noise at low levels as well as the audibility of tonal noise components, while the legacy method as originally conceived generally emphasizes speech interference.
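The bandwidth adjustment mentioned in this abstract can be illustrated with simple arithmetic. The sketch below assumes that the energy of an octave band is spread equally over its three one-third-octave bands, which gives a correction of 10·log10(3), about 4.8 dB. That equal-energy assumption and both function names are illustrative only; the abstract does not state the exact procedure used in the paper.

```python
import math

def octave_to_third_octave_level(octave_level_db):
    """Level of each one-third-octave band, assuming the octave-band
    energy is split equally over three bands (illustrative assumption)."""
    return octave_level_db - 10.0 * math.log10(3.0)

def third_octave_to_octave_level(levels_db):
    """Energy-sum one-third-octave band levels into one octave-band level."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_db))

third = octave_to_third_octave_level(40.0)
print(round(third, 2))                                      # 35.23
print(round(third_octave_to_octave_level([third] * 3), 2))  # 40.0
```

Summing the three adjusted bands on an energy basis recovers the original octave-band level, which is a quick consistency check on the correction.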
This paper introduces a new algorithm for multiposition mixed-phase equalization of slot-loaded loudspeaker responses obtained in the horizontal and vertical planes, using finite impulse response (FIR) filters. The algorithm selects a prototype response that yields the filter that best optimizes a time-domain objective metric for equalization in a given direction. During filter synthesis, the objective metric is a weighted linear combination of pre-ring energy, early and late reflection energy, and decay rate (characterizing impulse response shortening). The results show that the presented mixed-phase multiposition filtering algorithm achieves good equalization along all horizontal directions and for most positions in the vertical direction. Beyond the multiposition filtering capabilities, the algorithm and the metric are suitable for designing mixed-phase filters with low delays, an essential constraint for real-time processing.
This paper describes the design of an analog stomp box capable of reproducing the effect observed when a loudspeaker is rotated during operation, the so-called Leslie effect. When the loudspeaker rotates, two physical effects can be observed. The first is a variation in amplitude, because the speaker is at times aimed at the observer and then, after 180 degrees of rotation, aimed away from the observer. A circuit called Tremolo was designed to recreate this amplitude variation. The second is the Doppler effect, which was obtained with a circuit designed to vary the phase of the signal (Vibrato). The ear perceives the phase variation as a frequency variation. Connecting these two circuits in cascade yields the Pseudo Leslie Effect. The Vibrato and Tremolo circuits receive their control signal from a Low Frequency Oscillator (LFO), which sets the effect frequency. To obtain a high degree of repeatability, which is not simple in analog circuits employing photocouplers, the photocouplers were replaced with VCAs. Photocouplers vary widely in their optical characteristics, so it is hard to obtain the same result in large-scale production; with VCAs this becomes easily achievable. The THAT2180 IC is a Voltage-Controlled Current Source (VCCS) with exponential gain control and low signal distortion. The term Pseudo was used because, in the Leslie effect, the rotation of the loudspeaker gives a lag of 90° between the frequency and amplitude variations. This lag has not been implemented, but the sonic result left nothing to be desired.
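The cascade of Vibrato and Tremolo driven by one LFO can be sketched digitally. This is only a discrete-time analogy of the signal flow described in the abstract, not the analog circuit itself: the function name, the LFO rate, the modulation depth, and the use of a modulated delay line to produce the pitch variation are all assumptions made for illustration. As in the abstract, both modulations share the same LFO phase, so the 90° lag of a real Leslie speaker is not modeled.

```python
import math

def pseudo_leslie(x, fs, lfo_hz=6.0, depth=0.5, max_delay_s=0.002):
    """Vibrato (modulated delay) followed by tremolo (modulated gain).
    All parameter values are illustrative assumptions."""
    y = []
    base = max_delay_s * fs / 2.0  # mean delay in samples
    for n in range(len(x)):
        lfo = math.sin(2.0 * math.pi * lfo_hz * n / fs)
        # Vibrato: read the input at a time-varying fractional delay
        # using linear interpolation between neighbouring samples.
        pos = n - base * (1.0 + lfo)
        i = math.floor(pos)
        frac = pos - i
        a = x[i] if 0 <= i < len(x) else 0.0
        b = x[i + 1] if 0 <= i + 1 < len(x) else 0.0
        v = (1.0 - frac) * a + frac * b
        # Tremolo: gain swings between 1 - depth and 1, in phase with the vibrato.
        gain = 1.0 - depth * 0.5 * (1.0 + lfo)
        y.append(gain * v)
    return y

fs = 48000
tone = [math.sin(2.0 * math.pi * 440.0 * n / fs) for n in range(4800)]
wet = pseudo_leslie(tone, fs)
```

Setting `depth=0.0` and `max_delay_s=0.0` disables both modulations and returns the input unchanged, which is a convenient sanity check.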
In today's music industry and among musicians, digital counterparts, often in the form of software plugins, are increasingly used instead of analog hardware effects to alter sound. The circuits of musical devices often contain nonlinear components (diodes, vacuum tubes, etc.), which complicates their digital modeling. One approach to this problem is the use of state-space methods, such as the Euler or Runge-Kutta methods. To guarantee stability, implicit state-space methods should be used; however, they require the numerical solution of an implicit equation, leading to high computational complexity. Alternatively, the K-method can be used, which avoids the need for numerical solvers if the system meets certain conditions, thus significantly decreasing the computational complexity. Although the K-method was invented almost three decades ago, the authors are not aware of a thorough analysis of its computational complexity in comparison to the more common implicit state-space approaches, such as the backward Euler method. This paper introduces these two methods, explores their advantages, and compares their computational load as a function of model size by using a scalable circuit example.
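To make the cost of an implicit method concrete, the sketch below applies backward Euler to a first-order diode clipper (a resistor feeding a capacitor shunted by two antiparallel diodes) and solves the implicit equation with Newton's method at every sample. The circuit, the component values, and the function name are illustrative assumptions; this is not the scalable circuit used in the paper, and the K-method itself is not shown.

```python
import math

def diode_clipper_backward_euler(u, fs, R=2.2e3, C=10e-9, Is=2.52e-9, Vt=45.3e-3):
    """Backward Euler discretisation of
        C dv/dt = (u - v)/R - 2*Is*sinh(v/Vt),
    with the implicit equation solved by Newton iteration per sample.
    Component values are illustrative assumptions."""
    T = 1.0 / fs
    v = 0.0
    out = []
    for un in u:
        v_prev = v
        for _ in range(50):  # Newton iterations, started from the previous state
            g = v - v_prev - (T / C) * ((un - v) / R - 2.0 * Is * math.sinh(v / Vt))
            dg = 1.0 + (T / C) * (1.0 / R + (2.0 * Is / Vt) * math.cosh(v / Vt))
            step = g / dg
            v -= step
            if abs(step) < 1e-9:
                break
        out.append(v)
    return out

fs = 48000
sine = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(480)]
clipped = diode_clipper_backward_euler(sine, fs)
print(round(max(abs(v) for v in clipped), 2))
```

The inner loop is the source of the computational load discussed in the abstract: each output sample costs several evaluations of the nonlinearity and its derivative, and the cost grows with the number of nonlinear elements.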
Modeling or compensating a given transfer function is a common task in the field of audio. To comply with the characteristics of hearing, filters with logarithmic frequency resolution have been developed, including the Kautz filter, which has orthogonal tap outputs. When the system to be modeled is time-varying, the modeling filter should be tuned to follow the changes in the transfer function. The Least Mean Squares (LMS) and Recursive Least Squares (RLS) algorithms are well-known methods for adaptive filtering; the latter has a faster convergence rate and lower residual error, at the expense of high computational demand. In this paper we propose a simplification of the RLS algorithm that builds on the orthogonality of the tap outputs of Kautz filters, resulting in a significant reduction in computational complexity.
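As a point of reference for the adaptive algorithms named in this abstract, the following is a minimal sketch of the standard LMS update applied to system identification. It uses an ordinary FIR tapped delay line rather than a Kautz filter, and it does not implement the simplified RLS proposed in the paper; the function name, the step size, and the two-tap test system are hypothetical.

```python
import random

def lms_identify(x, d, num_taps, mu):
    """Standard LMS: adapt FIR weights w so that the filter output
    tracks the desired signal d for input x."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps  # most recent input sample first
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))
        e = dn - y  # a priori error
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]
    return w

random.seed(0)
true_taps = [0.5, -0.3]  # hypothetical unknown system
x = [random.uniform(-1.0, 1.0) for _ in range(5000)]
d = [true_taps[0] * x[n] + (true_taps[1] * x[n - 1] if n > 0 else 0.0)
     for n in range(len(x))]
w = lms_identify(x, d, num_taps=2, mu=0.05)
print([round(wi, 3) for wi in w])
```

LMS needs only a few multiply-adds per tap and sample, whereas conventional RLS maintains and updates a full inverse correlation matrix, which is the computational burden the abstract refers to.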
Without relying on audio data as a reference, artificial reverberation models often struggle to accurately simulate the acoustics of real rooms. To address this, we propose a hybrid reverberator derived from a room’s physical properties. Room geometry is extracted via Light Detection and Ranging mapping, enabling the calculation of acoustic reflection paths via the Image Source Method. Frequency-dependent absorption is found by classifying room surface materials with a multi-modal Large Language Model and referencing a database of absorption coefficients. The extracted information is used to parametrise a hybrid reverberator, divided into two components: early reflections, using a tapped delay line, and late reverberation, using a Scattering Feedback Delay Network. Our listening test results show that participants often rate the proposed system as the most natural simulation of a small hallway room. Additionally, we compare the reverberation metrics of the hybrid reverberator and similar state-of-the-art models to those of the small hallway.
Josh Reiss is Professor of Audio Engineering with the Centre for Digital Music at Queen Mary University of London. He has published more than 200 scientific papers (including over 50 in premier journals and 6 best paper awards) and co-authored two books. His research has been featured...
Saturday May 24, 2025 11:00am - 11:20am CEST C2ATM Studio Warsaw, Poland
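The Image Source Method used for the early reflections of the hybrid reverberator described above can be sketched for the simplest case. The example assumes a rectangular (shoebox) room with one corner at the origin, first-order reflections only, and a speed of sound of 343 m/s; the paper instead works from LiDAR-derived geometry, so the room dimensions, positions, and function names here are purely illustrative.

```python
import math

def first_order_images(src, room):
    """Mirror the source across each of the six walls of a shoebox room
    with one corner at the origin and dimensions room = (lx, ly, lz)."""
    x, y, z = src
    lx, ly, lz = room
    return [
        (-x, y, z), (2.0 * lx - x, y, z),
        (x, -y, z), (x, 2.0 * ly - y, z),
        (x, y, -z), (x, y, 2.0 * lz - z),
    ]

def reflection_delays(src, mic, room, c=343.0):
    """Arrival times in seconds of the first-order reflections at the microphone."""
    return sorted(math.dist(img, mic) / c for img in first_order_images(src, room))

src, mic, room = (1.0, 1.0, 1.0), (3.0, 1.0, 1.0), (4.0, 3.0, 2.5)
direct = math.dist(src, mic) / 343.0
delays = reflection_delays(src, mic, room)
print(round(direct * 1000.0, 2), [round(t * 1000.0, 2) for t in delays])
```

Each delay, together with a gain derived from path length and surface absorption, could set one tap of a tapped delay line for the early reflections.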
Various A/D conversion methods exist, among them high-speed 1-bit methods that combine a high sampling frequency with 1-bit quantization. ΔΣ modulation is mainly used; owing to its characteristics, these signals accurately preserve the spectrum of the analog signal while moving quantization noise into higher frequency bands, which allows a high signal-to-noise ratio in the audible range. However, when performing signal processing tasks such as addition and multiplication on high-speed 1-bit signals, it is generally necessary to convert them into multi-bit signals for arithmetic operations. In this paper, we propose a method that processes high-speed 1-bit signals directly, without converting them into multi-bit signals, and use it to realize convolution. In this method, the 1-bit data are reordered so that the operation is carried out without arithmetic operations. The proposed method was verified through simulations using low-pass FIR filters. Frequency-domain analysis showed that the proposed method performed the desired filtering with performance equivalent to conventional multi-bit convolution. This novel approach to directly processing high-speed 1-bit signals suggests potential applications in audio and signal processing.
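For readers unfamiliar with the signal format, the sketch below shows a textbook first-order ΔΣ modulator that turns an oversampled signal into a 1-bit stream. It illustrates only how such a stream arises; the reordering-based convolution proposed in the paper is not reproduced here, and the function name and the constant test input are assumptions.

```python
def delta_sigma_1bit(x):
    """First-order delta-sigma modulator: integrate the difference between
    the input and the fed-back 1-bit output, then quantise to one bit.
    Input samples are expected to lie within (-1, 1)."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for sample in x:
        integrator += sample - feedback
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(1 if feedback > 0.0 else 0)
    return bits

bits = delta_sigma_1bit([0.25] * 10000)
mean = sum(2 * b - 1 for b in bits) / len(bits)  # map {0, 1} to {-1, +1}
print(round(mean, 2))  # 0.25
```

The running average of the bit stream follows the input, while the quantization error is pushed toward high frequencies, which is why the audible band retains a high signal-to-noise ratio.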
Speech denoising is a prominent and widely used task that appears in many common use cases. Although very powerful machine learning methods have been published, most of them are too complex for deployment in everyday and/or low-resource computational environments, such as hand-held devices, smart glasses, hearing aids, and automotive platforms. Knowledge distillation (KD) is a prominent way of alleviating this complexity mismatch, by transferring the learned knowledge from a pre-trained complex model, the teacher, to another, less complex one, the student. KD is implemented by minimizing criteria (e.g. loss functions) between the learned information of the teacher and the corresponding information of the student. Existing KD methods for speech denoising hamper the KD by bounding the learning of the student to the distribution learned by the teacher. Our work focuses on a method that tries to alleviate this issue by exploiting properties of the cosine similarity used as the KD loss function. We use a publicly available dataset and a typical architecture for speech denoising (e.g. UNet) that is tuned for low-resource environments, and conduct repeated experiments with different architectural variations between the teacher and the student, reporting the mean and standard deviation of the metrics for our method and for a state-of-the-art method used as a baseline. Our results show that with our method, smaller speech denoising models, capable of being deployed on small devices and embedded systems, perform better than when conventionally trained or trained with other KD methods.
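A cosine-similarity distillation loss can be sketched in a few lines. The version below treats teacher and student features as flat vectors and uses one minus the cosine similarity as the loss; this form, the function names, and the example vectors are assumptions for illustration, and the abstract does not say which property of the cosine similarity the authors exploit. One property that is easy to verify is scale invariance: the loss depends only on the direction of the student features, not on their magnitude.

```python
import math

def cosine_similarity(a, b, eps=1e-12):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / max(norm_a * norm_b, eps)

def cosine_kd_loss(student, teacher):
    """Hypothetical KD loss: 0 when the features are aligned, 2 when opposed."""
    return 1.0 - cosine_similarity(student, teacher)

teacher_feat = [1.0, 2.0, 3.0]
print(round(cosine_kd_loss([2.0, 4.0, 6.0], teacher_feat), 6))  # 0.0
print(round(cosine_kd_loss([3.0, 2.0, 1.0], teacher_feat), 6))
```

The first call shows that a student whose features are a scaled copy of the teacher's incurs no loss, unlike a mean-squared-error criterion.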
Time-selective techniques that enable the free-field response of a loudspeaker to be measured without an anechoic chamber are presented. The room-size-dependent limitations on low-frequency resolution, which affect both time-selective measurements and anechoic chambers, are discussed. Techniques combining signal processing and appropriate test methods are presented, enabling the complex free-field response of a loudspeaker to be measured throughout the entire audio frequency range without an anechoic chamber. Measurement techniques for both near-field and time-selective far-field measurements are detailed. The results are available in both the time and frequency domains, and ancillary functions derived from these results are easily calculated automatically. A review of the current state of the art is also presented.
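The link between room size and low-frequency resolution can be made concrete with a short calculation. In a time-selective measurement the analysis window must end before the first reflection arrives, and the frequency resolution is roughly the reciprocal of that window length. The sketch assumes that the floor is the nearest reflecting surface, that loudspeaker and microphone are at the same height, and that the speed of sound is 343 m/s; the function names and the example geometry are illustrative.

```python
import math

def reflection_free_time(distance_m, height_m, c=343.0):
    """Time gap in seconds between the direct sound and the floor reflection
    for a source and microphone at equal height above the floor."""
    reflected_path = math.sqrt(distance_m ** 2 + (2.0 * height_m) ** 2)
    return (reflected_path - distance_m) / c

def frequency_resolution(window_s):
    """Approximate frequency resolution in Hz of a window of the given length."""
    return 1.0 / window_s

window = reflection_free_time(distance_m=1.0, height_m=1.5)
print(round(window * 1000.0, 2), "ms window")
print(round(frequency_resolution(window), 1), "Hz resolution")
```

With these example values the window is only a few milliseconds long, so the far-field result is reliable only above a few hundred hertz, which is why near-field measurements are typically used to cover the low-frequency range.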