Modal Analysis of Room Impulse Responses Using Subband ESPRIT

This paper describes a modification of the ESPRIT algorithm that can be used to determine the parameters (frequency, decay time, initial magnitude, and initial phase) of a modal reverberator that best match a provided room impulse response. By applying perceptual criteria, we are able to match room impulse responses using a variable number of modes, with an emphasis on high quality at lower mode counts; this allows the synthesis algorithm to scale to different computational environments. A hybrid FIR/modal reverb architecture is also presented, which allows for the efficient modeling of room impulse responses that contain sparse early reflections and dense late reverb. MUSHRA tests comparing analysis/synthesis at various mode counts, for our algorithms and for another state-of-the-art algorithm, are included as well.
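As a rough illustration of what the four modal parameters mean, the sketch below sums exponentially decaying sinusoids, one per mode, to form an impulse response. It covers only the synthesis side, not the Subband ESPRIT analysis, and rests on simple assumptions: the function name `synthesize_modal_ir` and its arguments are hypothetical, and "decay time" is taken here to be the time for a mode's amplitude to fall to 1/e of its initial value, which may differ from the definition used in the paper.

```python
import numpy as np

def synthesize_modal_ir(freqs_hz, decay_times_s, magnitudes, phases_rad,
                        num_samples, sample_rate):
    """Sum one exponentially decaying sinusoid per mode (illustrative only)."""
    t = np.arange(num_samples) / sample_rate             # time axis in seconds
    # Reshape each parameter to a column so it broadcasts against the time axis.
    f = np.asarray(freqs_hz, dtype=float)[:, None]
    tau = np.asarray(decay_times_s, dtype=float)[:, None]  # assumed 1/e time constant
    a = np.asarray(magnitudes, dtype=float)[:, None]
    phi = np.asarray(phases_rad, dtype=float)[:, None]
    modes = a * np.exp(-t / tau) * np.cos(2.0 * np.pi * f * t + phi)
    return modes.sum(axis=0)                             # one row per mode, summed

# Two made-up modes, one second at 48 kHz.
ir = synthesize_modal_ir([100.0, 250.0], [0.5, 0.2], [1.0, 0.5], [0.0, 0.0],
                         num_samples=48000, sample_rate=48000)
```

The cost of this direct form grows with the number of modes times the number of samples, which is one way to see why the mode count matters when scaling to different computational environments.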

Full Paper

Authors

Corey Kereliuk - Reverberate.ca

Russell Wedelich - Eventide Inc.

Woody Herman - Eventide Inc.

Daniel J. Gillespie - Newfangled Audio



Listening Test Audio Files

Below are the audio files used in all of the listening tests presented in the paper. Each listening test was conducted with the free and open-source webMUSHRA tool, available at https://www.audiolabs-erlangen.de/resources/webMUSHRA. The tests were conducted similarly to the MUSHRA standard, although for all of the tests except the last one we chose to forgo the traditional "anchors" (the reference signal low-pass filtered with cutoff frequencies of 3.5 kHz and 5 kHz). Instead, we treated the synthesized impulse responses with lower mode counts (typically 400 and 800, or 500 and 1000) as a sort of pseudo-anchor.
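For readers who want to produce traditional anchors themselves, a low-pass filtered copy of the reference can be sketched as follows. This is not the filter used for the tests, only a minimal windowed-sinc design under stated assumptions: the function name `lowpass_anchor`, the Hamming window, and the tap count of 511 are all hypothetical choices.

```python
import numpy as np

def lowpass_anchor(reference, sample_rate, cutoff_hz=3500.0, num_taps=511):
    """Low-pass filter a reference signal with a windowed-sinc FIR (illustrative)."""
    reference = np.asarray(reference, dtype=float)
    n = np.arange(num_taps) - (num_taps - 1) / 2.0   # tap indices centred on zero
    fc = cutoff_hz / sample_rate                     # cutoff in cycles per sample
    # Ideal low-pass response, tapered by a Hamming window to limit ripple.
    taps = 2.0 * fc * np.sinc(2.0 * fc * n) * np.hamming(num_taps)
    taps /= taps.sum()                               # unity gain at DC
    # mode="same" keeps the input length (for inputs longer than the filter)
    # and removes the filter's group delay by centring the output.
    return np.convolve(reference, taps, mode="same")
```

Calling `lowpass_anchor(reference, 48000, cutoff_hz=3500.0)` and then again with `cutoff_hz=5000.0` would give the two anchors described above, assuming a 48 kHz reference.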


Listening Test 1

In this first test, we compare the results of Subband ESPRIT analysis with the results obtained by Maestre et al. in their paper from the 2017 DAFx conference, "Constrained Pole Optimization for Modal Reverberation". The audio files for that paper are available at https://ccrma.stanford.edu/~esteban/modrev/dafx2017/. The results of this test are shown in Fig. 3 of the paper.

Target --->

Maestre et al. M=400 --->

Subband ESPRIT M=400 --->

Maestre et al. M=800 --->

Subband ESPRIT M=800 --->

Maestre et al. M=1800 --->

Subband ESPRIT M=1800 --->


Listening Test 2

In this test, we compare the results of synthesizing the impulse response with a purely modal representation and with a hybrid FIR/modal representation. In the hybrid method, the first N samples of the impulse response, representing the early reflections, are generated via standard convolution, while the remaining samples, the late field, are generated via modal synthesis. The results of this test are shown in Fig. 4 of the paper. A WARNING FOR YOUR EARS: these are louder than test 1.

Target --->

Modal Only M=500 --->

FIR+Modal M=500 --->

Modal Only M=1500 --->

FIR+Modal M=1500 --->

Modal Only M=2500 --->

FIR+Modal M=2500 --->

Modal Only M=12000 --->
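The hybrid split described for this test can be sketched offline as two parallel paths whose outputs are summed. This is an illustration under stated assumptions rather than the architecture from the paper: the names `split_hybrid` and `render_hybrid` are hypothetical, the switch from FIR to modal output at sample N is a hard cut with no crossfade, and the late path is rendered here by convolving with an already synthesized modal impulse response, whereas a real-time modal reverb would typically run a bank of resonators instead.

```python
import numpy as np

def split_hybrid(target_ir, modal_ir, n_fir):
    """Return (FIR taps for the early part, modal IR with its first n_fir samples muted)."""
    if n_fir < 1:
        raise ValueError("n_fir must be at least 1")
    fir_taps = np.asarray(target_ir, dtype=float)[:n_fir].copy()
    late = np.asarray(modal_ir, dtype=float).copy()
    late[:n_fir] = 0.0          # the FIR path covers these samples (hard cut, assumed)
    return fir_taps, late

def render_hybrid(dry, fir_taps, late):
    """Sum the early (FIR) and late (modal) paths for a dry input signal."""
    dry = np.asarray(dry, dtype=float)
    early_out = np.convolve(dry, fir_taps)   # early reflections via direct convolution
    late_out = np.convolve(dry, late)        # stand-in for the modal reverberator output
    out = np.zeros(max(len(early_out), len(late_out)))
    out[:len(early_out)] += early_out
    out[:len(late_out)] += late_out
    return out

# Toy data: feeding a unit impulse returns the spliced impulse response itself.
fir_taps, late = split_hybrid([1.0, 2.0, 3.0, 4.0, 5.0],
                              [10.0, 20.0, 30.0, 40.0, 50.0], n_fir=2)
spliced = render_hybrid([1.0], fir_taps, late)
```

Because both paths are linear, their sum is equivalent to convolving the input with a single impulse response whose first N samples come from the target and whose remainder comes from the modal synthesis.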


Listening Test 3

This is also a comparison between the Subband ESPRIT method and the work of Maestre et al., although this time the same impulse responses used in the first test are convolved with source material, in this case the human voice.

Target --->

Maestre et al. M=400 --->

Subband ESPRIT M=400 --->

Maestre et al. M=800 --->

Subband ESPRIT M=800 --->

Maestre et al. M=1800 --->

Subband ESPRIT M=1800 --->


Listening Test 4

This is another test comparing the hybrid FIR/modal method with the modal-only method, using a different impulse response, this time convolved with a source signal.

Target --->

Modal Only M=500 --->

FIR+Modal M=500 --->

Modal Only M=1000 --->

FIR+Modal M=1000 --->

Modal Only M=3000 --->

FIR+Modal M=3000 --->