Research

My main research focus is the automatic pitch analysis of polyphonic singing recordings, i.e., recordings of vocal ensembles and choirs.

In our work we try to answer the following research questions:

Can we extract the frequencies sung by all singers of an ensemble?
Can we assign each extracted frequency to the corresponding singer?
How can we model unison performances, where the melodic and linguistic content is shared among several singers?
Is it possible to assess the performance of a singer of an ensemble using intonation descriptors?

To answer these questions, we take data-driven approaches, combining machine learning and signal processing techniques for tasks such as multiple F0 estimation, voice assignment, unison singing analysis, and automatic score-informed singing assessment, among others. In the Publications section of this website, you will find publications linked to each of these tasks.
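As a rough illustration of what an intonation descriptor can look like, the sketch below measures how far a singer's F0 deviates from a target score pitch, in cents. It is a minimal example and not the method used in the publications: the function names, the 440 Hz reference tuning, and the convention that unvoiced frames are marked with an F0 of 0 are all assumptions made for illustration.

```python
import math
import statistics

A4_HZ = 440.0  # assumed reference tuning (hypothetical choice)


def hz_to_midi(f0_hz):
    """Convert a frequency in Hz to a fractional MIDI note number."""
    return 69.0 + 12.0 * math.log2(f0_hz / A4_HZ)


def deviation_cents(f0_hz, target_midi):
    """Signed deviation of f0_hz from a target MIDI pitch, in cents."""
    return 100.0 * (hz_to_midi(f0_hz) - target_midi)


def mean_note_deviation(f0_track_hz, target_midi):
    """Mean deviation in cents over the voiced frames of one note.

    Frames with F0 <= 0 are treated as unvoiced and skipped (an assumed
    convention). Returns None if the note has no voiced frames.
    """
    voiced = [f for f in f0_track_hz if f > 0]
    if not voiced:
        return None
    return statistics.fmean(deviation_cents(f, target_midi) for f in voiced)


if __name__ == "__main__":
    # A singer slightly sharp of A4 (MIDI 69), with one unvoiced frame.
    track = [0.0, 442.0, 443.0, 441.5]
    print(mean_note_deviation(track, 69))
```

A positive value means the singer is sharp relative to the score pitch and a negative value means flat. Real recordings would also need handling of note transitions and vibrato, which this sketch leaves out.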

In the Demos section we present some audio examples and analysis results from our multiple F0 estimation models.

As part of the PhD thesis, we worked on two publicly available datasets of polyphonic singing voice: the Choral Singing Dataset and the Dagstuhl ChoirSet. See the Datasets & Software section for more details!

This PhD thesis is partially developed in the scope of the TROMPA (Towards Richer Online Music Public-domain Archives) project, funded by the EU’s H2020 program.

Datasets

Choral Singing Dataset (CSD)

Multi-track dataset of choral music, recorded in collaboration with the Cor Anton Bruckner from Barcelona. It features 16 singers (4 sopranos, 4 altos, 4 tenors, 4 basses), each recorded with an individual microphone while singing together. Each section was recorded separately, but in sync with the others thanks to a reference backing track and a video recording of the choir's conductor.

This dataset contains F0 annotations (automatically extracted and manually corrected) for each individual microphone, note annotations, and synchronized MIDI files.

This dataset was released together with the following paper:

Helena Cuesta, Emilia Gómez, Agustín Martorell and Felipe Loáiciga. “Analysis of Intonation in Unison Choir Singing”. In Proceedings of the 15th International Conference on Music Perception and Cognition / 10th Triennial Conference of the European Society for the Cognitive Sciences of Music. Graz (Austria), July, 2018.

Dagstuhl ChoirSet (DCS)

Multi-track dataset of choral music, recorded during a Dagstuhl Seminar on “Computational Methods for Melody and Voice Processing in Music Recordings”. This dataset is a joint project with the International Audio Laboratories Erlangen (AudioLabs), Germany.

Following a similar configuration to the CSD, the Dagstuhl ChoirSet consists of recordings of a choir of 13 singers, captured using different microphones: larynx, headset, and dynamic microphones for individual singers, and a stereo pair to capture the full choir.

This dataset contains F0 annotations (automatically extracted using two different methods) for each individual microphone, beat annotations, and synchronized score representations.

The dataset was released together with the following paper:

Sebastian Rosenzweig, Helena Cuesta, Christof Weiß, Frank Scherbaum, Emilia Gómez, and Meinard Müller. “Dagstuhl ChoirSet: A Multitrack Dataset for MIR Research on Choral Singing.” In Transactions of the International Society for Music Information Retrieval, 3(1), pp. 98–110, 2020.

Software

Python repository for multiple F0 estimation in vocal ensembles: https://github.com/helenacuesta/multif0-estimation-polyvocals

DCS ToolBox: Python toolbox for the Dagstuhl ChoirSet: https://github.com/helenacuesta/DCStoolbox
