In [1]:
import pandas as pd
import mir_eval
import librosa
import matplotlib.pyplot as plt
import numpy as np

import IPython.display as ipd

import os
import glob

Multiple F0 estimation

We use the Late/Deep model from [1] to predict the F0 of each singer in vocal quartets singing Tebe Poem.

[1] Helena Cuesta, Brian McFee and Emilia Gómez. Multiple F0 Estimation in Vocal Ensembles using Convolutional Neural Networks. In Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference, 2020. Montreal, Canada (virtual), pp. 302-309.
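The model outputs one F0 trajectory per voice, so its predictions form a multi-pitch (ragged) time series. As a minimal sketch of how such predictions could be loaded and scored with mir_eval, assuming hypothetical CSV files est_multif0.csv (model output) and ref_multif0.csv (reference annotation), each with one frame per row formatted as time, f0_1, f0_2, ...:

In [ ]:
# Sketch only: the file names below are placeholders, not files from this dataset.
# Unvoiced frames should contain only the timestamp (no frequency values).
est_times, est_freqs = mir_eval.io.load_ragged_time_series('est_multif0.csv', delimiter=',')
ref_times, ref_freqs = mir_eval.io.load_ragged_time_series('ref_multif0.csv', delimiter=',')

scores = mir_eval.multipitch.evaluate(ref_times, ref_freqs, est_times, est_freqs)
print(scores['Precision'], scores['Recall'], scores['Accuracy'])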

Data subset

We select six performances of Tebe Poem from the DCS: four takes from the Full Choir setting, from which we select four singers (SATB), and two takes from the Quartet A configuration, also with four singers (SATB).
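Before listening, we can list the corresponding mixture files on disk. This is only a small sketch: it assumes the mixtures live in ./audio and follow the DCS_TP_<Setting>_<Take>_mix.wav naming pattern used in the cells below.

In [ ]:
# List the available Tebe Poem mixtures (assumed pattern: DCS_TP_<Setting>_<Take>_mix.wav)
audio_path = './audio'
for f in sorted(glob.glob(os.path.join(audio_path, 'DCS_TP_*_mix.wav'))):
    print(os.path.basename(f))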

Let's first listen to some of the audio files:

In [3]:
audio_path = './audio'

# full choir example
x, _ = librosa.load(os.path.join(audio_path, 'DCS_TP_FullChoir_Take02_mix.wav'), sr=22050)
print("Audio file: DCS_TP_FullChoir_Take02. Mixture of 4 DYN microphones.")
ipd.display(ipd.Audio(x, rate=22050))

# quartet A example
x, _ = librosa.load(os.path.join(audio_path, 'DCS_TP_QuartetA_Take02_mix.wav'), sr=22050)
print("Audio file: DCS_TP_QuartetA_Take02_mix. Stereo mic recording.")
ipd.display(ipd.Audio(x, rate=22050))
Audio file: DCS_TP_FullChoir_Take02. Mixture of 4 DYN microphones.