import pandas as pd
import mir_eval
import librosa
import matplotlib.pyplot as plt
import numpy as np
import IPython.display as ipd
import os
import glob
We use the Late/Deep model from [1] to predict the F0 of each singer from vocal quartets singing Tebe Poem.
[1] Helena Cuesta, Brian McFee and Emilia Gómez. Multiple F0 Estimation in Vocal Ensembles using Convolutional Neural Networks. In Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference, 2020. Montreal, Canada (virtual), pp. 302-309.
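As a preview of the evaluation step, the snippet below is a minimal sketch of how a set of multi-F0 estimates could be parsed and scored with mir_eval.multipitch (imported above). The file names and the CSV layout (one row per frame: a timestamp followed by a variable number of F0 values in Hz, zeros meaning unvoiced) are assumptions made for illustration, not the model's documented output format.
# Sketch only: file names and CSV layout are assumptions, not the model's documented output.
def load_multif0(csv_path):
    """Parse a multi-F0 CSV into the (times, list of frequency arrays) format used by mir_eval."""
    times, freqs = [], []
    with open(csv_path) as f:
        for line in f:
            values = [float(v) for v in line.strip().split(',') if v]
            times.append(values[0])
            # keep only the voiced (non-zero) F0 values of this frame
            freqs.append(np.array([v for v in values[1:] if v > 0]))
    return np.array(times), freqs

# hypothetical file names, for illustration only
est_times, est_freqs = load_multif0('DCS_TP_QuartetA_Take02_mix_multif0.csv')
ref_times, ref_freqs = load_multif0('DCS_TP_QuartetA_Take02_ref.csv')
scores = mir_eval.multipitch.evaluate(ref_times, ref_freqs, est_times, est_freqs)
print(scores['Precision'], scores['Recall'], scores['Accuracy'])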
We select six performances of Tebe Poem from the Dagstuhl ChoirSet (DCS): four takes from the Full Choir setting, from which we select four singers (SATB), and two takes from the Quartet A configuration, which also comprises four singers (SATB).
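Assuming the selected mixtures live in ./audio and follow the DCS_TP_<Setting>_<Take>_mix.wav naming used below, they can be listed with glob:
# list the selected mixtures (assumes the naming pattern shown in the cells below)
audio_files = sorted(glob.glob(os.path.join('./audio', 'DCS_TP_*_mix.wav')))
for f in audio_files:
    print(os.path.basename(f))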
Let's first listen to some of the audio files:
audio_path = './audio'
# full choir example
x, _ = librosa.load(os.path.join(audio_path, 'DCS_TP_FullChoir_Take02_mix.wav'), sr=22050)
print("Audio file: DCS_TP_FullChoir_Take02. Mixture of 4 DYN microphones.")
ipd.display(ipd.Audio(x, rate=22050))
# quartet A example
x, _ = librosa.load(os.path.join(audio_path, 'DCS_TP_QuartetA_Take02_mix.wav'), sr=22050)
print("Audio file: DCS_TP_QuartetA_Take02_mix. Stereo microphone recording.")
ipd.display(ipd.Audio(x, rate=22050))