diart.blocks.utils

Module Contents

Classes

Binarize

Transform a speaker segmentation from the discrete-time domain into a continuous-time speaker segmentation.

Resample

Dynamically resample audio chunks.

AdjustVolume

Change the volume of an audio chunk.

class diart.blocks.utils.Binarize(threshold, uri=None)

Transform a speaker segmentation from the discrete-time domain into a continuous-time speaker segmentation.

Parameters:
  • threshold (float) – Probability threshold to determine if a speaker is active at a given frame.

  • uri (Optional[Text]) – URI of the audio stream. Defaults to None.

__call__(segmentation)

Return the continuous-time segmentation corresponding to the discrete-time input segmentation.

Parameters:

segmentation (SlidingWindowFeature) – Discrete-time speaker segmentation.

Returns:

annotation – Continuous-time speaker segmentation.

Return type:

Annotation
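The discrete-to-continuous conversion described above can be illustrated with a minimal, standard-library-only sketch. It is not the diart implementation: the function name `binarize_frames`, the `frame_step` and `start_time` parameters, and the strict `>` comparison are assumptions made for illustration, and it handles a single speaker's scores as a plain list rather than a `SlidingWindowFeature`.

```python
from typing import List, Tuple


def binarize_frames(
    scores: List[float],
    threshold: float,
    frame_step: float,
    start_time: float = 0.0,
) -> List[Tuple[float, float]]:
    """Turn per-frame activation scores for one speaker into (start, end) segments.

    Hypothetical helper: a frame counts as active when its score is strictly
    above the threshold (an assumption, not confirmed by the documentation).
    """
    segments = []
    segment_start = None
    for i, score in enumerate(scores):
        t = start_time + i * frame_step  # start time of frame i in seconds
        active = score > threshold
        if active and segment_start is None:
            # Speaker becomes active: open a new segment
            segment_start = t
        elif not active and segment_start is not None:
            # Speaker becomes inactive: close the open segment
            segments.append((segment_start, t))
            segment_start = None
    if segment_start is not None:
        # Close a segment that is still open at the end of the chunk
        segments.append((segment_start, start_time + len(scores) * frame_step))
    return segments


segments = binarize_frames([0.1, 0.8, 0.9, 0.2, 0.7], threshold=0.5, frame_step=0.5)
print(segments)  # [(0.5, 1.5), (2.0, 2.5)]
```

In the actual class, the result is returned as an `Annotation` rather than a list of tuples.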

class diart.blocks.utils.Resample(sample_rate, resample_rate, device=None)

Dynamically resample audio chunks.

Parameters:
  • sample_rate (int) – Original sample rate of the input audio.

  • resample_rate (int) – Sample rate of the output.

  • device (Optional[torch.device]) – Device on which to perform the resampling. Defaults to None.

__call__(waveform)

Parameters:

waveform (diart.features.TemporalFeatures) – Audio chunk to resample.

Return type:

diart.features.TemporalFeatures
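To make the relationship between `sample_rate` and `resample_rate` concrete, here is a naive linear-interpolation resampler for a mono chunk. This is a sketch for illustration only, not what the class does internally: the function name `resample_linear` is hypothetical, and production resamplers typically apply band-limited filtering to prevent aliasing, which this sketch omits.

```python
from typing import List


def resample_linear(
    samples: List[float], sample_rate: int, resample_rate: int
) -> List[float]:
    """Naive linear-interpolation resampler for a mono chunk (illustration only)."""
    if not samples:
        return []
    # The output length scales with the ratio of the two rates
    n_out = round(len(samples) * resample_rate / sample_rate)
    out = []
    for j in range(n_out):
        # Position of output sample j on the input sample axis
        pos = j * sample_rate / resample_rate
        lo = min(int(pos), len(samples) - 1)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        # Interpolate between the two neighbouring input samples
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


chunk = [0.0, 1.0, 2.0, 3.0]
upsampled = resample_linear(chunk, sample_rate=2, resample_rate=4)
downsampled = resample_linear(chunk, sample_rate=4, resample_rate=2)
print(len(upsampled), len(downsampled))  # 8 2
```

Doubling the rate doubles the number of samples in the chunk, and halving it keeps every other sample, while the chunk duration stays the same.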

class diart.blocks.utils.AdjustVolume(volume_in_db)

Change the volume of an audio chunk.

Note that the output volume may differ from the target volume, because the chunk is kept from saturating.

Parameters:

volume_in_db (float) – Target volume in dB.

static get_volumes(waveforms)

Compute the volumes of a set of audio chunks.

Parameters:

waveforms (torch.Tensor) – Audio chunks. Shape (batch, samples, channels).

Returns:

volumes – Audio chunk volumes per channel. Shape (batch, 1, channels).

Return type:

torch.Tensor
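The shape contract above, (batch, samples, channels) in and (batch, 1, channels) out, can be sketched with nested lists and the standard library. The page does not state the formula, so this sketch assumes a mean-power definition of volume, 10·log10(mean(x²)). The function name `volumes_in_db` is hypothetical.

```python
import math
from typing import List


def volumes_in_db(waveforms: List[List[List[float]]]) -> List[List[List[float]]]:
    """Mean-power volume in dB per chunk and channel.

    Input shape (batch, samples, channels); output shape (batch, 1, channels).
    Assumes volume = 10 * log10(mean(x ** 2)); a fully silent channel would
    raise ValueError because log10(0) is undefined.
    """
    volumes = []
    for chunk in waveforms:
        num_samples = len(chunk)
        num_channels = len(chunk[0])
        per_channel = []
        for c in range(num_channels):
            # Average power of channel c over all samples of the chunk
            power = sum(frame[c] ** 2 for frame in chunk) / num_samples
            per_channel.append(10 * math.log10(power))
        # Keep a singleton "samples" axis so the shape is (batch, 1, channels)
        volumes.append([per_channel])
    return volumes


# Two mono chunks of four samples each: shape (2, 4, 1)
batch = [
    [[1.0], [-1.0], [1.0], [-1.0]],
    [[0.1], [-0.1], [0.1], [-0.1]],
]
volumes = volumes_in_db(batch)
print(volumes)  # roughly [[[0.0]], [[-20.0]]]
```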

__call__(waveform)

Parameters:

waveform (diart.features.TemporalFeatures) – Audio chunk whose volume is adjusted.

Return type:

diart.features.TemporalFeatures
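The following sketch shows one way a chunk could be brought to a target volume, and why the result may miss the target when saturation has to be avoided. It rests on assumptions not stated on this page: volume is taken as mean power in dB, the gain is 10^((target − current) / 20), and a chunk whose peak exceeds 1 is divided by that peak. The function name `adjust_volume` is hypothetical and operates on a mono list of samples.

```python
import math
from typing import List


def adjust_volume(samples: List[float], target_db: float) -> List[float]:
    """Scale a mono chunk toward target_db, then rescale if it would clip."""
    # Current volume as mean power in dB (assumed definition)
    power = sum(s * s for s in samples) / len(samples)
    current_db = 10 * math.log10(power)
    # Amplitude gain needed to move from the current to the target volume
    gain = 10 ** ((target_db - current_db) / 20)
    scaled = [gain * s for s in samples]
    # If the peak exceeds full scale, divide by it so no sample is above 1
    peak = max(abs(s) for s in scaled)
    if peak > 1.0:
        scaled = [s / peak for s in scaled]
    return scaled


chunk = [0.5, -0.5, 0.5, -0.5]          # about -6 dB
quiet = adjust_volume(chunk, -20.0)     # reachable target: amplitude about 0.1
loud = adjust_volume(chunk, 6.0)        # would clip, so the peak is capped at 1
print(max(abs(s) for s in quiet), max(abs(s) for s in loud))
```

In the second call the requested 6 dB would need an amplitude of roughly 2, so the chunk is normalized to a peak of 1 and ends up at about 0 dB instead.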