`diart.blocks.utils`

Module Contents

`Binarize`	Transform a speaker segmentation from the discrete-time domain into a continuous-time speaker segmentation.
`Resample`	Dynamically resample audio chunks.
`AdjustVolume`	Change the volume of an audio chunk.

class diart.blocks.utils.Binarize(threshold, uri=None)

Transform a speaker segmentation from the discrete-time domain into a continuous-time speaker segmentation.

Parameters:

threshold (float) – Probability threshold to determine if a speaker is active at a given frame.
uri (Optional[Text]) – URI of the audio stream. Defaults to no URI.

__call__(segmentation)

Return the continuous-time segmentation corresponding to the discrete-time input segmentation.

Parameters: segmentation (SlidingWindowFeature) – Discrete-time speaker segmentation.
Returns: annotation – Continuous-time speaker segmentation.
Return type: Annotation
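The following is a minimal sketch of the idea behind this call, not diart's implementation. It thresholds per-frame speaker probabilities and merges consecutive active frames into timed segments. The function name `binarize`, the nested-list input, the `frame_step` and `start_time` parameters, the strict `>` comparison, and the use of frame start times as segment boundaries are all assumptions made for illustration; the actual block works on `SlidingWindowFeature` and `Annotation` objects.

```python
from typing import List, Tuple

Segment = Tuple[float, float, int]  # (start seconds, end seconds, speaker index)


def binarize(
    scores: List[List[float]],
    threshold: float,
    frame_step: float,
    start_time: float = 0.0,
) -> List[Segment]:
    """Turn per-frame speaker scores into continuous-time segments.

    scores[t][s] is the activity probability of speaker s at frame t.
    frame_step is the (hypothetical) duration of one frame in seconds.
    """
    if not scores:
        return []
    num_frames = len(scores)
    num_speakers = len(scores[0])
    segments: List[Segment] = []
    for speaker in range(num_speakers):
        run_start = None  # frame index where the current active run began
        for t in range(num_frames):
            active = scores[t][speaker] > threshold
            if active and run_start is None:
                run_start = t  # speaker just became active
            elif not active and run_start is not None:
                # speaker just became inactive: close the segment
                segments.append(
                    (start_time + run_start * frame_step,
                     start_time + t * frame_step,
                     speaker)
                )
                run_start = None
        if run_start is not None:
            # speaker is still active at the end of the chunk
            segments.append(
                (start_time + run_start * frame_step,
                 start_time + num_frames * frame_step,
                 speaker)
            )
    return sorted(segments)


# Four frames of 0.5 s each, two speakers.
example_scores = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.9]]
example_segments = binarize(example_scores, threshold=0.5, frame_step=0.5)
print(example_segments)  # [(0.0, 1.0, 0), (1.0, 2.0, 1)]
```

Raising `threshold` shortens or removes segments, while lowering it makes speakers count as active more often.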

class diart.blocks.utils.Resample(sample_rate, resample_rate, device=None)

Dynamically resample audio chunks.

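Parameters:

sample_rate (int) – Original sample rate of the input audio.
resample_rate (int) – Sample rate of the output audio.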

__call__(waveform)
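To illustrate how `sample_rate` and `resample_rate` relate to the length of the output, here is a small sketch that resamples a mono chunk by linear interpolation. Linear interpolation is swapped in purely for readability and is not claimed to be the method this block uses; the function name `resample_linear` and the plain-list input are hypothetical.

```python
from typing import List


def resample_linear(
    samples: List[float], sample_rate: int, resample_rate: int
) -> List[float]:
    """Resample a mono chunk with linear interpolation (illustrative only)."""
    if sample_rate <= 0 or resample_rate <= 0:
        raise ValueError("sample rates must be positive")
    if not samples:
        return []
    # The output length scales with the ratio of the two rates.
    num_out = round(len(samples) * resample_rate / sample_rate)
    out: List[float] = []
    for i in range(num_out):
        pos = i * sample_rate / resample_rate  # position in input samples
        left = int(pos)
        if left >= len(samples) - 1:
            out.append(samples[-1])  # past the last interval: hold last value
            continue
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[left + 1] * frac)
    return out


print(resample_linear([0.0, 1.0, 2.0, 3.0], sample_rate=4, resample_rate=2))
# [0.0, 2.0]
```

A production resampler would also low-pass filter before downsampling to limit aliasing, which this sketch leaves out.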

class diart.blocks.utils.AdjustVolume(volume_in_db)

Change the volume of an audio chunk.

Note that the output volume may differ from the requested volume in order to avoid saturation.

static get_volumes(waveforms)

Compute the volumes of a set of audio chunks.

Parameters: waveforms (torch.Tensor) – Audio chunks. Shape (batch, samples, channels).
Returns: volumes – Audio chunk volumes per channel. Shape (batch, 1, channels).
Return type: torch.Tensor
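The shape contract above can be sketched in plain Python. This example assumes that "volume" means mean signal power expressed in decibels, which this page does not state; nested lists stand in for the `torch.Tensor` input, and the function name `get_volumes_sketch` is hypothetical.

```python
import math
from typing import List

Chunk = List[List[float]]  # one chunk, indexed as [sample][channel]


def get_volumes_sketch(waveforms: List[Chunk]) -> List[List[List[float]]]:
    """Per-channel volume in dB for each chunk.

    Input is indexed as [batch][sample][channel]; the result is indexed as
    [batch][0][channel], mirroring the (batch, 1, channels) shape.
    Assumption: volume = 10 * log10(mean power).
    """
    volumes = []
    for chunk in waveforms:
        num_samples = len(chunk)
        num_channels = len(chunk[0])
        per_channel = []
        for c in range(num_channels):
            power = sum(chunk[t][c] ** 2 for t in range(num_samples)) / num_samples
            # Digital silence has no finite dB value.
            per_channel.append(10 * math.log10(power) if power > 0 else float("-inf"))
        volumes.append([per_channel])  # keep a singleton "samples" axis
    return volumes


# One chunk, two samples, two channels: a loud channel and a quiet one.
example_volumes = get_volumes_sketch([[[1.0, 0.1], [-1.0, -0.1]]])
print(example_volumes)
```

Under this assumption, the loud channel sits at 0 dB and the quiet one at about -20 dB, since its amplitude is ten times smaller.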

__call__(waveform)
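The remark that the output volume may differ from the target can be illustrated with a small sketch. It applies a gain to reach the target volume, then scales the chunk down if any sample would exceed full scale. The function names, the mono list input, the full-scale limit of 1.0, and the definition of volume as mean power in dB are all assumptions made for illustration, not a description of diart's implementation.

```python
import math
from typing import List


def volume_db(samples: List[float]) -> float:
    """Mean power in dB (assumed definition of volume)."""
    power = sum(x * x for x in samples) / len(samples)
    return 10 * math.log10(power) if power > 0 else float("-inf")


def adjust_volume_sketch(samples: List[float], target_db: float) -> List[float]:
    """Scale a mono chunk toward target_db without exceeding |x| = 1.0."""
    if not samples:
        return []
    current_db = volume_db(samples)
    if current_db == float("-inf"):
        return list(samples)  # silence cannot be amplified to a target
    # Power scales with the square of the amplitude gain, hence the /20.
    gain = 10 ** ((target_db - current_db) / 20)
    adjusted = [gain * x for x in samples]
    peak = max(abs(x) for x in adjusted)
    if peak > 1.0:
        # Avoid saturation: bring the loudest sample back to full scale.
        adjusted = [x / peak for x in adjusted]
    return adjusted


# A peaky chunk: reaching 0 dB would push one sample past full scale.
peaky = adjust_volume_sketch([0.1, -0.9], target_db=0.0)
print(max(abs(x) for x in peaky), volume_db(peaky))
```

Here the peak ends at full scale and the resulting volume stays below the requested 0 dB, which is the trade-off the note above describes.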