diart.blocks.utils
Module Contents
Classes
Binarize | Transform a speaker segmentation from the discrete-time domain into a continuous-time speaker segmentation.
Resample | Dynamically resample audio chunks.
AdjustVolume | Change the volume of an audio chunk.
- class diart.blocks.utils.Binarize(threshold, uri=None)
Transform a speaker segmentation from the discrete-time domain into a continuous-time speaker segmentation.
- Parameters:
threshold (float) – Probability threshold to determine if a speaker is active at a given frame.
uri (Optional[Text]) – URI of the audio stream. Defaults to None.
- __call__(segmentation)
Return the continuous-time segmentation corresponding to the discrete-time input segmentation.
- Parameters:
segmentation (SlidingWindowFeature) – Discrete-time speaker segmentation.
- Returns:
annotation – Continuous-time speaker segmentation.
- Return type:
Annotation
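The conversion described above can be sketched in plain Python. This is an illustration only, not the library's implementation: it uses nested lists and tuples in place of SlidingWindowFeature and Annotation, and the function name `binarize_sketch`, the `frame_step` parameter, and the mapping of frame `t` to the time `start_time + t * frame_step` are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


def binarize_sketch(
    scores: List[List[float]],
    threshold: float,
    frame_step: float,
    start_time: float = 0.0,
) -> List[Tuple[float, float, str]]:
    """Turn frame-level speaker probabilities into (start, end, label) turns.

    scores[t][s] is the activity probability of speaker s at frame t.
    Hypothetical helper: the frame-to-time mapping is an assumption.
    """
    if not scores:
        return []
    num_frames = len(scores)
    num_speakers = len(scores[0])
    # Frame index at which each speaker's current turn started (None if inactive)
    onset: List[Optional[int]] = [None] * num_speakers
    turns: List[Tuple[float, float, str]] = []
    # Iterate one step past the end so that open turns get closed
    for t in range(num_frames + 1):
        for s in range(num_speakers):
            active = t < num_frames and scores[t][s] > threshold
            if active and onset[s] is None:
                onset[s] = t
            elif not active and onset[s] is not None:
                start = start_time + onset[s] * frame_step
                end = start_time + t * frame_step
                turns.append((start, end, f"speaker{s}"))
                onset[s] = None
    return sorted(turns)


example = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.9]]
print(binarize_sketch(example, threshold=0.5, frame_step=0.5))
# prints [(0.0, 1.0, 'speaker0'), (1.0, 2.0, 'speaker1')]
```

A higher threshold yields fewer and shorter speaker turns, since fewer frames count as active.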
- class diart.blocks.utils.Resample(sample_rate, resample_rate, device=None)
Dynamically resample audio chunks.
- Parameters:
sample_rate (int) – Original sample rate of the input audio.
resample_rate (int) – Sample rate of the output audio.
device (Optional[torch.device]) –
- __call__(waveform)
- Parameters:
waveform (diart.features.TemporalFeatures) – Audio chunk to resample.
- Return type:
diart.features.TemporalFeatures
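To show what changing the sample rate means for the length of a chunk, here is a minimal pure-Python sketch. The reference above does not describe the resampling algorithm the class uses, so this example substitutes naive linear interpolation with no anti-aliasing filter; `resample_linear` is a hypothetical helper and should not be read as the library's method.

```python
from typing import List


def resample_linear(
    samples: List[float], sample_rate: int, resample_rate: int
) -> List[float]:
    """Resample a mono chunk by linear interpolation (illustrative only)."""
    if not samples:
        return []
    # Output length scales with the ratio of the two sample rates
    num_out = int(round(len(samples) * resample_rate / sample_rate))
    ratio = sample_rate / resample_rate
    out = []
    for i in range(num_out):
        pos = i * ratio  # fractional position in the input signal
        left = int(pos)
        if left >= len(samples) - 1:
            # Past the last input sample: hold the final value
            out.append(samples[-1])
            continue
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[left + 1] * frac)
    return out


print(resample_linear([0.0, 1.0, 2.0, 3.0], sample_rate=4, resample_rate=8))
# prints [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.0]
```

Doubling the rate doubles the number of samples while the chunk duration stays the same. A production resampler would also low-pass filter the signal before downsampling to avoid aliasing.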
- class diart.blocks.utils.AdjustVolume(volume_in_db)
Change the volume of an audio chunk.
Note that the output volume may differ from the target volume in order to avoid saturation.
- Parameters:
volume_in_db (float) – Target volume in dB.
- static get_volumes(waveforms)
Compute the volumes of a set of audio chunks.
- Parameters:
waveforms (torch.Tensor) – Audio chunks. Shape (batch, samples, channels).
- Returns:
volumes – Audio chunk volumes per channel. Shape (batch, 1, channels).
- Return type:
torch.Tensor
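The shape contract above can be illustrated with a short pure-Python sketch. It assumes that volume means the mean signal power expressed in decibels, that is, 10 * log10 of the mean squared amplitude; the reference only states the shapes, so that definition is an assumption. Nested lists stand in for torch.Tensor, and `get_volumes_sketch` is a hypothetical name.

```python
import math
from typing import List


def get_volumes_sketch(
    waveforms: List[List[List[float]]],
) -> List[List[List[float]]]:
    """Per-channel volume in dB for each chunk.

    Input shape (batch, samples, channels), output shape (batch, 1, channels).
    Assumes volume = 10 * log10(mean power); a silent channel maps to -inf.
    """
    volumes = []
    for chunk in waveforms:
        num_samples = len(chunk)
        num_channels = len(chunk[0])
        per_channel = []
        for c in range(num_channels):
            # Mean squared amplitude of channel c over all samples
            power = sum(sample[c] ** 2 for sample in chunk) / num_samples
            per_channel.append(
                float("-inf") if power == 0 else 10 * math.log10(power)
            )
        # Keep a singleton "samples" axis so the result has 3 dimensions
        volumes.append([per_channel])
    return volumes


# One chunk, two samples, two channels: a full-scale channel and a quiet one
print(get_volumes_sketch([[[1.0, 0.1], [-1.0, -0.1]]]))
```

Under this definition, a full-scale square wave sits at 0 dB, and dividing the amplitude by 10 lowers the volume by 20 dB.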
- __call__(waveform)
- Parameters:
waveform (diart.features.TemporalFeatures) – Audio chunk whose volume is adjusted.
- Return type:
diart.features.TemporalFeatures
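The note about saturation can be made concrete with a small sketch. It assumes a mono chunk given as a list of floats in [-1, 1], volume defined as mean power in dB, and peak normalization as the way saturation is avoided. These are assumptions for illustration, and `volume_db` and `adjust_volume_sketch` are hypothetical helpers, not the library's API.

```python
import math
from typing import List


def volume_db(samples: List[float]) -> float:
    """Mean power of a mono chunk in dB (assumed definition of volume)."""
    power = sum(x * x for x in samples) / len(samples)
    return float("-inf") if power == 0 else 10 * math.log10(power)


def adjust_volume_sketch(samples: List[float], target_db: float) -> List[float]:
    """Scale a chunk towards target_db, then rescale if it would clip."""
    current_db = volume_db(samples)
    if current_db == float("-inf"):
        # A silent chunk cannot be amplified to any target volume
        return list(samples)
    # A difference of 20 dB corresponds to a factor of 10 in amplitude
    gain = 10 ** ((target_db - current_db) / 20)
    scaled = [gain * x for x in samples]
    peak = max(abs(x) for x in scaled)
    if peak > 1.0:
        # Saturation guard: bring the peak back to 1, lowering the volume
        scaled = [x / peak for x in scaled]
    return scaled


chunk = [0.5, -0.5, 0.5, -0.5]
quiet = adjust_volume_sketch(chunk, target_db=-20.0)
loud = adjust_volume_sketch(chunk, target_db=6.0)
print(volume_db(quiet), volume_db(loud))
```

In this example the -20 dB target is reachable, so the output lands on it. The +6 dB target would push samples beyond full scale, so the chunk is normalized and ends up at 0 dB instead, which is why the output volume can differ from the requested one.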