diart.models

Module Contents

Classes

PowersetAdapter

Base class for all neural network modules.

PyannoteLoader

ONNXLoader

ONNXModel

LazyModel

Helper class that provides a standard way to create an ABC using inheritance.

SegmentationModel

Minimal interface for a segmentation model.

EmbeddingModel

Minimal interface for an embedding model.

Attributes

diart.models.IS_PYANNOTE_AVAILABLE = True
diart.models.IS_ONNX_AVAILABLE = True
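
Flags like these are commonly computed once at import time by probing for an optional dependency. The sketch below shows one way to do that with the standard library. The helper name `is_available` and the two example flags are hypothetical, and this is not necessarily how diart sets its own flags.

```python
import importlib.util


def is_available(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Hypothetical flags in the spirit of IS_PYANNOTE_AVAILABLE / IS_ONNX_AVAILABLE.
HAS_JSON = is_available("json")  # standard library module, always present
HAS_MISSING = is_available("surely_not_a_real_module_xyz")
```

Code that depends on an optional backend can then check the flag first and raise a clear error instead of failing on a late import.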
class diart.models.PowersetAdapter(segmentation_model)

Bases: torch.nn.Module

Base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing them to be nested in a tree structure. You can assign the submodules as regular attributes:

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will also have their parameters converted when you call to(), etc.

Note

As per the example above, an __init__() call to the parent class must be made before assignment on the child.

Variables:

training (bool) – Boolean representing whether this module is in training or evaluation mode.

Parameters:

segmentation_model (torch.nn.Module) –

forward(waveform)
Parameters:

waveform (torch.Tensor) –

Return type:

torch.Tensor
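
The class name suggests that the adapter converts powerset-encoded segmentation output (one class per subset of simultaneously active speakers) into per-speaker activity. That reading is an assumption, since this page does not describe the conversion. Under that assumption, the idea can be sketched in plain Python; `build_powerset_classes` and `to_multilabel` are hypothetical helpers, and the subset ordering (by size, then lexicographic) is also assumed.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def build_powerset_classes(num_speakers: int, max_per_frame: int) -> List[Tuple[int, ...]]:
    """Enumerate speaker subsets of size 0..max_per_frame, smallest first."""
    classes: List[Tuple[int, ...]] = []
    for size in range(max_per_frame + 1):
        classes.extend(combinations(range(num_speakers), size))
    return classes


def to_multilabel(class_index: int, classes: Sequence[Tuple[int, ...]], num_speakers: int) -> List[int]:
    """Turn one powerset class index into a 0/1 activity value per speaker."""
    active = set(classes[class_index])
    return [1 if speaker in active else 0 for speaker in range(num_speakers)]


# 3 speakers, at most 2 active per frame: 1 + 3 + 3 = 7 powerset classes.
classes = build_powerset_classes(num_speakers=3, max_per_frame=2)
frame = to_multilabel(4, classes, num_speakers=3)  # class 4 is the subset (0, 1)
```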

class diart.models.PyannoteLoader(model_info, hf_token=True)
Parameters:

hf_token (Union[Text, bool, None]) –

__call__()
Return type:

Callable

class diart.models.ONNXLoader(path, input_names, output_name)
Parameters:
  • path (str | pathlib.Path) –

  • input_names (List[str]) –

  • output_name (str) –

__call__()
Return type:

ONNXModel

class diart.models.ONNXModel(path, input_names, output_name)
Parameters:
  • path (pathlib.Path) –

  • input_names (List[str]) –

  • output_name (str) –

property execution_provider: str
Return type:

str

recreate_session()
to(device)
Parameters:

device (torch.device) –

Return type:

ONNXModel

__call__(*args)
Return type:

torch.Tensor
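
The `execution_provider` property returns a string and `to(device)` takes a `torch.device`, which suggests that the provider is derived from the current device. The sketch below shows one plausible mapping, stated as an assumption rather than as diart's implementation. `CPUExecutionProvider` and `CUDAExecutionProvider` are ONNX Runtime provider names; the function `execution_provider_for` is hypothetical.

```python
def execution_provider_for(device_type: str) -> str:
    """Map a device type such as 'cpu' or 'cuda' to an ONNX Runtime provider name."""
    prefix = "CUDA" if device_type == "cuda" else "CPU"
    return f"{prefix}ExecutionProvider"


cpu_provider = execution_provider_for("cpu")
gpu_provider = execution_provider_for("cuda")
```

If the provider does depend on the device, moving the model would require a new inference session, which may be why `recreate_session()` exists.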

class diart.models.LazyModel(loader)

Bases: abc.ABC

Helper class that provides a standard way to create an ABC using inheritance.

Parameters:

loader (Callable[[], Callable]) –

is_in_memory()

Return whether the model has been loaded into memory.

Return type:

bool

load()
to(device)
Parameters:

device (torch.device) –

Return type:

LazyModel

__call__(*args, **kwargs)
eval()
Return type:

LazyModel
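
The `loader`, `is_in_memory()` and `load()` members point to a deferred-loading pattern: the wrapper stores a zero-argument loader and only builds the model when it is first needed. A minimal standard-library sketch of that pattern follows. `LazyCallable` and `expensive_loader` are hypothetical names, and the real class may differ in its details.

```python
from typing import Any, Callable, Optional


class LazyCallable:
    """Hypothetical stand-in for a wrapper that loads its model on first use."""

    def __init__(self, loader: Callable[[], Callable]):
        self.loader = loader
        self.model: Optional[Callable] = None

    def is_in_memory(self) -> bool:
        """Return whether the wrapped model has been loaded."""
        return self.model is not None

    def load(self) -> None:
        """Load the model once; later calls are no-ops."""
        if not self.is_in_memory():
            self.model = self.loader()

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        self.load()
        return self.model(*args, **kwargs)


load_count = 0


def expensive_loader() -> Callable[[int], int]:
    """Pretend to load a model; counts how often it runs."""
    global load_count
    load_count += 1
    return lambda x: x * 2


lazy = LazyCallable(expensive_loader)
loaded_before = lazy.is_in_memory()  # nothing loaded yet
result = lazy(21)                    # first call triggers the loader
lazy(1)                              # second call reuses the loaded model
```

Deferring the load keeps object construction cheap, which matters when a model is configured in one place and only run later.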

class diart.models.SegmentationModel(loader)

Bases: LazyModel

Minimal interface for a segmentation model.

Parameters:

loader (Callable[[], Callable]) –

static from_pyannote(model, use_hf_token=True)

Returns a SegmentationModel wrapping a pyannote model.

Parameters:
  • model (pyannote.PipelineModel) – The pyannote.audio model to fetch.

  • use_hf_token (str | bool, optional) – The Hugging Face access token to use when downloading the model. If True, use the token stored by huggingface-cli login. Defaults to True.

Returns:

wrapper

Return type:

SegmentationModel

static from_onnx(model_path, input_name='waveform', output_name='segmentation')
Parameters:
  • model_path (Union[str, pathlib.Path]) –

  • input_name (str) –

  • output_name (str) –

Return type:

SegmentationModel

static from_pretrained(model, use_hf_token=True)
Parameters:

use_hf_token (Union[Text, bool, None]) –

Return type:

SegmentationModel
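
This page lists three constructors without saying how `from_pretrained` relates to the other two. One plausible reading, stated here as an assumption, is that it dispatches on the model reference: paths ending in `.onnx` go to `from_onnx`, and everything else goes to `from_pyannote`. The hypothetical helper `pick_backend` sketches only that decision and does not load anything.

```python
from pathlib import Path
from typing import Union


def pick_backend(model: Union[str, Path]) -> str:
    """Return 'onnx' for paths ending in .onnx, otherwise 'pyannote'."""
    return "onnx" if Path(model).suffix.lower() == ".onnx" else "pyannote"


local_choice = pick_backend("models/segmentation.onnx")
hub_choice = pick_backend("pyannote/segmentation")
```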

__call__(waveform)

Call the forward pass of the segmentation model.

Parameters:

waveform (torch.Tensor) – Shape (batch, channels, samples).

Returns:

speaker_segmentation

Return type:

torch.Tensor, shape (batch, frames, speakers)

class diart.models.EmbeddingModel(loader)

Bases: LazyModel

Minimal interface for an embedding model.

Parameters:

loader (Callable[[], Callable]) –

static from_pyannote(model, use_hf_token=True)

Returns an EmbeddingModel wrapping a pyannote model.

Parameters:
  • model (pyannote.PipelineModel) – The pyannote.audio model to fetch.

  • use_hf_token (str | bool, optional) – The Hugging Face access token to use when downloading the model. If True, use the token stored by huggingface-cli login. Defaults to True.

Returns:

wrapper

Return type:

EmbeddingModel

static from_onnx(model_path, input_names=None, output_name='embedding')
Parameters:
  • model_path (Union[str, pathlib.Path]) –

  • input_names (List[str] | None) –

  • output_name (str) –

Return type:

EmbeddingModel

static from_pretrained(model, use_hf_token=True)
Parameters:

use_hf_token (Union[Text, bool, None]) –

Return type:

EmbeddingModel

__call__(waveform, weights=None)

Call the forward pass of an embedding model with optional weights.

Parameters:
  • waveform (torch.Tensor) – Shape (batch, channels, samples).

  • weights (Optional[torch.Tensor]) – Temporal weights for each sample in the batch, shape (batch, frames). Defaults to no weights.

Returns:

speaker_embeddings

Return type:

torch.Tensor, shape (batch, embedding_dim)
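
The reference does not say how the temporal weights are applied. One common use of per-frame weights is weighted mean pooling, where frames with higher weight contribute more to the final vector. The plain-Python sketch below illustrates that idea for a single sample. It is an assumption for illustration only, `weighted_mean_pool` is a hypothetical helper, and the wrapped model may pool differently.

```python
from typing import List, Optional, Sequence


def weighted_mean_pool(
    frames: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Average per-frame feature vectors, weighting each frame."""
    if weights is None:
        # No weights given: every frame counts equally.
        weights = [1.0] * len(frames)
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    dim = len(frames[0])
    return [
        sum(w * frame[d] for w, frame in zip(weights, frames)) / total
        for d in range(dim)
    ]


# Three frames with two features each.
frames = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
uniform = weighted_mean_pool(frames)
focused = weighted_mean_pool(frames, weights=[0.0, 1.0, 1.0])  # ignore frame 0
```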