nnsvs.base

class nnsvs.base.PredictionType(value)[source]

Prediction types

DETERMINISTIC = 1

Deterministic prediction

Non-MDN single-stream models should use this type.

Pseudo code:

# training
y = model(x)
# inference
y = model.inference(x)

PROBABILISTIC = 2

Probabilistic prediction with mixture density networks

MDN-based models should use this type.

Pseudo code:

# training
mdn_params = model(x)
# inference
mu, sigma = model.inference(x)

MULTISTREAM_HYBRID = 3

Multi-stream predictions where each prediction can be deterministic or probabilistic

Multi-stream models should use this type.

Pseudo code:

# training
feature_streams = model(x) # e.g. (mgc, lf0, vuv, bap) or (mel, lf0, vuv)
# inference
y = model.inference(x)

Note that concatenated features are assumed to be returned during inference.

DIFFUSION = 4

Diffusion model’s prediction

NOTE: this type may be subject to change in the future.

Pseudo code:

# training
noise, x_recon = model(x)

# inference
y = model.inference(x)
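
As a rough illustration of how calling code might branch on the four prediction types above, here is a minimal sketch. The `PredictionType` class below is a stand-in that only mirrors the documented values so the snippet runs without nnsvs installed; with nnsvs available one would presumably import it from `nnsvs.base` instead. The helper `describe_inference_output` is hypothetical and not part of nnsvs.

```python
from enum import Enum


class PredictionType(Enum):
    """Stand-in mirroring the documented values; not the nnsvs source."""

    DETERMINISTIC = 1
    PROBABILISTIC = 2
    MULTISTREAM_HYBRID = 3
    DIFFUSION = 4


def describe_inference_output(pred_type):
    """Hypothetical helper: summarize what model.inference(x) yields."""
    if pred_type is PredictionType.PROBABILISTIC:
        # MDN-based models return distribution parameters at inference.
        return "mu, sigma"
    if pred_type is PredictionType.MULTISTREAM_HYBRID:
        # Streams are assumed to be concatenated at inference.
        return "concatenated features"
    # DETERMINISTIC and DIFFUSION both return a single output y.
    return "y"


print(describe_inference_output(PredictionType.PROBABILISTIC))
```

Comparing enum members with `is` keeps the branches explicit and avoids relying on the integer values.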

BaseModel

class nnsvs.base.BaseModel(*args, **kwargs)[source]

Base class for all models

If you want to implement a custom model, inherit from this class. You must implement the forward method; the other methods are optional.

forward(x, lengths=None, y=None)[source]

Forward pass

Parameters:
  • x (tensor) – input features

  • lengths (tensor) – lengths of the input features

  • y (tensor) – optional target features

Returns:

output features

Return type:

tensor

inference(x, lengths=None)[source]

Inference method

If you want to implement a custom inference method such as autoregressive sampling, please override this method.

Defaults to calling the forward method.

Parameters:
  • x (tensor) – input features

  • lengths (tensor) – lengths of the input features

Returns:

output features

Return type:

tensor
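
The relationship between forward and inference can be sketched as follows. This is a simplified illustration, not the nnsvs implementation: `BaseModelSketch`, `ScaleModel`, and `ClippedScaleModel` are hypothetical names, and plain Python lists stand in for tensors so the snippet runs without PyTorch or nnsvs.

```python
class BaseModelSketch:
    """Hypothetical stand-in for the documented defaults of BaseModel."""

    def forward(self, x, lengths=None, y=None):
        raise NotImplementedError("subclasses must implement forward")

    def inference(self, x, lengths=None):
        # Documented default: inference simply calls forward.
        return self.forward(x, lengths)


class ScaleModel(BaseModelSketch):
    """Toy model that multiplies each input value by a fixed gain."""

    def __init__(self, gain=2.0):
        self.gain = gain

    def forward(self, x, lengths=None, y=None):
        return [self.gain * v for v in x]


class ClippedScaleModel(ScaleModel):
    """Overrides inference to post-process the forward output."""

    def inference(self, x, lengths=None):
        out = self.forward(x, lengths)
        # Custom inference-time behavior: clip values at 3.0.
        return [min(v, 3.0) for v in out]


x = [0.5, 1.0, 2.0]
default_out = ScaleModel().inference(x)
clipped_out = ClippedScaleModel().inference(x)
print(default_out, clipped_out)
```

`ScaleModel` only implements forward and inherits the default inference, while `ClippedScaleModel` shows the override path for models whose inference-time behavior differs from training.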

preprocess_target(y)[source]

Preprocess target signals at training time

This is useful for shallow AR models in which a FIR filter is used for the target signals. For other types of model, you don’t need to implement this method.

Defaults to doing nothing.

Parameters:

y (tensor) – target features

Returns:

preprocessed target features

Return type:

tensor
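
To make the FIR-filter use case more concrete, here is a minimal sketch of a target preprocessing step. The filter form and the coefficients `(1.0, -0.5)` are assumptions chosen for illustration, not the filter nnsvs uses. `fir_filter` and `ShallowARSketch` are hypothetical names, and a plain list stands in for the target tensor.

```python
def fir_filter(y, coefs):
    """Apply a causal FIR filter: out[t] = sum_k coefs[k] * y[t - k]."""
    out = []
    for t in range(len(y)):
        acc = 0.0
        for k, c in enumerate(coefs):
            if t - k >= 0:  # samples before the start are treated as zero
                acc += c * y[t - k]
        out.append(acc)
    return out


class ShallowARSketch:
    """Hypothetical model that filters its targets at training time."""

    def __init__(self, coefs=(1.0, -0.5)):
        # Illustrative coefficients only.
        self.coefs = coefs

    def preprocess_target(self, y):
        return fir_filter(y, self.coefs)


target = [1.0, 2.0, 3.0]
filtered = ShallowARSketch().preprocess_target(target)
print(filtered)
```

A training loop would then compute the loss against the preprocessed target rather than the raw one; models that do not need this step can leave the default in place.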

prediction_type()[source]

Prediction type.

If your model has an MDN layer, please return PredictionType.PROBABILISTIC.

Returns:

Deterministic or probabilistic. Default is deterministic.

Return type:

PredictionType

is_autoregressive()[source]

Is autoregressive or not

If your custom model is autoregressive, please return True. In that case, you need to implement autoregressive sampling in inference().

Returns:

True if autoregressive. Default is False.

Return type:

bool
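
The difference between training-time forward and autoregressive sampling in inference() might look like the toy sketch below. It assumes that the optional y argument of forward is used for teacher forcing, which the documentation does not state explicitly. `ARModelSketch` is a hypothetical name, and plain lists stand in for tensors.

```python
class ARModelSketch:
    """Toy first-order autoregressive model on plain Python lists."""

    def __init__(self, coef=0.5):
        self.coef = coef

    def is_autoregressive(self):
        return True

    def forward(self, x, lengths=None, y=None):
        # Training (teacher forcing, an assumption): each step is
        # conditioned on the previous ground-truth target value.
        prev = [0.0] + list(y[:-1])
        return [xt + self.coef * p for xt, p in zip(x, prev)]

    def inference(self, x, lengths=None):
        # Sampling: each step is conditioned on the model's own
        # previous output, so frames must be generated one at a time.
        out, prev = [], 0.0
        for xt in x:
            prev = xt + self.coef * prev
            out.append(prev)
        return out


model = ARModelSketch()
x = [1.0, 1.0, 1.0]
train_out = model.forward(x, y=[2.0, 2.0, 2.0])
infer_out = model.inference(x)
print(train_out, infer_out)
```

Because inference() has no targets to condition on, it cannot reuse forward directly, which is why an autoregressive model needs its own sampling loop.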

has_residual_lf0_prediction()[source]

Whether the model has residual log-F0 prediction or not.

This should only be used for acoustic models.

Returns:

True if the model has residual log-F0 prediction. Default is False.

Return type:

bool