nnsvs.gen.predict_acoustic

nnsvs.gen.predict_acoustic(device, labels, acoustic_model, acoustic_config, acoustic_in_scaler, acoustic_out_scaler, binary_dict, numeric_dict, subphone_features='coarse_coding', pitch_indices=None, log_f0_conditioning=True, force_clip_input_features=False, frame_period=5, f0_shift_in_cent=0)

Predict acoustic features from HTS labels.

Maximum likelihood parameter generation (MLPG) is applied to the predicted features if the output features include dynamic (delta) features.
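
For reference, MLPG recovers a smooth static trajectory from predicted static and dynamic (delta) features. Below is a minimal, self-contained sketch using nnmnkwii.paramgen.mlpg (nnmnkwii is the feature-processing library nnsvs builds on); the feature dimensions and unit variances here are illustrative assumptions, not values used internally by predict_acoustic:

    import numpy as np
    from nnmnkwii import paramgen

    # Common static/delta/delta-delta window setup
    windows = [
        (0, 0, np.array([1.0])),             # static
        (1, 1, np.array([-0.5, 0.0, 0.5])),  # delta
        (1, 1, np.array([1.0, -2.0, 1.0])),  # delta-delta
    ]

    T, static_dim = 100, 60  # illustrative sizes
    # Predicted means for static + dynamic features: (T, 3 * static_dim)
    means = np.random.rand(T, static_dim * len(windows))
    # Per-dimension variances; unit variances for simplicity
    variances = np.ones((T, static_dim * len(windows)))

    # MLPG solves for the smooth static trajectory of shape (T, static_dim)
    static_features = paramgen.mlpg(means, variances, windows)
    assert static_features.shape == (T, static_dim)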

Parameters:
  • device (torch.device) – device to use

  • labels (HTSLabelFile) – HTS labels

  • acoustic_model (nn.Module) – acoustic model

  • acoustic_config (AcousticConfig) – acoustic configuration

  • acoustic_in_scaler (sklearn.preprocessing.StandardScaler) – input scaler

  • acoustic_out_scaler (sklearn.preprocessing.StandardScaler) – output scaler

  • binary_dict (dict) – binary feature dictionary

  • numeric_dict (dict) – numeric feature dictionary

  • subphone_features (str) – subphone feature type

  • pitch_indices (list) – indices of pitch features

  • log_f0_conditioning (bool) – whether to use log-F0 conditioning

  • force_clip_input_features (bool) – whether to force clip input features

  • frame_period (float) – frame period in msec

  • f0_shift_in_cent (float) – F0 shift in cents applied before inference

Returns:

Predicted acoustic features.

Return type:

numpy.ndarray
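
A minimal usage sketch follows. The file names and the way the model, config, and scalers are loaded (torch.load / joblib.load) are assumptions about how a trained model is packaged; adapt them to your setup:

    import joblib
    import torch
    from nnmnkwii.io import hts

    from nnsvs.gen import predict_acoustic

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Hypothetical artifact paths; substitute your own trained-model files
    labels = hts.load("song.lab")
    binary_dict, numeric_dict = hts.load_question_set("jp_qst.hed")
    acoustic_model = torch.load("acoustic_model.pth", map_location=device)  # assumed packaging
    acoustic_config = joblib.load("acoustic_config.joblib")                 # assumed packaging
    acoustic_in_scaler = joblib.load("in_acoustic_scaler.joblib")           # assumed packaging
    acoustic_out_scaler = joblib.load("out_acoustic_scaler.joblib")         # assumed packaging

    acoustic_features = predict_acoustic(
        device,
        labels,
        acoustic_model,
        acoustic_config,
        acoustic_in_scaler,
        acoustic_out_scaler,
        binary_dict,
        numeric_dict,
        subphone_features="coarse_coding",
        log_f0_conditioning=True,
        frame_period=5,
    )
    print(acoustic_features.shape)  # (num_frames, feature_dim)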