Distances (dsptoolbox.distances)

Distances

This module contains distance measures between signals. The results appear plausible, but the implementations still need to be validated against external tools.

Frequency domain:

  • log_spectral()

  • itakura_saito()

Time domain:

  • snr()

  • si_sdr()

Mixed:

  • fw_snr_seg()

dsptoolbox.distances.fw_snr_seg(x: Signal, xhat: Signal, f_range_hz=[20, 10000.0], snr_range_db=[-10, 35], gamma: float = 0.2) -> ndarray[tuple[Any, ...], dtype[float64]]

Frequency-weighted segmental SNR (fwSNRseg) computation between two signals.

This distance measure divides the signals into auditory frequency bands (using gammatone filters) and into time frames, and then computes the SNR from them. It has been shown to correlate relatively well with results from listening tests and with other objective measures. See the references for more information.

NOTE: the time window is fixed to a 75 ms Hamming window with 50% overlap instead of a Gaussian window (as in the paper), since the publication specifies neither a length nor a beta parameter.

Parameters:
x : Signal

Original clean signal. If this signal has only one channel and xhat has more, that channel is assumed to be the original for all the others.

xhat : Signal

Enhanced/modified signal.

f_range_hz : array-like with length of 2, optional

Frequency range in which to analyze the signals. Default: [20, 10e3].

snr_range_db : array-like with length of 2, optional

SNR range to consider. If a frame yields a value outside this range, the value is set to the nearest boundary. Default: [-10, 35].

gamma : float, optional

Gamma parameter used for the frame weighting. See the paper for more information. According to the reference, its recommended range is [0.1, 2]. Default: 0.2.

Returns:
snr_per_channel : NDArray[np.float64]

Frequency-weighted, time-segmented SNR per channel.
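
The frame weighting and clipping described above can be sketched as follows. This is a minimal illustration and not the library's implementation: it assumes the band magnitudes of each frame are already available (the gammatone filtering and the 75 ms framing are omitted), that each band is weighted by the clean magnitude raised to gamma as in Hu and Loizou, and that the clipping to snr_range_db is applied per frame. The name fw_snr_seg_sketch and the eps guard are hypothetical.

```python
import math


def fw_snr_seg_sketch(clean_bands, enhanced_bands,
                      snr_range_db=(-10.0, 35.0), gamma=0.2, eps=1e-12):
    """Frequency-weighted segmental SNR for one channel.

    clean_bands[m][j] and enhanced_bands[m][j] hold the magnitude of
    frame m in frequency band j (assumed to be precomputed).
    """
    low, high = snr_range_db
    frame_values = []
    for clean_frame, enhanced_frame in zip(clean_bands, enhanced_bands):
        # Band weights: clean magnitude raised to gamma
        weights = [x ** gamma for x in clean_frame]
        # SNR in dB of every band; eps avoids a division by zero
        band_snr_db = [
            10.0 * math.log10(x ** 2 / ((x - xh) ** 2 + eps))
            for x, xh in zip(clean_frame, enhanced_frame)
        ]
        frame_snr = sum(w * s for w, s in zip(weights, band_snr_db))
        frame_snr /= sum(weights)
        # Values outside the allowed range are set to the boundary
        frame_values.append(min(max(frame_snr, low), high))
    # Average over all frames
    return sum(frame_values) / len(frame_values)


clean = [[1.0, 1.0], [1.0, 1.0]]
enhanced = [[0.9, 0.9], [1.0, 1.0]]  # second frame is a perfect copy
value = fw_snr_seg_sketch(clean, enhanced)
```

In this example the first frame yields about 20 dB, while the perfectly reconstructed second frame would exceed the upper limit and is therefore clipped to 35 dB before averaging.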

References

  • Y. Hu and P. C. Loizou, “Evaluation of Objective Quality Measures for Speech Enhancement,” in IEEE Transactions on Audio, Speech, and Language Processing, vol. 16, no. 1, pp. 229-238, Jan. 2008, doi: 10.1109/TASL.2007.911054.

  • https://ieeexplore.ieee.org/document/4389058

dsptoolbox.distances.itakura_saito(insig1: Signal, insig2: Signal, method: SpectrumMethod = SpectrumMethod.WelchPeriodogram, f_range_hz=[20, 20000], energy_normalization: bool = True, spectrum_parameters: dict | None = None) -> ndarray[tuple[Any, ...], dtype[float64]]

Computes the Itakura-Saito measure between two signals. Beware that this measure is not symmetric: d(x, y) != d(y, x).

Parameters:
insig1 : Signal

Signal 1.

insig2 : Signal

Signal 2.

method : SpectrumMethod, optional

Method to compute the spectrum. Default: WelchPeriodogram.

f_range_hz : array-like with length 2, optional

Range of frequencies in which to compute the distance. When None, it is computed over all frequencies. Default: [20, 20000].

energy_normalization : bool, optional

When True, the observed part of the spectrum is energy-normalized. Default: True.

spectrum_parameters : dict, optional

Additional parameters to be used in the computation of the spectrum. Pass None to use the default parameters of the Signal.set_spectrum_parameters() method. Default: None.

Returns:
distances : NDArray[np.float64]

Itakura-Saito measure for the given signals.
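
The asymmetry mentioned above can be seen in a small sketch. It assumes the common textbook definition of the Itakura-Saito measure on two power spectra, averaged over the frequency bins; how the library computes the spectra, applies f_range_hz or normalizes the energy is not reproduced here. The name itakura_saito_sketch is hypothetical.

```python
import math


def itakura_saito_sketch(power1, power2):
    """Itakura-Saito measure between two power spectra (one channel).

    Both inputs are sequences of strictly positive power values on the
    same frequency bins. The mean over bins is an assumption.
    """
    terms = []
    for p1, p2 in zip(power1, power2):
        ratio = p1 / p2
        # Each term is zero only when both spectra agree in that bin
        terms.append(ratio - math.log(ratio) - 1.0)
    return sum(terms) / len(terms)


p = [1.0, 1.0]
q = [2.0, 4.0]
d_pq = itakura_saito_sketch(p, q)
d_qp = itakura_saito_sketch(q, p)  # swapping the arguments changes the result
d_pp = itakura_saito_sketch(p, p)  # identical spectra give zero
```

Because d_pq and d_qp differ, the order in which the two signals are passed matters when comparing results.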

dsptoolbox.distances.log_spectral(insig1: Signal, insig2: Signal, method: SpectrumMethod = SpectrumMethod.WelchPeriodogram, f_range_hz=[20, 20000], energy_normalization: bool = True, spectrum_parameters: dict | None = None) -> ndarray[tuple[Any, ...], dtype[float64]]

Computes the log-spectral distance between two signals.

Parameters:
insig1 : Signal

Signal 1.

insig2 : Signal

Signal 2.

method : SpectrumMethod, optional

Method to compute the spectrum. Default: WelchPeriodogram.

f_range_hz : array-like with length 2, optional

Range of frequencies in which to compute the distance. When None, it is computed over all frequencies. Default: [20, 20000].

energy_normalization : bool, optional

When True, the observed part of the spectrum is energy-normalized. Default: True.

spectrum_parameters : dict, optional

Additional parameters to be used in the computation of the spectrum. Pass None to use the default parameters of the Signal.set_spectrum_parameters() method. Default: None.

Returns:
distances : NDArray[np.float64]

Log spectral distance per channel for the given signals.
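
To illustrate the effect of energy_normalization, the sketch below computes a log-spectral distance directly on two power spectra. It assumes the common definition (root mean square of the difference in dB over the frequency bins) and normalizes by dividing each spectrum by its total energy; the library's exact formula and spectrum estimation may differ. The names log_spectral_sketch and normalize_energy are hypothetical.

```python
import math


def normalize_energy(power_spectrum):
    """Scale a power spectrum so that its bins sum to one."""
    total = sum(power_spectrum)
    return [p / total for p in power_spectrum]


def log_spectral_sketch(power1, power2, energy_normalization=True):
    """Log-spectral distance in dB between two power spectra."""
    if energy_normalization:
        power1 = normalize_energy(power1)
        power2 = normalize_energy(power2)
    # Squared level difference in dB for every frequency bin
    squared_db = [
        (10.0 * math.log10(p1 / p2)) ** 2
        for p1, p2 in zip(power1, power2)
    ]
    return math.sqrt(sum(squared_db) / len(squared_db))


spectrum = [1.0, 4.0, 2.0, 0.5]
louder = [10.0 * p for p in spectrum]  # same shape, 10 dB more power
d_normalized = log_spectral_sketch(spectrum, louder)
d_raw = log_spectral_sketch(spectrum, louder, energy_normalization=False)
```

With normalization the pure gain difference vanishes and the distance is (numerically) zero, whereas without it the distance equals the 10 dB level offset.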

dsptoolbox.distances.si_sdr(target_signal: Signal, modified_signal: Signal) -> ndarray[tuple[Any, ...], dtype[float64]]

Computes the scale-invariant signal-to-distortion ratio (SI-SDR) from a target and a modified signal. If the target signal has only one channel, it is assumed to be the target for all channels of the modified signal. See the reference for details.

Parameters:
target_signal : Signal

Original signal. If it only has one channel and the modified signal has multiple, it is assumed to be the target signal for all channels.

modified_signal : Signal

Signal after modification/enhancement.

Returns:
sdr : NDArray[np.float64]

SI-SDR per channel.
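
The scale invariance can be illustrated with a short sketch. It assumes the usual definition of SI-SDR, in which the target is first rescaled by its projection coefficient onto the modified signal, and it handles a single channel given as a plain sequence; it is not taken from the library's source. The name si_sdr_sketch is hypothetical.

```python
import math


def si_sdr_sketch(target, modified):
    """Scale-invariant SDR in dB for one channel."""
    dot = sum(t * m for t, m in zip(target, modified))
    target_energy = sum(t * t for t in target)
    # Optimal scaling of the target towards the modified signal
    alpha = dot / target_energy
    scaled_target = [alpha * t for t in target]
    residual = [m - s for m, s in zip(modified, scaled_target)]
    numerator = sum(s * s for s in scaled_target)
    denominator = sum(r * r for r in residual)
    return 10.0 * math.log10(numerator / denominator)


target = [1.0, 0.0, -1.0, 0.0]
distortion = [0.0, 0.1, 0.0, -0.1]  # orthogonal to the target
modified = [t + d for t, d in zip(target, distortion)]

base = si_sdr_sketch(target, modified)
rescaled = si_sdr_sketch(target, [3.0 * m for m in modified])
```

Here base is about 20 dB, and multiplying the modified signal by a constant gain leaves the value unchanged, which is the property that distinguishes SI-SDR from a plain SDR.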

References

dsptoolbox.distances.snr(signal: Signal, noise: Signal) -> ndarray[tuple[Any, ...], dtype[float64]]

Classical signal-to-noise ratio (SNR). If the noise has only one channel, it is assumed to be the noise for all channels of the signal.

Parameters:
signal : Signal

Signal.

noise : Signal

Noise.

Returns:
snr_per_channel : NDArray[np.float64]

SNR value per channel.
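
The per-channel behavior, including the reuse of a single noise channel, can be sketched as below. The sketch assumes the SNR is the ratio of signal energy to noise energy in dB and that all channels have the same length; channels are plain lists rather than Signal objects. The names snr_db and snr_per_channel_sketch are hypothetical.

```python
import math


def snr_db(signal, noise):
    """SNR in dB for one channel: 10*log10(signal energy / noise energy)."""
    signal_energy = sum(s * s for s in signal)
    noise_energy = sum(n * n for n in noise)
    return 10.0 * math.log10(signal_energy / noise_energy)


def snr_per_channel_sketch(signal_channels, noise_channels):
    """Compute the SNR channel by channel.

    A single noise channel is reused for every signal channel.
    """
    if len(noise_channels) == 1:
        noise_channels = noise_channels * len(signal_channels)
    return [snr_db(s, n) for s, n in zip(signal_channels, noise_channels)]


signal_channels = [[1.0, -1.0, 1.0, -1.0], [2.0, -2.0, 2.0, -2.0]]
noise_channels = [[0.1, -0.1, 0.1, -0.1]]  # one channel for both signals
values = snr_per_channel_sketch(signal_channels, noise_channels)
```

The first channel gives about 20 dB; doubling the signal amplitude in the second channel raises the result by roughly 6 dB.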

References