Overview
The speech assessment module provides several quality metrics: overall quality, noisiness, discontinuity, coloration, loudness, and naturalness.
Input
Create a folder named `data` under your working directory and put your audio files in this folder.
Supported file formats:
wav
Due to a limitation of the algorithm, sub-folders might not work as expected. For example, if your folder with audio files is `data/speech/`, files in `data/speech/audio/` won’t be processed.
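To check in advance which files would be picked up, the rule above can be sketched in a few lines of Python. This is only an illustration of the behaviour described here, not BQAT's own scanning code; the helper name `list_top_level_wavs` is hypothetical, and matching the `.wav` extension case-insensitively is an assumption.

```python
from pathlib import Path


def list_top_level_wavs(folder):
    """Return the .wav files directly inside `folder`, ignoring sub-folders.

    Hypothetical helper: it mirrors the documented limitation that nested
    folders are not processed. Case-insensitive matching is an assumption.
    """
    root = Path(folder)
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() == ".wav"
    )


if __name__ == "__main__":
    target = Path("data/speech")
    if target.is_dir():
        for path in list_top_level_wavs(target):
            print(path)
```

Any `.wav` file that does not appear in this listing sits in a nested folder and would need to be moved up one level or processed in a separate run.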
Output
BQAT produces the quality scores generated by the engines, along with additional information, as columns. The results are saved as CSV when run from the CLI or as JSON when run through the API.
Speech:
Column | Description |
---|---|
quality | Overall quality estimation of the speech audio file. |
noisiness | Quality degradation such as background, circuit, or coding noise. |
discontinuity | Quality degradation caused by isolated or non-stationary distortions, e.g. introduced by packet-loss or clipping. |
coloration | Quality degradation caused by frequency response distortions, e.g. introduced by bandwidth limitation, low bitrate codecs, or packet-loss concealment. |
loudness | Influence of the loudness on the perceived quality of transmitted speech. |
naturalness | Estimation of the naturalness of synthetic speech. |
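The CSV output can be post-processed with standard tooling. The following is a minimal sketch, assuming the CSV has a header row containing the column names from the table above; the `file` column and the sample values are hypothetical and serve only to make the example self-contained.

```python
import csv
import io

# Speech score columns, as listed in the table above.
SPEECH_COLUMNS = [
    "quality", "noisiness", "discontinuity",
    "coloration", "loudness", "naturalness",
]


def read_speech_scores(stream):
    """Parse CSV rows from a text stream, converting score columns to float.

    Columns that are missing or empty are left untouched.
    """
    rows = []
    for row in csv.DictReader(stream):
        for col in SPEECH_COLUMNS:
            if row.get(col) not in (None, ""):
                row[col] = float(row[col])
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Made-up sample data; "file" is an assumed name for an identifier column.
    sample = io.StringIO(
        "file,quality,noisiness,discontinuity,coloration,loudness,naturalness\n"
        "a.wav,3.5,3.0,4.0,3.2,3.8,3.1\n"
    )
    for row in read_speech_scores(sample):
        print(row["file"], row["quality"])
```

For a real run, replace the sample stream with the opened output file, for example `open(path, newline="")`, where `path` points to the CSV written by the CLI.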