Vocal-tract areas versus articulatory parameters in speech production modeling

The Journal of the Acoustical Society of America 84, S127 (1988)

Published by JASA | Organized by Acoustical Society of America

Evaluation of articulatory codebooks [e.g., J. Schroeter et al., J. Acoust. Soc. Am. Suppl. 1 82, S54 (1987)] has clearly shown that improvements can be realized by inserting codewords that reduce access error for certain problematic sounds (e.g., /r/, /l/). Originally, it was not clear whether the needed shapes were attainable with the articulatory model used to convert geometrical parameters of the tract model into area data. A comparison of two simple articulatory models did not reveal major differences between them. When driven by random parameter values, both seem able to cover the formant space adequately, including formant configurations similar to those of /r/ and /l/. Different optimization algorithms, however, were only moderately successful in finding the global optimum in these cases. The approach taken to overcome this problem was to optimize vocal-tract areas directly, thus eliminating any articulatory model. Instead of the geometric constraints inherent in an articulatory model, only loose constraints in the form of specified bounds were imposed on each of the tract sections. These bounds were obtained from histograms of section areas computed from the largest articulatory codebook. In the analysis/synthesis system, area optimization showed superior performance over articulatory-model parameter optimization.
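The idea of optimizing section areas directly under per-section bounds can be illustrated with a minimal sketch. It is not the authors' system, and everything specific in it is an assumption made for illustration: the lossless eight-section tube closed at the glottis with an ideal open end at the lips, the 17.5 cm tract length, the speed of sound, the randomly generated stand-in codebook, the use of per-section minimum and maximum as bounds (the abstract does not say how bounds were read off the histograms), the target formant values, and the simple coordinate search. The function names `formants`, `section_bounds`, `formant_error`, and `optimize_areas` are hypothetical.

```python
import math
import random

C = 350.0  # assumed speed of sound in m/s


def formants(areas, length=0.175, n_formants=3, f_max=6000.0, step=10.0):
    """Resonances (Hz) of a lossless tube; areas[0] is the glottal section."""
    seg = length / len(areas)

    def d_term(freq):
        # Chain matrix [[a, j*b], [j*c, d]] from glottis to lips, with rho*c
        # scaled out. With zero pressure at the lips, resonances are zeros of d.
        a, b, c, d = 1.0, 0.0, 0.0, 1.0
        kl = 2.0 * math.pi * freq * seg / C
        cs, sn = math.cos(kl), math.sin(kl)
        for area in areas:
            a, b, c, d = (a * cs - b * sn * area, a * sn / area + b * cs,
                          c * cs + d * sn * area, -c * sn / area + d * cs)
        return d

    found = []
    f_lo, pos_lo = step, d_term(step) > 0.0
    while f_lo < f_max and len(found) < n_formants:
        f_hi = f_lo + step
        pos_hi = d_term(f_hi) > 0.0
        if pos_lo != pos_hi:  # sign change: refine the zero by bisection
            lo, hi = f_lo, f_hi
            for _ in range(20):
                mid = 0.5 * (lo + hi)
                if (d_term(mid) > 0.0) == pos_lo:
                    lo = mid
                else:
                    hi = mid
            found.append(0.5 * (lo + hi))
        f_lo, pos_lo = f_hi, pos_hi
    return found


def section_bounds(codebook):
    """Per-section (min, max) area over all codebook shapes (assumed rule)."""
    return [(min(col), max(col)) for col in zip(*codebook)]


def formant_error(areas, target):
    """Sum of squared relative formant errors; infinite if formants are missing."""
    est = formants(areas, n_formants=len(target))
    if len(est) < len(target):
        return float("inf")
    return sum(((e - t) / t) ** 2 for e, t in zip(est, target))


def optimize_areas(target, start, bounds, iters=40, rel_step=0.2):
    """Coordinate search over section areas, clipped to the given bounds."""
    areas = [min(hi, max(lo, a)) for a, (lo, hi) in zip(start, bounds)]
    best = formant_error(areas, target)
    step = rel_step
    for _ in range(iters):
        improved = False
        for i, (lo, hi) in enumerate(bounds):
            for factor in (1.0 + step, 1.0 / (1.0 + step)):
                trial = list(areas)
                trial[i] = min(hi, max(lo, areas[i] * factor))
                err = formant_error(trial, target)
                if err < best:
                    areas, best, improved = trial, err, True
        if not improved:
            step *= 0.5  # shrink the step once no section improves the fit
    return areas, best


# Stand-in codebook of random area functions in cm^2 (hypothetical data).
random.seed(0)
codebook = [[random.uniform(0.3, 8.0) for _ in range(8)] for _ in range(200)]
bounds = section_bounds(codebook)
target = [450.0, 1200.0, 1700.0]  # illustrative target formants in Hz
start = [3.0] * 8                 # uniform tube as the starting shape
areas, err = optimize_areas(target, start, bounds)
```

Because the only constraints are the per-section bounds, the search can reach area functions that a particular articulatory model might not produce, which is the trade-off the abstract describes. A coordinate search of this kind is a local method and does not by itself address the global-optimum difficulty noted above.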